DBDec 12, 2023
Translating Natural Language Queries to SQL Using the T5 ModelAlbert Wong, Lien Pham, Young Lee et al.
This paper presents the development process of a natural language to SQL model using the T5 model as the basis. The models, developed in August 2022 for an online transaction processing system and a data warehouse, have a 73\% and 84\% exact match accuracy respectively. These models, in conjunction with other work completed in the research project, were implemented for several companies and used successfully on a daily basis. The approach used in the model development could be implemented in a similar fashion for other database environments and with a more powerful pre-trained language model.
TRMay 17, 2023
Short-Term Stock Price Forecasting using exogenous variables and Machine Learning AlgorithmsAlbert Wong, Steven Whang, Emilio Sagre et al.
Creating accurate predictions in the stock market has always been a significant challenge in finance. With the rise of machine learning as the next level in the forecasting area, this research paper compares four machine learning models and their accuracy in forecasting three well-known stocks traded in the NYSE in the short term from March 2020 to May 2022. We deploy, develop, and tune XGBoost, Random Forest, Multi-layer Perceptron, and Support Vector Regression models. We report the models that produce the highest accuracies from our evaluation metrics: RMSE, MAPE, MTT, and MPE. Using a training data set of 240 trading days, we find that XGBoost gives the highest accuracy despite running longer (up to 10 seconds). Results from this study may improve by further tuning the individual parameters or introducing more exogenous variables.
CVJun 4, 2021
Roof Damage Assessment from Automated 3D Building ModelsKenichi Sugihara, Martin Wallace, Kongwen et al.
The 3D building modelling is important in urban planning and related domains that draw upon the content of 3D models of urban scenes. Such 3D models can be used to visualize city images at multiple scales from individual buildings to entire cities prior to and after a change has occurred. This ability is of great importance in day-to-day work and special projects undertaken by planners, geo-designers, and architects. In this research, we implemented a novel approach to 3D building models for such matter, which included the integration of geographic information systems (GIS) and 3D Computer Graphics (3DCG) components that generate 3D house models from building footprints (polygons), and the automated generation of simple and complex roof geometries for rapid roof area damage reporting. These polygons (footprints) are usually orthogonal. A complicated orthogonal polygon can be partitioned into a set of rectangles. The proposed GIS and 3DCG integrated system partitions orthogonal building polygons into a set of rectangles and places rectangular roofs and box-shaped building bodies on these rectangles. Since technicians are drawing these polygons manually with digitizers, depending on aerial photos, not all building polygons are precisely orthogonal. But, when placing a set of boxes as building bodies for creating the buildings, there may be gaps or overlaps between these boxes if building polygons are not precisely orthogonal. In our proposal, after approximately orthogonal building polygons are partitioned and rectified into a set of mutually orthogonal rectangles, each rectangle knows which rectangle is adjacent to and which edge of the rectangle is adjacent to, which will avoid unwanted intersection of windows and doors when building bodies combined.
NIMay 26, 2021
Gamers Private Network Performance Forecasting. From Raw Data to the Data Warehouse with Machine Learning and Neural NetsAlbert Wong, Chun Yin Chiu, Gaétan Hains et al.
Gamers Private Network (GPN) is a client/server technology that guarantees a connection for online video games that is more reliable and lower latency than a standard internet connection. Users of the GPN technology benefit from a stable and high-quality gaming experience for online games, which are hosted and played across the world. After transforming a massive volume of raw networking data collected by WTFast, we have structured the cleaned data into a special-purpose data warehouse and completed the extensive analysis using machine learning and neural nets technologies, and business intelligence tools. These analyses demonstrate the ability to predict and quantify changes in the network and demonstrate the benefits gained from the use of a GPN for users when connected to an online game session.
DCDec 7, 2020
Machine Learning Prediction of Gamer's Private NetworksChris Mazur, Jesse Ayers, Gaetan Hains et al.
The Gamer's Private Network (GPN) is a client/server technology created by WTFast for making the network performance of online games faster and more reliable. GPN s use middle-mile servers and proprietary algorithms to better connect online video-game players to their game's servers across a wide-area network. Online games are a massive entertainment market and network latency is a key aspect of a player's competitive edge. This market means many different approaches to network architecture are implemented by different competing companies and that those architectures are constantly evolving. Ensuring the optimal connection between a client of WTFast and the online game they wish to play is thus an incredibly difficult problem to automate. Using machine learning, we analyzed historical network data from GPN connections to explore the feasibility of network latency prediction which is a key part of optimization. Our next step will be to collect live data (including client/server load, packet and port information and specific game state information) from GPN Minecraft servers and bots. We will use this information in a Reinforcement Learning model along with predictions about latency to alter the clients' and servers' configurations for optimal network performance. These investigations and experiments will improve the quality of service and reliability of GPN systems.
SEJan 31, 2019
Formal methods and software engineering for DL. Security, safety and productivity for DL systems developmentGaetan J. D. R. Hains, Arvid Jakobsson, Youry Khmelevsky
Deep Learning (DL) techniques are now widespread and being integrated into many important systems. Their classification and recognition abilities ensure their relevance for multiple application domains. As machine-learning that relies on training instead of algorithm programming, they offer a high degree of productivity. But they can be vulnerable to attacks and the verification of their correctness is only just emerging as a scientific and engineering possibility. This paper is a major update of a previously-published survey, attempting to cover all recent publications in this area. It also covers an even more recent trend, namely the design of domain-specific languages for producing and training neural nets.