LGAug 29, 2024
Short-Term Electricity-Load Forecasting by Deep Learning: A Comprehensive SurveyQi Dong, Rubing Huang, Chenhui Cui et al.
Short-Term Electricity-Load Forecasting (STELF) refers to the prediction of the immediate demand (in the next few hours to several days) for the power system. Various external factors, such as weather changes and the emergence of new electricity consumption scenarios, can impact electricity demand, causing load data to fluctuate and become non-linear, which increases the complexity and difficulty of STELF. In the past decade, deep learning has been applied to STELF, modeling and predicting electricity demand with high accuracy, and contributing significantly to the development of STELF. This paper provides a comprehensive survey on deep-learning-based STELF over the past ten years. It examines the entire forecasting process, including data pre-processing, feature extraction, deep-learning modeling and optimization, and results evaluation. This paper also identifies some research challenges and potential research directions to be further investigated in future work.
SENov 10, 2023
TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree TransformationZixiang Xian, Rubing Huang, Dave Towey et al.
Artificial intelligence (AI) has revolutionized software engineering (SE) by enhancing software development efficiency. The advent of pre-trained models (PTMs) leveraging transfer learning has significantly advanced AI for SE. However, existing PTMs that operate on individual code tokens suffer from several limitations: They are costly to train and fine-tune; and they rely heavily on labeled data for fine-tuning on task-specific datasets. In this paper, we present TransformCode, a novel framework that learns code embeddings in a contrastive learning manner. Our framework is encoder-agnostic and language-agnostic, which means that it can leverage any encoder model and handle any programming language. We also propose a novel data-augmentation technique called abstract syntax tree (AST) transformation, which applies syntactic and semantic transformations to the original code snippets, to generate more diverse and robust samples for contrastive learning. Our framework has several advantages over existing methods: (1) It is flexible and adaptable, because it can easily be extended to other downstream tasks that require code representation (such as code-clone detection and classification); (2) it is efficient and scalable, because it does not require a large model or a large amount of training data, and it can support any programming language; (3) it is not limited to unsupervised learning, but can also be applied to some supervised learning tasks by incorporating task-specific labels or objectives; and (4) it can also adjust the number of encoder parameters based on computing resources. We evaluate our framework on several code-related tasks, and demonstrate its effectiveness and superiority over the state-of-the-art methods such as SourcererCC, Code2vec, and InferCode.
SEMay 18
LLM-Based Static Verification of Code Against Natural-Language Requirements: An Industrial Experience ReportZhi Quan Zhou, Dave Towey, Tsong Yueh Chen
Large language models (LLMs) are increasingly used to generate requirements specifications, design documents, code, and test cases. In contrast, much less attention has been given to a more difficult assurance problem: statically verifying whether implemented code satisfies requirements written in natural language. Conventional static analysis tools are effective at detecting coding defects and known vulnerability patterns, but they cannot determine whether program behavior matches intended business logic. Detecting such defects requires reasoning over the specification rather than the code alone. Software testing can expose some of these mismatches, but its effectiveness depends heavily on test design, executable artifacts, and runtime environments. This article presents a two-stage LLM-based workflow for addressing this challenge in an intelligent-vehicle cybersecurity case study. In the first stage, an AI-based rule miner extracts verifiable rules from natural-language requirements while explicitly identifying ambiguity, self-contradiction, and other non-verifiable statements. In the second stage, an AI-based code auditor checks implementation evidence against the extracted rules. Instead of asking a single LLM to directly verify code against lengthy natural-language specifications, the workflow introduces a structured intermediate representation to reduce hallucination, output variability, limited explainability, and context loss. The resulting approach is a requirement-aware and semantics-aware form of static analysis that complements software testing. By analyzing requirements and source code without requiring compilation, execution, or runtime environments, the method shifts verification and validation activities left in the development lifecycle. This LLM-based static analysis is also a new approach to addressing the test oracle problem.
HCApr 11, 2024
Uncovering the Metaverse within Everyday Environments: a Coarse-to-Fine ApproachLiming Xu, Dave Towey, Andrew P. French et al.
The recent release of the Apple Vision Pro has reignited interest in the metaverse, showcasing the intensified efforts of technology giants in developing platforms and devices to facilitate its growth. As the metaverse continues to proliferate, it is foreseeable that everyday environments will become increasingly saturated with its presence. Consequently, uncovering links to these metaverse items will be a crucial first step to interacting with this new augmented world. In this paper, we address the problem of establishing connections with virtual worlds within everyday environments, especially those that are not readily discernible through direct visual inspection. We introduce a vision-based approach leveraging Artcode visual markers to uncover hidden metaverse links embedded in our ambient surroundings. This approach progressively localises the access points to the metaverse, transitioning from coarse to fine localisation, thus facilitating an exploratory interaction process. Detailed experiments are conducted to study the performance of the proposed approach, demonstrating its effectiveness in Artcode localisation and enabling new interaction opportunities.
CVAug 13, 2025
Topological Structure Description for Artcode Detection Using the Shape of Orientation HistogramLiming Xu, Dave Towey, Andrew P. French et al.
The increasing ubiquity of smartphones and resurgence of VR/AR techniques, it is expected that our everyday environment may soon be decorating with objects connecting with virtual elements. Alerting to the presence of these objects is therefore the first step for motivating follow-up further inspection and triggering digital material attached to the objects. This work studies a special kind of these objects -- Artcodes -- a human-meaningful and machine-readable decorative markers that camouflage themselves with freeform appearance by encoding information into their topology. We formulate this problem of recongising the presence of Artcodes as Artcode proposal detection, a distinct computer vision task that classifies topologically similar but geometrically and semantically different objects as a same class. To deal with this problem, we propose a new feature descriptor, called the shape of orientation histogram, to describe the generic topological structure of an Artcode. We collect datasets and conduct comprehensive experiments to evaluate the performance of the Artcode detection proposer built upon this new feature vector. Our experimental results show the feasibility of the proposed feature vector for representing topological structures and the effectiveness of the system for detecting Artcode proposals. Although this work is an initial attempt to develop a feature-based system for detecting topological objects like Artcodes, it would open up new interaction opportunities and spark potential applications of topological object detection.
SEMar 28, 2025
Integrating Artificial Intelligence with Human Expertise: An In-depth Analysis of ChatGPT's Capabilities in Generating Metamorphic RelationsYifan Zhang, Dave Towey, Matthew Pike et al.
Context: This paper provides an in-depth examination of the generation and evaluation of Metamorphic Relations (MRs) using GPT models developed by OpenAI, with a particular focus on the capabilities of GPT-4 in software testing environments. Objective: The aim is to examine the quality of MRs produced by GPT-3.5 and GPT-4 for a specific System Under Test (SUT) adopted from an earlier study, and to introduce and apply an improved set of evaluation criteria for a diverse range of SUTs. Method: The initial phase evaluates MRs generated by GPT-3.5 and GPT-4 using criteria from a prior study, followed by an application of an enhanced evaluation framework on MRs created by GPT-4 for a diverse range of nine SUTs, varying from simple programs to complex systems incorporating AI/ML components. A custom-built GPT evaluator, alongside human evaluators, assessed the MRs, enabling a direct comparison between automated and human evaluation methods. Results: The study finds that GPT-4 outperforms GPT-3.5 in generating accurate and useful MRs. With the advanced evaluation criteria, GPT-4 demonstrates a significant ability to produce high-quality MRs across a wide range of SUTs, including complex systems incorporating AI/ML components. Conclusions: GPT-4 exhibits advanced capabilities in generating MRs suitable for various applications. The research underscores the growing potential of AI in software testing, particularly in the generation and evaluation of MRs, and points towards the complementarity of human and AI skills in this domain.
IVMay 30, 2023
Elongated Physiological Structure Segmentation via Spatial and Scale Uncertainty-aware NetworkYinglin Zhang, Ruiling Xi, Huazhu Fu et al.
Robust and accurate segmentation for elongated physiological structures is challenging, especially in the ambiguous region, such as the corneal endothelium microscope image with uneven illumination or the fundus image with disease interference. In this paper, we present a spatial and scale uncertainty-aware network (SSU-Net) that fully uses both spatial and scale uncertainty to highlight ambiguous regions and integrate hierarchical structure contexts. First, we estimate epistemic and aleatoric spatial uncertainty maps using Monte Carlo dropout to approximate Bayesian networks. Based on these spatial uncertainty maps, we propose the gated soft uncertainty-aware (GSUA) module to guide the model to focus on ambiguous regions. Second, we extract the uncertainty under different scales and propose the multi-scale uncertainty-aware (MSUA) fusion module to integrate structure contexts from hierarchical predictions, strengthening the final prediction. Finally, we visualize the uncertainty map of final prediction, providing interpretability for segmentation results. Experiment results show that the SSU-Net performs best on cornea endothelial cell and retinal vessel segmentation tasks. Moreover, compared with counterpart uncertainty-based methods, SSU-Net is more accurate and robust.
SEAug 5, 2021
Using Metamorphic Relations to Verify and Enhance Artcode ClassificationLiming Xu, Dave Towey, Andrew French et al.
Software testing is often hindered where it is impossible or impractical to determine the correctness of the behaviour or output of the software under test (SUT), a situation known as the oracle problem. An example of an area facing the oracle problem is automatic image classification, using machine learning to classify an input image as one of a set of predefined classes. An approach to software testing that alleviates the oracle problem is metamorphic testing (MT). While traditional software testing examines the correctness of individual test cases, MT instead examines the relations amongst multiple executions of test cases and their outputs. These relations are called metamorphic relations (MRs): if an MR is found to be violated, then a fault must exist in the SUT. This paper examines the problem of classifying images containing visually hidden markers called Artcodes, and applies MT to verify and enhance the trained classifiers. This paper further examines two MRs, Separation and Occlusion, and reports on their capability in verifying the image classification using one-way analysis of variance (ANOVA) in conjunction with three other statistical analysis methods: t-test (for unequal variances), Kruskal-Wallis test, and Dunnett's test. In addition to our previously-studied classifier, that used Random Forests, we introduce a new classifier that uses a support vector machine, and present its MR-augmented version. Experimental evaluations across a number of performance metrics show that the augmented classifiers can achieve better performance than non-augmented classifiers. This paper also analyses how the enhanced performance is obtained.
SEMay 13, 2021
VPP-ART: An Efficient Implementation of Fixed-Size-Candidate-Set Adaptive Random Testing using Vantage Point PartitioningRubing Huang, Chenhui Cui, Dave Towey et al.
Adaptive Random Testing (ART) is an enhancement of Random Testing (RT), and aims to improve the RT failure-detection effectiveness by distributing test cases more evenly in the input domain. Many ART algorithms have been proposed, with Fixed-Size-Candidate-Set ART (FSCS-ART) being one of the most effective and popular. FSCS-ART ensures high failure-detection effectiveness by selecting the next test case as the candidate farthest from previously-executed test cases. Although FSCS-ART has good failure-detection effectiveness, it also faces some challenges, including heavy computational overheads. In this paper, we propose an enhanced version of FSCS-ART, Vantage Point Partitioning ART (VPP-ART). VPP-ART addresses the FSCS-ART computational overhead problem using vantage point partitioning, while maintaining the failure-detection effectiveness. VPP-ART partitions the input domain space using a modified Vantage Point tree (VP-tree) and finds the approximate nearest executed test cases of a candidate test case in the partitioned sub-domains -- thereby significantly reducing the time overheads compared with the searches required for FSCS-ART. To enable the FSCS-ART dynamic insertion process, we modify the traditional VP-tree to support dynamic data. The simulation results show that VPP-ART has a much lower time overhead compared to FSCS-ART, but also delivers similar (or better) failure-detection effectiveness, especially in the higher dimensional input domains. According to statistical analyses, VPP-ART can improve on the FSCS-ART failure-detection effectiveness by approximately 50% to 58%. VPP-ART also compares favorably with the KDFC-ART algorithms (a series of enhanced ART algorithms based on the KD-tree). Our experiments also show that VPP-ART is more cost-effective than FSCS-ART and KDFC-ART.
SEMay 12, 2021
SWFC-ART: A Cost-effective Approach for Fixed-Size-Candidate-Set Adaptive Random Testing through Small World GraphsMuhammad Ashfaq, Rubing Huang, Dave Towey et al.
Adaptive random testing (ART) improves the failure-detection effectiveness of random testing by leveraging properties of the clustering of failure-causing inputs of most faulty programs: ART uses a sampling mechanism that evenly spreads test cases within a software's input domain. The widely-used Fixed-Sized-Candidate-Set ART (FSCS-ART) sampling strategy faces a quadratic time cost, which worsens as the dimensionality of the software input domain increases. In this paper, we propose an approach based on small world graphs that can enhance the computational efficiency of FSCS-ART: SWFC-ART. To efficiently perform nearest neighbor queries for candidate test cases, SWFC-ART incrementally constructs a hierarchical navigable small world graph for previously executed, non-failure-causing test cases. Moreover, SWFC-ART has shown consistency in programs with high dimensional input domains. Our simulation and empirical studies show that SWFC-ART reduces the computational overhead of FSCS-ART from quadratic to log-linear order while maintaining the failure-detection effectiveness of FSCS-ART, and remaining consistent in high dimensional input domains. We recommend using SWFC-ART in practical software testing scenarios, where real-life programs often have high dimensional input domains and low failure rates.
SEJul 8, 2020
A Survey on Adaptive Random TestingRubing Huang, Weifeng Sun, Yinyin Xu et al.
Random testing (RT) is a well-studied testing method that has been widely applied to the testing of many applications, including embedded software systems, SQL database systems, and Android applications. Adaptive random testing (ART) aims to enhance RT's failure-detection ability by more evenly spreading the test cases over the input domain. Since its introduction in 2001, there have been many contributions to the development of ART, including various approaches, implementations, assessment and evaluation methods, and applications. This paper provides a comprehensive survey on ART, classifying techniques, summarizing application areas, and analyzing experimental evaluations. This paper also addresses some misconceptions about ART, and identifies open research challenges to be further investigated in the future work.
SEJul 1, 2020
Regression Test Case Prioritization by Code Combinations CoverageRubing Huang, Quanjun Zhang, Dave Towey et al.
Regression test case prioritization (RTCP) aims to improve the rate of fault detection by executing more important test cases as early as possible. Various RTCP techniques have been proposed based on different coverage criteria. Among them, a majority of techniques leverage code coverage information to guide the prioritization process, with code units being considered individually, and in isolation. In this paper, we propose a new coverage criterion, code combinations coverage, that combines the concepts of code coverage and combination coverage. We apply this coverage criterion to RTCP, as a new prioritization technique, code combinations coverage based prioritization (CCCP). We report on empirical studies conducted to compare the testing effectiveness and efficiency of CCCP with four popular RTCP techniques: total, additional, adaptive random, and search-based test prioritization. The experimental results show that even when the lowest combination strength is assigned, overall, the CCCP fault detection rates are greater than those of the other four prioritization techniques. The CCCP prioritization costs are also found to be comparable to the additional test prioritization technique. Moreover, our results also show that when the combination strength is increased, CCCP provides higher fault detection rates than the state-of-the-art, regardless of the levels of code coverage.