He Jiang

h-index: 32

24 papers

3,414 citations

Novelty: 39%

AI Score: 28

Ranked #150,189 of 194,257 authors (top 77%); #1,732 in SE (top 57%)

24 Papers

27.4 | CV | May 27, 2022

A Survey on Long-Tailed Visual Recognition

Lu Yang, He Jiang, Qing Song et al.

The heavy reliance on data is one of the major reasons that currently limit the development of deep learning. Data quality directly dominates the effect of deep learning models, and the long-tailed distribution is one of the factors affecting data quality. The long-tailed phenomenon is prevalent due to the prevalence of power law in nature. In this case, the performance of deep learning models is often dominated by the head classes while the learning of the tail classes is severely underdeveloped. In order to learn adequately for all classes, many researchers have studied and preliminarily addressed the long-tailed problem. In this survey, we focus on the problems caused by long-tailed data distribution, sort out the representative long-tailed visual recognition datasets and summarize some mainstream long-tailed studies. Specifically, we summarize these studies into ten categories from the perspective of representation learning, and outline the highlights and limitations of each category. Besides, we have studied four quantitative metrics for evaluating the imbalance, and suggest using the Gini coefficient to evaluate the long-tailedness of a dataset. Based on the Gini coefficient, we quantitatively study 20 widely-used and large-scale visual datasets proposed in the last decade, and find that the long-tailed phenomenon is widespread and has not been fully studied. Finally, we provide several future directions for the development of long-tailed learning to provide more ideas for readers.
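As a rough illustration of the Gini-based measure mentioned in the abstract, the sketch below computes the standard Gini coefficient over per-class sample counts. The function name `gini` and the choice of the textbook sorted-values formula are assumptions made here; the survey's exact definition may differ.

```python
def gini(class_counts):
    """Gini coefficient of per-class sample counts.

    0.0 means perfectly balanced classes; values approaching 1.0 mean
    that a few head classes hold almost all samples.
    """
    counts = sorted(class_counts)
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        raise ValueError("need at least one class with a non-zero count")
    # Textbook formula for values sorted ascending, x_1 <= ... <= x_n:
    #   G = sum_i (2i - n - 1) * x_i / (n * sum_i x_i)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(counts, start=1))
    return weighted / (n * total)


if __name__ == "__main__":
    balanced = [100, 100, 100, 100]
    long_tailed = [1000, 100, 10, 1]
    print(gini(balanced), gini(long_tailed))
```

A higher value for the second list than for the first reflects its heavier head and thinner tail.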

13.8 | SE | Mar 19, 2018 | Code

Automated Localization for Unreproducible Builds

Zhilei Ren, He Jiang, Jifeng Xuan et al.

Reproducibility is the ability to recreate identical binaries under pre-defined build environments. Due to the need for quality assurance and the benefit of better detecting attacks against build environments, the practice of reproducible builds has gained popularity in many open-source software repositories such as Debian and Bitcoin. However, identifying the unreproducible issues remains a labour-intensive and time-consuming challenge, because of the lack of information to guide the search and the diversity of the causes that may lead to unreproducible binaries. In this paper we propose an automated framework called RepLoc to localize the problematic files for unreproducible builds. RepLoc features a query augmentation component that utilizes the information extracted from the build logs, and a heuristic rule-based filtering component that narrows the search scope. By integrating the two components with a weighted file ranking module, RepLoc is able to automatically produce a ranked list of files that are helpful in locating the problematic files for the unreproducible builds. We have implemented a prototype and conducted extensive experiments over 671 real-world unreproducible Debian packages in four different categories. By considering the topmost ranked file only, RepLoc achieves an accuracy rate of 47.09%. If we expand our examination to the top ten ranked files in the list produced by RepLoc, the accuracy rate becomes 79.28%. Considering that there are hundreds of source code files, scripts, Makefiles, etc., in a package, RepLoc significantly reduces the scope of localizing problematic files. Moreover, with the help of RepLoc, we successfully identified and fixed six new unreproducible packages from Debian and Guix.
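The top-1 and top-10 accuracy figures above can be read as a top-k hit rate over packages. The sketch below shows one plausible way to compute such a metric; the function name, the input layout, and the "at least one problematic file within the top k" criterion are assumptions made for illustration, not the paper's stated definition.

```python
def top_k_accuracy(ranked_lists, problematic_files, k):
    """Fraction of packages whose top-k ranked files contain a problematic file.

    ranked_lists: one ranked list of file names per package (best first).
    problematic_files: one set of known problematic file names per package.
    """
    if len(ranked_lists) != len(problematic_files):
        raise ValueError("inputs must describe the same packages")
    if not ranked_lists:
        raise ValueError("need at least one package")
    hits = 0
    for ranking, truth in zip(ranked_lists, problematic_files):
        # A package counts as a hit if any of its top-k files is problematic.
        if any(name in truth for name in ranking[:k]):
            hits += 1
    return hits / len(ranked_lists)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    rankings = [["debian/rules", "Makefile"], ["configure.ac", "build.sh"]]
    truths = [{"Makefile"}, {"setup.py"}]
    print(top_k_accuracy(rankings, truths, k=1))
    print(top_k_accuracy(rankings, truths, k=2))
```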

13.5 | SE | Apr 16, 2017 | Code

Towards Effective Bug Triage with Software Data Reduction Techniques

Jifeng Xuan, He Jiang, Yan Hu et al.

Software companies spend over 45 percent of cost in dealing with software bugs. An inevitable step of fixing bugs is bug triage, which aims to correctly assign a developer to a new bug. To decrease the time cost in manual work, text classification techniques are applied to conduct automatic bug triage. In this paper, we address the problem of data reduction for bug triage, i.e., how to reduce the scale and improve the quality of bug data. We combine instance selection with feature selection to simultaneously reduce data scale on the bug dimension and the word dimension. To determine the order of applying instance selection and feature selection, we extract attributes from historical bug data sets and build a predictive model for a new bug data set. We empirically investigate the performance of data reduction on a total of 600,000 bug reports of two large open source projects, namely Eclipse and Mozilla. The results show that our data reduction can effectively reduce the data scale and improve the accuracy of bug triage. Our work provides an approach to leveraging techniques on data processing to form reduced and high-quality bug data in software development and maintenance.

30.3 | CL | Oct 9, 2019 | Code

Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling

Ouyu Lan, Xiao Huang, Bill Yuchen Lin et al.

Sequence labeling is a fundamental framework for various natural language processing problems. Its performance is largely influenced by the annotation quality and quantity in supervised learning scenarios, and obtaining ground truth labels is often costly. In many cases, ground truth labels do not exist, but noisy annotations or annotations from different domains are accessible. In this paper, we propose a novel framework Consensus Network (ConNet) that can be trained on annotations from multiple sources (e.g., crowd annotation, cross-domain data...). It learns individual representation for every source and dynamically aggregates source-specific knowledge by a context-aware attention module. Finally, it leads to a model reflecting the agreement (consensus) among multiple sources. We evaluate the proposed framework in two practical settings of multi-source learning: learning with crowd annotations and unsupervised cross-domain model adaptation. Extensive experimental results show that our model achieves significant improvements over existing methods in both settings. We also demonstrate that the method can apply to various tasks and cope with different encoders.

31.0 | CL | Jun 26, 2019 | Code

Eliciting Knowledge from Experts: Automatic Transcript Parsing for Cognitive Task Analysis

Junyi Du, He Jiang, Jiaming Shen et al.

Cognitive task analysis (CTA) is a type of analysis in applied psychology aimed at eliciting and representing the knowledge and thought processes of domain experts. In CTA, heavy human labor is often involved in parsing the interview transcript into structured knowledge (e.g., a flowchart for different actions). To reduce human effort and scale the process, automated CTA transcript parsing is desirable. However, this task has unique challenges as (1) it requires the understanding of long-range context information in conversational text; and (2) the amount of labeled data is limited and indirect, i.e., context-aware, noisy, and low-resource. In this paper, we propose a weakly-supervised information extraction framework for automated CTA transcript parsing. We partition the parsing process into a sequence labeling task and a text span-pair relation extraction task, with distant supervision from human-curated protocol files. To model long-range context information for extracting sentence relations, neighbor sentences are involved as a part of input. Different types of models for capturing context dependency are then applied. We manually annotate real-world CTA transcripts to facilitate the evaluation of the parsing tasks.

11.9 | SE | Oct 23, 2018 | Code

Bridging Semantic Gaps between Natural Languages and APIs with Word Embedding

Xiaochen Li, He Jiang, Yasutaka Kamei et al.

Developers increasingly rely on text matching tools to analyze the relation between natural language words and APIs. However, semantic gaps, namely textual mismatches between words and APIs, negatively affect these tools. Previous studies have transformed words or APIs into low-dimensional vectors for matching; however, inaccurate results were obtained due to the failure of modeling words and APIs simultaneously. To resolve this problem, two main challenges are to be addressed: the acquisition of massive words and APIs for mining and the alignment of words and APIs for modeling. Therefore, this study proposes Word2API to effectively estimate relatedness of words and APIs. Word2API collects millions of commonly used words and APIs from code repositories to address the acquisition challenge. Then, a shuffling strategy is used to transform related words and APIs into tuples to address the alignment challenge. Using these tuples, Word2API models words and APIs simultaneously. Word2API outperforms baselines by 10%-49.6% of relatedness estimation in terms of precision and NDCG. Word2API is also effective on solving typical software tasks, e.g., query expansion and API documents linking. A simple system with Word2API-expanded queries recommends up to 21.4% more related APIs for developers. Meanwhile, Word2API improves comparison algorithms by 7.9%-17.4% in linking questions in Question&Answer communities to API documents.

4.9 | SE | Oct 5, 2018

Compiler Testing: A Systematic Literature Analysis

Yixuan Tang, Zhilei Ren, Weiqiang Kong et al.

Compilers are widely-used infrastructures in accelerating the software development, and expected to be trustworthy. In the literature, various testing technologies have been proposed to guarantee the quality of compilers. However, there remains an obstacle to comprehensively characterize and understand compiler testing. To overcome this obstacle, we propose a literature analysis framework to gain insights into the compiler testing area. First, we perform an extensive search to construct a dataset related to compiler testing papers. Then, we conduct a bibliometric analysis to analyze the productive authors, the influential papers, and the frequently tested compilers based on our dataset. Finally, we utilize association rules and collaboration networks to mine the authorships and the communities of interests among researchers and keywords. Some valuable results are reported. We find that the USA is the leading country that contains the most influential researchers and institutions. The most active keyword is "random testing". We also find that most researchers have broad interests within small-scale collaborators in the compiler testing area.

4.9 | SE | Sep 29, 2018

Towards Better Summarizing Bug Reports with Crowdsourcing Elicited Attributes

He Jiang, Xiaochen Li, Zhilei Ren et al.

Recent years have witnessed growing demands for resolving numerous bug reports in software maintenance. Aiming to reduce the time testers/developers take in perusing bug reports, the task of bug report summarization has attracted a lot of research effort in the literature. However, no systematic analysis has been conducted on attribute construction, which heavily impacts the performance of supervised algorithms for bug report summarization. In this study, we first conduct a survey to reveal the existing methods for attribute construction in mining software repositories. Then, we propose a new method named Crowd-Attribute to infer new effective attributes from the crowd-generated data in crowdsourcing and develop a new tool named Crowdsourcing Software Engineering Platform to facilitate this method. With Crowd-Attribute, we successfully construct 11 new attributes and propose a new supervised algorithm named Logistic Regression with Crowdsourced Attributes (LRCA). To evaluate the effectiveness of LRCA, we build a series of large-scale data sets with 105,177 bug reports. Experiments over both the public data set SDS with 36 manually annotated bug reports and new large-scale data sets demonstrate that LRCA can consistently outperform the state-of-the-art algorithms for bug report summarization.

4.9 | SE | Jun 19, 2018

Blockchain in the Eyes of Developers

He Jiang, Dong Liu, Zhilei Ren et al.

The popularity of blockchain technology continues to grow rapidly in both industrial and academic fields. Most studies of blockchain focus on the improvements of security, usability, or efficiency of blockchain protocols, or the applications of blockchain in finance, Internet of Things, or public services. However, few of them could reveal the concerns of front-line developers and the situations of blockchain in practice. In this article, we investigate how developers use and discuss blockchain with a case study of Stack Overflow posts. We find blockchain is a relatively new topic in Stack Overflow but it is rising to popularity. We detect 13 types of questions that developers post in Stack Overflow and identify 45 blockchain relevant entities (e.g., frameworks, libraries, or tools) for building blockchain applications. These findings may help blockchain project communities to know where to improve and help novices to know where to start.

18.2 | SE | May 13, 2018

Deep Learning in Software Engineering

Xiaochen Li, He Jiang, Zhilei Ren et al.

In recent years, deep learning has become increasingly prevalent in the field of Software Engineering (SE). However, many open issues remain to be investigated. How do researchers integrate deep learning into SE problems? Which SE phases are facilitated by deep learning? Do practitioners benefit from deep learning? The answers help practitioners and researchers develop practical deep learning models for SE tasks. To answer these questions, we conduct a bibliography analysis on 98 research papers in SE that use deep learning techniques. We find that 41 SE tasks in all SE phases have been facilitated by deep learning integrated solutions. Among these papers, 84.7% only use standard deep learning models and their variants to solve SE problems. Practicability becomes a concern in utilizing deep learning techniques. How to improve the effectiveness, efficiency, understandability, and testability of deep learning based solutions may attract more SE researchers in the future.

10.1 | NE | Apr 16, 2017

A Hybrid ACO Algorithm for the Next Release Problem

He Jiang, Jingyuan Zhang, Jifeng Xuan et al.

In this paper, we propose a Hybrid Ant Colony Optimization algorithm (HACO) for the Next Release Problem (NRP). NRP, an NP-hard problem in requirement engineering, is to balance customer requests, resource constraints, and requirement dependencies by requirement selection. Inspired by the successes of Ant Colony Optimization algorithms (ACO) for solving NP-hard problems, we design our HACO to approximately solve NRP. Similar to traditional ACO algorithms, multiple artificial ants are employed to construct new solutions. During the solution construction phase, both pheromone trails and neighborhood information are used to determine the choices of every ant. In addition, a local search (first found hill climbing) is incorporated into HACO to improve the solution quality. Extensive experiments on typical NRP test instances show that HACO outperforms the existing algorithms (GRASP and simulated annealing) in terms of both solution quality and running time.

1.7 | AI | Apr 16, 2017

Approximating the Backbone in the Weighted Maximum Satisfiability Problem

He Jiang, Jifeng Xuan, Yan Hu

The weighted Maximum Satisfiability problem (weighted MAX-SAT) is an NP-hard problem with numerous applications arising in artificial intelligence. As an efficient tool for heuristic design, the backbone has been applied to heuristics design for many NP-hard problems. In this paper, we investigated the computational complexity for retrieving the backbone in weighted MAX-SAT and developed a new algorithm for solving this problem. We showed that it is intractable to retrieve the full backbone under the assumption that P ≠ NP. Moreover, it is intractable to retrieve a fixed fraction of the backbone as well. We then presented a backbone guided local search (BGLS) with Walksat operator for weighted MAX-SAT. BGLS consists of two phases: the first phase samples the backbone information from local optima and the backbone phase conducts local search under the guideline of backbone. Extensive experimental results on the benchmark showed that BGLS outperforms the existing heuristics in both solution quality and runtime.

7.1 | SE | Apr 16, 2017

Approximate Backbone Based Multilevel Algorithm for Next Release Problem

He Jiang, Jifeng Xuan, Zhilei Ren

The next release problem (NRP) aims to effectively select software requirements in order to acquire maximum customer profits. As an NP-hard problem in software requirement engineering, NRP lacks efficient approximate algorithms for large scale instances. The backbone is a new tool for tackling large scale NP-hard problems in recent years. In this paper, we employ the backbone to design high performance approximate algorithms for large scale NRP instances. Firstly we show that it is NP-hard to obtain the backbone of NRP. Then, we illustrate by fitness landscape analysis that the backbone can be well approximated by the shared common parts of local optimal solutions. Therefore, we propose an approximate backbone based multilevel algorithm (ABMA) to solve large scale NRP instances. This algorithm iteratively explores the search spaces by multilevel reductions and refinements. Experimental results demonstrate that ABMA outperforms existing algorithms on large instances in terms of solution quality and running time.
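The idea that the backbone can be approximated by the shared common parts of local optimal solutions can be sketched as a plain set intersection. The representation of a solution as a set of selected items and the function name `approximate_backbone` are assumptions for illustration; the paper's ABMA algorithm involves multilevel reductions and refinements that are not shown here.

```python
def approximate_backbone(local_optima):
    """Return the items shared by every local optimum.

    local_optima: iterable of solutions, each an iterable of selected items
    (for example, customer or requirement identifiers).
    """
    solutions = [set(solution) for solution in local_optima]
    if not solutions:
        return set()
    # Items present in all local optima are treated as the approximate backbone.
    return set.intersection(*solutions)


if __name__ == "__main__":
    # Hypothetical local optima found by repeated local search runs.
    optima = [{1, 2, 3, 7}, {2, 3, 4, 7}, {2, 3, 5, 7}]
    print(sorted(approximate_backbone(optima)))
```

Fixing the shared items and searching only over the remaining ones is one way such an approximation can shrink a large instance.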

2.9 | SE | Apr 16, 2017

A Random Walk Based Algorithm for Structural Test Case Generation

Jifeng Xuan, He Jiang, Zhilei Ren et al.

Structural testing is a significant and expensive process in software development. By converting test data generation into an optimization problem, search-based software testing is one of the key technologies of automated test case generation. Motivated by the success of random walk in solving the satisfiability problem (SAT), we proposed a random walk based algorithm (WalkTest) to solve structural test case generation problem. WalkTest provides a framework, which iteratively calls random walk operator to search the optimal solutions. In order to improve search efficiency, we sorted the test goals with the costs of solutions completely instead of traditional dependence analysis from control flow graph. Experimental results on the condition-decision coverage demonstrated that WalkTest achieves better performance than existing algorithms (random test and tabu search) in terms of running time and coverage rate.

21.1 | SE | Apr 16, 2017

Automatic Bug Triage using Semi-Supervised Text Classification

Jifeng Xuan, He Jiang, Zhilei Ren et al.

In this paper, we propose a semi-supervised text classification approach for bug triage to avoid the deficiency of labeled bug reports in existing supervised approaches. This new approach combines naive Bayes classifier and expectation-maximization to take advantage of both labeled and unlabeled bug reports. This approach trains a classifier with a fraction of labeled bug reports. Then the approach iteratively labels numerous unlabeled bug reports and trains a new classifier with labels of all the bug reports. We also employ a weighted recommendation list to boost the performance by imposing the weights of multiple developers in training the classifier. Experimental results on bug reports of Eclipse show that our new approach outperforms existing supervised approaches in terms of classification accuracy.

12.5 | SE | Apr 16, 2017

Solving the Large Scale Next Release Problem with a Backbone Based Multilevel Algorithm

Jifeng Xuan, He Jiang, Zhilei Ren et al.

The Next Release Problem (NRP) aims to optimize customer profits and requirements selection for the software releases. The research on the NRP is restricted by the growing scale of requirements. In this paper, we propose a Backbone based Multilevel Algorithm (BMA) to address the large scale NRP. In contrast to direct solving approaches, BMA employs multilevel reductions to downgrade the problem scale and multilevel refinements to construct the final optimal set of customers. In both reductions and refinements, the backbone is built to fix the common part of the optimal customers. Since it is intractable to extract the backbone in practice, the approximate backbone is employed for the instance reduction while the soft backbone is proposed to augment the backbone application. In the experiments, to cope with the lack of open large requirements databases, we propose a method to extract instances from open bug repositories. Experimental results on 15 classic instances and 24 realistic instances demonstrate that BMA can achieve better solutions on the large scale NRP instances than direct solving approaches. Our work provides a reduction approach for solving large scale problems in search based requirements engineering.

2.9 | SE | Apr 16, 2017

Debt-Prone Bugs: Technical Debt in Software Maintenance

Jifeng Xuan, Yan Hu, He Jiang

Fixing bugs is an important phase in software development and maintenance. In practice, the process of bug fixing may conflict with the release schedule. Such conflict leads to a trade-off between software quality and release schedule, which is known as the technical debt metaphor. In this article, we propose the concept of debt-prone bugs to model the technical debt in software maintenance. We identify three types of debt-prone bugs, namely tag bugs, reopened bugs, and duplicate bugs. A case study on Mozilla is conducted to examine the impact of debt-prone bugs in software products. We investigate the correlation between debt-prone bugs and the product quality. For a product under development, we build prediction models based on historical products to predict the time cost of fixing bugs. The result shows that identifying debt-prone bugs can assist in monitoring and improving software quality.

22.5 | SE | Apr 16, 2017

Developer Prioritization in Bug Repositories

Jifeng Xuan, He Jiang, Zhilei Ren et al.

Developers build all the software artifacts in development. Existing work has studied the social behavior in software repositories. In one of the most important software repositories, a bug repository, developers create and update bug reports to support software development and maintenance. However, no prior work has considered the priorities of developers in bug repositories. In this paper, we address the problem of the developer prioritization, which aims to rank the contributions of developers. We mainly explore two aspects, namely modeling the developer prioritization in a bug repository and assisting predictive tasks with our model. First, we model how to assign the priorities of developers based on a social network technique. Three problems are investigated, including the developer rankings in products, the evolution over time, and the tolerance of noisy comments. Second, we consider leveraging the developer prioritization to improve three predicted tasks in bug repositories, i.e., bug triage, severity identification, and reopened bug prediction. We empirically investigate the performance of our model and its applications in bug repositories of Eclipse and Mozilla. The results indicate that the developer prioritization can provide the knowledge of developer priorities to assist software tasks, especially the task of bug triage.

8.7 | SE | Mar 13, 2017

Towards Training Set Reduction for Bug Triage

Weiqin Zou, Yan Hu, Jifeng Xuan et al.

Bug triage is an important step in the process of bug fixing. The goal of bug triage is to assign a newly arrived bug to the correct potential developer. The existing bug triage approaches are based on machine learning algorithms, which build classifiers from the training sets of bug reports. In practice, these approaches suffer from large-scale and low-quality training sets. In this paper, we propose training set reduction with both feature selection and instance selection techniques for bug triage. We combine feature selection with instance selection to improve the accuracy of bug triage. The feature selection algorithm, the instance selection algorithm Iterative Case Filter, and their combinations are studied in this paper. We evaluate the training set reduction on the bug data of Eclipse. For the training set, 70% of words and 50% of bug reports are removed after the training set reduction. The experimental results show that the new and small training sets can provide better accuracy than the original one.

5.7 | HC | Mar 7, 2017

What Makes a Good App Description?

He Jiang, Hongjing Ma, Zhilei Ren et al.

In the Google Play store, an introduction page is associated with every mobile application (app) for users to acquire its details, including screenshots, description, reviews, etc. However, it remains a challenge to identify what items influence users most when downloading an app. To explore users' perspective, we conduct a survey to inquire about this question. The results of the survey suggest that the participants pay most attention to the app description, which gives users a quick overview of the app. Although there exist some guidelines about how to write a good app description to attract more downloads, it is hard to define a high-quality app description. Meanwhile, there is no tool to evaluate the quality of app descriptions. In this paper, we employ the method of crowdsourcing to extract the attributes that affect the app descriptions' quality. First, we download some app descriptions from Google Play, then invite some participants to rate their quality with a score from one (very poor) to five (very good). The participants are also requested to explain the reasons for every score. By analyzing the reasons, we extract the attributes that the participants consider important when evaluating the quality of app descriptions. Finally, we train the supervised learning models on a sample of 100 app descriptions. In our experiments, the support vector machine model obtains up to 62% accuracy. In addition, we find that the permission, the number of paragraphs and the average number of words in one feature play key roles in defining a good app description.

1.7 | AI | Mar 6, 2017

Approximate Muscle Guided Beam Search for Three-Index Assignment Problem

He Jiang, Shuwei Zhang, Zhilei Ren et al.

As a well-known NP-hard problem, the Three-Index Assignment Problem (AP3) has attracted lots of research efforts for developing heuristics. However, existing heuristics either obtain less competitive solutions or consume too much time. In this paper, a new heuristic named Approximate Muscle guided Beam Search (AMBS) is developed to achieve a good trade-off between solution quality and running time. By combining the approximate muscle with beam search, the solution space size can be significantly decreased, thus the time for searching the solution can be sharply reduced. Extensive experimental results on the benchmark indicate that the new algorithm is able to obtain solutions with competitive quality and that it can be employed on large-scale instances. This work not only proposes a new efficient heuristic, but also provides a promising method to improve the efficiency of beam search.

13.5 | SE | Mar 5, 2017

An Unsupervised Approach for Discovering Relevant Tutorial Fragments for APIs

He Jiang, Jingxuan Zhang, Zhilei Ren et al.

Developers increasingly rely on API tutorials to facilitate software development. However, it remains a challenging task for them to discover relevant API tutorial fragments explaining unfamiliar APIs. Existing supervised approaches suffer from the heavy burden of manually preparing corpus-specific annotated data and features. In this study, we propose a novel unsupervised approach, namely Fragment Recommender for APIs with PageRank and Topic model (FRAPT). FRAPT can well address two main challenges lying in the task and effectively determine relevant tutorial fragments for APIs. In FRAPT, a Fragment Parser is proposed to identify APIs in tutorial fragments and replace ambiguous pronouns and variables with related ontologies and API names, so as to address the pronoun and variable resolution challenge. Then, a Fragment Filter employs a set of non-explanatory detection rules to remove non-explanatory fragments, thus addressing the non-explanatory fragment identification challenge. Finally, two correlation scores are computed and aggregated to determine relevant fragments for APIs, by applying both topic model and PageRank algorithm to the retained fragments. Extensive experiments over two publicly open tutorial corpora show that FRAPT improves the state-of-the-art approach by 8.77% and 12.32% respectively in terms of F-Measure. The effectiveness of key components of FRAPT is also validated.

16.9 | SE | Mar 2, 2017

What Causes My Test Alarm? Automatic Cause Analysis for Test Alarms in System and Integration Testing

He Jiang, Xiaochen Li, Zijiang Yang et al.

Driven by new software development processes and testing in clouds, system and integration testing nowadays tends to produce an enormous number of alarms. Such test alarms lay an almost unbearable burden on software testing engineers who have to manually analyze the causes of these alarms. The causes are critical because they decide which stakeholders are responsible for fixing the bugs detected during the testing. In this paper, we present a novel approach that aims to relieve the burden by automating the procedure. Our approach, called Cause Analysis Model, exploits information retrieval techniques to efficiently infer test alarm causes based on test logs. We have developed a prototype and evaluated our tool on two industrial datasets with more than 14,000 test alarms. Experiments on the two datasets show that our tool achieves an accuracy of 58.3% and 65.8%, respectively, which outperforms the baseline algorithms by up to 13.3%. Our algorithm is also extremely efficient, spending about 0.1s per cause analysis. Due to the attractive experimental results, our industrial partner, a leading information and communication technology company in the world, has deployed the tool and it achieves an average accuracy of 72% after two months of running, nearly three times more accurate than a previous strategy based on regular expressions.

3.3 | HC | Apr 3, 2015

Software for Wearable Devices: Challenges and Opportunities

He Jiang, Xin Chen, Shuwei Zhang et al.

Wearable devices are a new form of mobile computer system that provides exclusive and user-personalized services. Wearable devices bring new issues and challenges to computer science and technology. This paper summarizes the development process and the categories of wearable devices. In addition, we present new key issues arising in aspects of wearable devices, including operating systems, database management system, network communication protocol, application development platform, privacy and security, energy consumption, human-computer interaction, software engineering, and big data.