SE · Sep 27, 2024
Tracking Software Security Topics
Phong Minh Vu, Tung Thanh Nguyen
Software security incidents occur every day, and thousands of software security reports are announced each month. Thus, it is difficult for software security researchers, engineers, and other stakeholders to follow the software security topics that interest them in real time. In this paper, we propose SOSK, a novel tool for this problem. SOSK allows a user to import a collection of software security reports. It pre-processes the reports and extracts the most important keywords from their textual descriptions. Based on the similarity of the embedding vectors of keywords, SOSK can expand and/or refine a keyword set from a much smaller set of user-provided keywords. Thus, SOSK allows users to define any topic of interest and retrieve security reports relevant to that topic effectively. Our preliminary evaluation shows that SOSK can expand keywords and retrieve reports relevant to user requests.
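The keyword expansion step described in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity over keyword embedding vectors and a fixed similarity threshold; the abstract does not state which similarity measure or cutoff SOSK uses, and the function names, the threshold value, and the toy three-dimensional vectors below are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def expand_keywords(seeds, embeddings, threshold=0.8):
    """Add every known keyword that is close to at least one seed keyword.

    The threshold is an assumed parameter, not a value from the paper.
    """
    expanded = set(seeds)
    known_seeds = [s for s in seeds if s in embeddings]
    for word, vec in embeddings.items():
        if word in expanded:
            continue
        if any(cosine(vec, embeddings[s]) >= threshold for s in known_seeds):
            expanded.add(word)
    return expanded

# Toy embeddings invented for illustration; real vectors would be learned
# from the text of the imported security reports.
TOY_EMBEDDINGS = {
    "overflow": [0.90, 0.10, 0.00],
    "buffer":   [0.85, 0.20, 0.05],
    "phishing": [0.00, 0.10, 0.95],
    "spoofing": [0.05, 0.20, 0.90],
}

print(sorted(expand_keywords(["overflow"], TOY_EMBEDDINGS)))
```

In this toy setup a single seed pulls in only its nearest neighbor, so a user could start from one or two keywords and grow a topic-specific set before retrieving reports.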
SE · Aug 28, 2019
On building an automated responding system for app reviews: What are the characteristics of reviews and their responses?
Phong Minh Vu, Tam The Nguyen, Tung Thanh Nguyen
Recent studies showed that the dialogs between app developers and app users on app stores are important for increasing user satisfaction and an app's overall rating. However, the large volume of reviews and limited resources discourage app developers from engaging with customers through this channel. One solution to this problem is to develop an Automated Responding System that lets developers respond to app reviews in a manner that is most similar to a human response. Toward designing such a system, we have conducted an empirical study of the characteristics of mobile app reviews and their human-written responses. We found that an app review can have multiple sentence-level fragments with different topics and intentions. Similarly, a response can also be divided into multiple fragments, each with its own intention, that answer certain parts of the review (e.g., complaints, requests, or information seeking). We have also identified several characteristics of reviews (rating, topics, intentions, quantitative text features) that can be used to rank reviews by how urgently they need a response. In addition, we found that the degree of reusability of past responses depends on their context (a single app, apps of the same category, and their common features). Last but not least, a response can be reused for another review if some parts of it can be replaced by a placeholder that is either a named entity or a hyperlink. Based on those findings, we discuss the implications for developing an Automated Responding System that helps mobile app developers write responses to user reviews more effectively.
SE · Aug 19, 2019
Recommendation of Exception Handling Code in Mobile App Development
Tam The Nguyen, Phong Minh Vu, Tung Thanh Nguyen
In modern programming languages, exception handling is an effective mechanism to avoid unexpected runtime errors. Thus, failing to catch and handle exceptions could lead to serious issues like system crashes, resource leaks, or negative end-user experiences. However, writing correct exception handling code is often challenging in mobile app development due to the fast-changing nature of API libraries for mobile apps and the insufficiency of their documentation and source code examples. Our prior study shows that, in practice, mobile app developers introduce many exception-related bugs and still use bad exception handling practices (e.g., catching an exception and doing nothing). To address such problems, in this paper, we introduce two novel techniques for recommending correct exception handling code. One technique, XRank, recommends code to catch an exception that is likely to occur in a code snippet. The other, XHand, recommends correction code for such an exception. We have developed ExAssist, a code recommendation tool for exception handling using XRank and XHand. The empirical evaluation shows that our techniques are highly effective. For example, XRank has top-1 accuracy of 70% and top-3 accuracy of 87%. XHand's results are 89% and 96%, respectively.
SE · Aug 18, 2019
API Misuse Correction: A Statistical Approach
Tam The Nguyen, Phong Minh Vu, Tung Thanh Nguyen
Modern software development relies heavily on Application Programming Interface (API) libraries. However, there are often certain constraints on using API elements in such libraries. Failing to follow such constraints (API misuse) could lead to serious programming errors. Many approaches have been proposed to detect API misuses, but they still have low accuracy and cannot repair the detected misuses. In this paper, we propose SAM, a novel approach to detect and repair API misuses automatically. SAM uses statistical models to describe five factors involved in any API method call: related method calls, exceptions, pre-conditions, post-conditions, and values of arguments. These statistical models are trained from a large repository of high-quality production code. Then, given a piece of code, SAM verifies each of its method calls with the trained statistical models. If a factor has a sufficiently low probability, the corresponding call is considered an API misuse. SAM performs an optimal search for editing operations to apply to the code until it has no API issues.
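The detection idea in this abstract (flag a call when one of its factors has a sufficiently low probability) can be sketched for a single factor, related method calls. This is only an illustration under simplifying assumptions: it uses plain co-occurrence frequencies rather than SAM's actual statistical models, it ignores the other four factors and the repair search, and the function names, the 0.1 cutoff, and the toy corpus are hypothetical.

```python
from collections import Counter

def train(usages):
    """Count how often each call appears, and how often each ordered pair
    of distinct calls appears in the same usage (a set of method names)."""
    call_count = Counter()
    pair_count = Counter()
    for calls in usages:
        for a in calls:
            call_count[a] += 1
            for b in calls:
                if a != b:
                    pair_count[(a, b)] += 1
    return call_count, pair_count

def find_missing_calls(calls, model, max_prob=0.1):
    """Report (call, missing_companion) pairs where, in the training data,
    seeing `call` without `missing_companion` has probability below max_prob."""
    call_count, pair_count = model
    issues = []
    for a in sorted(calls):
        for (x, b), together in sorted(pair_count.items()):
            if x != a or b in calls:
                continue
            prob_without = 1.0 - together / call_count[a]
            if prob_without < max_prob:
                issues.append((a, b))
    return issues

# Toy training corpus, invented for illustration.
CORPUS = [
    {"open", "read", "close"},
    {"open", "write", "close"},
    {"open", "close"},
]
MODEL = train(CORPUS)

print(find_missing_calls({"open", "read"}, MODEL))
print(find_missing_calls({"open", "close"}, MODEL))
```

A repair step in the same spirit would try edits, such as inserting the missing companion call, and re-check the code until no issues are reported.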
SE · Jul 27, 2015
Learning API Usages from Bytecode: A Statistical Approach
Tam The Nguyen, Hung Viet Pham, Phong Minh Vu et al.
When developing mobile apps, programmers rely heavily on standard API frameworks and libraries. However, learning and using those APIs is often challenging due to the fast-changing nature of API frameworks for mobile systems, the complexity of API usages, the insufficiency of documentation, and the unavailability of source code examples. In this paper, we propose a novel approach to learn API usages from bytecode of Android mobile apps. Our core contributions include: i) ARUS, a graph-based representation of API usage scenarios; ii) HAPI, a statistical, generative model of API usages; and iii) three algorithms to extract ARUS from apps' bytecode, to train HAPI based on method call sequences extracted from ARUS, and to recommend method calls in code completion engines using the trained HAPI. Our empirical evaluation suggests that our approach can learn useful API usage models which can provide recommendations with higher levels of accuracy than the baseline n-gram model.
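For context on the baseline mentioned in this abstract, the following is a minimal sketch of an n-gram model (here a bigram, n = 2) that recommends the next method call from call sequences. It does not implement ARUS or HAPI; the function names and the toy call sequences are hypothetical and serve only to show what such a baseline looks like.

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count, for each call, which calls directly follow it."""
    model = defaultdict(Counter)
    for seq in sequences:
        for prev_call, next_call in zip(seq, seq[1:]):
            model[prev_call][next_call] += 1
    return model

def recommend(model, prev_call, k=3):
    """Return up to k candidate next calls, most frequent first."""
    followers = model.get(prev_call, Counter())
    return [call for call, _ in followers.most_common(k)]

# Toy method call sequences, invented for illustration; in the paper's
# setting such sequences are extracted from app bytecode.
SEQUENCES = [
    ["create", "start", "release"],
    ["create", "start", "pause", "release"],
    ["create", "release"],
]
BIGRAM = train_bigram(SEQUENCES)

print(recommend(BIGRAM, "create", k=1))
```

A bigram model conditions only on the immediately preceding call, which is one reason a richer generative model of API usages could plausibly outperform it in code completion.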
IR · May 18, 2015
Mining User Opinions in Mobile App Reviews: A Keyword-based Approach
Phong Minh Vu, Tam The Nguyen, Hung Viet Pham et al.
User reviews of mobile apps often contain complaints or suggestions which are valuable for app developers to improve user experience and satisfaction. However, due to the large volume and noisy nature of those reviews, manually analyzing them for useful opinions is inherently challenging. To address this problem, we propose MARK, a keyword-based framework for semi-automated review analysis. MARK allows an analyst to describe their interests in one or more mobile apps with a set of keywords. It then finds and lists the reviews most relevant to those keywords for further analysis. It can also plot the trends of those keywords over time and detect sudden changes in them, which might indicate the occurrence of serious issues. To help analysts describe their interests more effectively, MARK can automatically extract keywords from raw reviews and rank them by their associations with negative reviews. In addition, based on a vector-based semantic representation of keywords, MARK can divide a large set of keywords into more cohesive subsets, or suggest keywords similar to the selected ones.
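The keyword ranking step described in this abstract can be sketched as follows. The abstract does not specify how the association with negative reviews is measured, so this sketch assumes a simple one: the fraction of reviews containing a keyword that are negative, where "negative" is assumed to mean a rating of 2 stars or fewer. The function name, parameters, and sample reviews are hypothetical.

```python
from collections import Counter

def rank_keywords(reviews, negative_max_rating=2, min_count=2):
    """Rank words by the share of their reviews that are negative.

    reviews: iterable of (text, star_rating) pairs.
    Words seen in fewer than min_count reviews are dropped as too rare.
    """
    total = Counter()
    negative = Counter()
    for text, rating in reviews:
        # Count each word once per review, ignoring case.
        for word in set(text.lower().split()):
            total[word] += 1
            if rating <= negative_max_rating:
                negative[word] += 1
    scored = [
        (negative[w] / total[w], total[w], w)
        for w in total
        if total[w] >= min_count
    ]
    # Highest negative share first, then most frequent, then alphabetical.
    scored.sort(key=lambda t: (-t[0], -t[1], t[2]))
    return [(word, score) for score, _, word in scored]

# Sample reviews invented for illustration.
REVIEWS = [
    ("app crashes on login", 1),
    ("crashes after update", 2),
    ("love the new update", 5),
    ("great app", 5),
    ("login is slow", 2),
]

for word, score in rank_keywords(REVIEWS):
    print(f"{word}: {score:.2f}")
```

Keywords at the top of such a ranking are natural starting points for an analyst, who could then retrieve the matching reviews or track how often those keywords appear over time.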