Aditya Ghose

h-index: 27

Papers: 18

Citations: 794

Novelty: 37%

AI Score: 30

Ranked #134,747 of 194,257 authors (top 69%); #1,514 in SE (top 50%)

18 Papers

2.1 · AI · Jan 26, 2023

Towards Knowledge-Centric Process Mining

Asjad Khan, Arsal Huda, Aditya Ghose et al.

Process analytic approaches play a critical role in supporting the practice of business process management and continuous process improvement by leveraging process-related data to identify performance bottlenecks, extracting insights about reducing costs and optimizing the utilization of available resources. Process analytic techniques often have to contend with real-world settings where available logs are noisy or incomplete. In this paper we present an approach that permits process analytics techniques to deliver value in the face of noisy/incomplete event logs. Our approach leverages knowledge graphs to mitigate the effects of noise in event logs while supporting process analysts in understanding variability associated with event logs.

3.8 · CR · Dec 28, 2021 · Code

Mining and Classifying Privacy and Data Protection Requirements in Issue Reports

Pattaraporn Sangaroonsilp, Hoa Khanh Dam, Morakot Choetkiertikul et al.

Digital and physical footprints are a trail of user activities collected over the use of software applications and systems. As software becomes ubiquitous, protecting user privacy has become challenging. With the increase in user privacy awareness and the advent of privacy regulations and policies, there is an emerging need to implement software systems that enhance the protection of personal data processing. However, existing data protection and privacy regulations state their key principles at a high level, making it difficult for software engineers to design and implement privacy-aware systems. In this paper, we develop a taxonomy that provides a comprehensive set of privacy requirements based on four well-established personal data protection regulations and privacy frameworks: the General Data Protection Regulation (GDPR), ISO/IEC 29100, the Thailand Personal Data Protection Act (Thailand PDPA) and the Asia-Pacific Economic Cooperation (APEC) privacy framework. These requirements are extracted, refined and classified to a level at which they can be mapped to issue reports. We have also performed a study on how two large open-source software projects (Google Chrome and Moodle) address the privacy requirements in our taxonomy through mining their issue reports. The paper discusses how the collected issues were classified, and presents the findings and insights generated from our study. Mining and classifying privacy requirements in issue reports can help organisations be aware of their state of compliance by identifying privacy requirements that have not been addressed in their software projects. Based on the identified privacy requirements, the taxonomy can also be traced back to the regulations, standards and frameworks that the software projects have not complied with.

6.4 · SE · Jan 5, 2021 · Code

A Taxonomy for Mining and Classifying Privacy Requirements in Issue Reports

Pattaraporn Sangaroonsilp, Hoa Khanh Dam, Morakot Choetkiertikul et al.

Context: Digital and physical trails of user activities are collected over the use of software applications and systems. As software becomes ubiquitous, protecting user privacy has become challenging. With the increase in user privacy awareness and the advent of privacy regulations and policies, there is an emerging need to implement software systems that enhance the protection of personal data processing. However, existing data protection and privacy regulations state their key principles at a high level, making it difficult for software engineers to design and implement privacy-aware systems. Objective: In this paper, we develop a taxonomy that provides a comprehensive set of privacy requirements based on four well-established personal data protection regulations and privacy frameworks: the General Data Protection Regulation (GDPR), ISO/IEC 29100, the Thailand Personal Data Protection Act (Thailand PDPA) and the Asia-Pacific Economic Cooperation (APEC) privacy framework. Methods: These requirements are extracted, refined and classified (using the goal-based requirements analysis method) to a level at which they can be mapped to issue reports. We have also performed a study on how two large open-source software projects (Google Chrome and Moodle) address the privacy requirements in our taxonomy through mining their issue reports. Results: The paper discusses how the collected issues were classified, and presents the findings and insights generated from our study. Conclusion: Mining and classifying privacy requirements in issue reports can help organisations be aware of their state of compliance by identifying privacy requirements that have not been addressed in their software projects. Based on the identified privacy requirements, the taxonomy can also be traced back to the regulations, standards and frameworks that the software projects have not complied with.

20.8 · SE · Feb 3, 2018

A deep tree-based model for software defect prediction

Hoa Khanh Dam, Trang Pham, Shien Wee Ng et al.

Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on designing features (e.g. complexity metrics) that correlate with potentially defective code. Those approaches however do not sufficiently capture the syntax and different levels of semantics of source code, an important capability for building accurate prediction models. In this paper, we develop a novel prediction model which is capable of automatically learning features for representing source code and using them for defect prediction. Our prediction system is built upon the powerful deep learning, tree-structured Long Short Term Memory network which directly matches with the Abstract Syntax Tree representation of source code. An evaluation on two datasets, one from open source projects contributed by Samsung and the other from the public PROMISE repository, demonstrates the effectiveness of our approach for both within-project and cross-project predictions.

28.0 · SE · Sep 2, 2016 · Code

A deep learning model for estimating story points

Morakot Choetkiertikul, Hoa Khanh Dam, Truyen Tran et al.

Although there has been substantial research in software analytics for effort estimation in traditional software projects, little work has been done for estimation in agile projects, especially estimating user stories or issues. Story points are the most common unit of measure used for estimating the effort involved in implementing a user story or resolving an issue. In this paper, we offer for the first time a comprehensive dataset for story points-based estimation that contains 23,313 issues from 16 open source projects. We also propose a prediction model for estimating story points based on a novel combination of two powerful deep learning architectures: long short-term memory and recurrent highway network. Our prediction system is end-to-end trainable from raw input data to prediction outcomes without any manual feature engineering. An empirical evaluation demonstrates that our approach consistently outperforms three common effort estimation baselines and two alternatives in both Mean Absolute Error and the Standardized Accuracy.
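The two evaluation measures named in this abstract can be illustrated with a small sketch. The abstract does not define them, so the code below assumes the definitions commonly used in effort-estimation research: Mean Absolute Error (MAE) is the average absolute gap between actual and predicted story points, and Standardized Accuracy (SA) is the percentage improvement of that MAE over a random-guessing baseline. The function names and the sample data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted story points."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def random_guess_mae(actual):
    """Expected MAE of a baseline that predicts each issue by picking the
    actual value of another, randomly chosen issue. Assumption: computed
    exactly as the mean absolute difference over all distinct pairs."""
    pairs = list(combinations(actual, 2))
    if not pairs:
        raise ValueError("need at least two issues")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def standardized_accuracy(actual, predicted):
    """SA = (1 - MAE / MAE_random_guess) * 100; higher is better."""
    baseline = random_guess_mae(actual)
    if baseline == 0:
        raise ValueError("all actual values are identical; SA is undefined")
    return (1 - mean_absolute_error(actual, predicted) / baseline) * 100


# Hypothetical story points for five issues.
actual = [1, 2, 3, 5, 8]
predicted = [2, 2, 3, 3, 8]
print(mean_absolute_error(actual, predicted))
print(standardized_accuracy(actual, predicted))
```

An SA near 0 means the model is no better than random guessing, which is why it is often reported alongside MAE rather than instead of it.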

3.3 · AI · Feb 16, 2025

Game-Of-Goals: Using adversarial games to achieve strategic resilience

Aditya Ghose, Asjad Khan

Our objective in this paper is to develop machinery that makes a given organizational strategic plan resilient to the actions of competitor agents (adverse environmental actions). We assume that we are given a goal tree representing strategic goals (which can also be seen as business requirements for a software system), and that competitor agents behave in a maximally adversarial fashion (taking actions that oppose our sub-goals, or our goals in general). We use game tree search methods (such as minimax) to select an optimal execution strategy (at a given point in time) that maximizes our chances of achieving our (high-level) strategic goals. Our machinery helps us determine which path to follow (strategy selection) to achieve the best end outcome. This is done by comparing the alternative execution strategies available to us via an evaluation function. Our evaluation function is based on the idea that we want to make our execution plans defensible (future-proof) by selecting execution strategies that make us least vulnerable to adversarial actions by the competitor agents, i.e., we want to select an execution strategy that leaves minimum room (or options) for the adversary to impede or damage our business goals and plans.
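The game tree search mentioned in this abstract can be sketched as plain minimax over a tiny two-level tree: we choose a strategy, the adversary replies, and each leaf carries an evaluation score. This is only an illustration of the general technique, not the paper's machinery; the tree layout, the strategy names and the scores are all hypothetical, and the paper's actual evaluation function is not reproduced here.

```python
def minimax(node, maximizing):
    """Return the minimax value of a node.

    A node is a dict with a "name" and either a "score" (leaf) or a list
    of "children". We maximize; the adversary minimizes."""
    children = node.get("children")
    if not children:
        return node["score"]
    values = [minimax(child, not maximizing) for child in children]
    return max(values) if maximizing else min(values)


def best_strategy(root):
    """Pick our move whose worst-case outcome (after the adversary's
    reply) is best."""
    choice = max(root["children"], key=lambda c: minimax(c, maximizing=False))
    return choice["name"]


# Hypothetical goal tree: our strategies, then the competitor's replies.
tree = {
    "name": "root",
    "children": [
        {"name": "expand_market", "children": [
            {"name": "price_war", "score": 2},
            {"name": "do_nothing", "score": 9},
        ]},
        {"name": "improve_product", "children": [
            {"name": "copy_feature", "score": 5},
            {"name": "poach_staff", "score": 4},
        ]},
        {"name": "cut_costs", "children": [
            {"name": "lobby", "score": 3},
            {"name": "undercut", "score": 6},
        ]},
    ],
}

print(best_strategy(tree))               # improve_product
print(minimax(tree, maximizing=True))    # 4
```

In this toy tree, `expand_market` has the highest possible payoff (9) but also the worst reply (2), so minimax prefers `improve_product`, whose worst case (4) is the best available. That is the sense in which the selected strategy is the least vulnerable one.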

3.6 · SE · Dec 28, 2021

On Privacy Weaknesses and Vulnerabilities in Software Systems

Pattaraporn Sangaroonsilp, Hoa Khanh Dam, Aditya Ghose

In this digital era, our privacy is under constant threat as our personal data and traceable online/offline activities are frequently collected, processed and transferred by many software applications. Privacy attacks are often formed by exploiting vulnerabilities found in those software applications. The Common Weakness Enumeration (CWE) and Common Vulnerabilities and Exposures (CVE) systems are currently the main sources that software engineers rely on for understanding and preventing publicly disclosed software vulnerabilities. However, our study on all 922 weaknesses in the CWE and 156,537 vulnerabilities registered in the CVE to date has found a very small coverage of privacy-related vulnerabilities in both systems: only 4.45% in CWE and 0.1% in CVE. These also cover only a small number of the areas of privacy threats that have been raised in existing privacy software engineering research, privacy regulations and frameworks, and by relevant reputable organisations. The actionable insights generated from our study led to the introduction of 11 new common privacy weaknesses to supplement the CWE system, making it a source for both security and privacy vulnerabilities.

6.4 · SE · May 5, 2021

Engineering Blockchain Based Software Systems: Foundations, Survey, and Future Directions

Mahdi Fahmideh, John Grundy, Aakash Ahmed et al.

Many scientific and practical areas have shown increasing interest in reaping the benefits of blockchain technology to empower software systems. However, the unique characteristics and requirements associated with Blockchain Based Software (BBS) systems raise new challenges across the development lifecycle that entail an extensive improvement of conventional software engineering. This article presents a systematic literature review of the state-of-the-art in BBS engineering research from a software engineering perspective. We characterize BBS engineering from the theoretical foundations, processes, models, and roles and discuss a rich repertoire of key development activities, principles, challenges, and techniques. The focus and depth of this survey not only gives software engineering practitioners and researchers a consolidated body of knowledge about current BBS development but also underpins a starting point for further research in this field.

3.6 · SE · Feb 4, 2021

Human Values in Software Release Planning

Davoud Mougouei, Aditya Ghose, Hoa Dam et al.

Software products have become an integral part of human lives, and therefore need to account for human values such as privacy, fairness, and equality. Ignoring human values in software development leads to biases and violations of human values: racial biases in recidivism assessment and facial recognition software are well-known examples of such issues. One of the most critical steps in software development is Software Release Planning (SRP), where decisions are made about the presence or absence of the requirements (features) in the software. Such decisions are primarily guided by the economic value of the requirements, ignoring their impacts on a broader range of human values. That may result in ignoring (selecting) requirements that positively (negatively) impact human values, increasing the risk of value breaches in the software. To address this, we have proposed an Integer Programming approach to considering human values in software release planning. In this regard, an Integer Linear Programming (ILP) model has been proposed, that explicitly accounts for human values in finding an "optimal" subset of the requirements. The ILP model exploits the algebraic structure of fuzzy graphs to capture dependencies and conflicts among the values of the requirements.

5.3 · SE · Dec 23, 2020

A Framework for Conditional Statement Technical Debt Identification and Description

Abdulaziz Alhefdhi, Hoa Khanh Dam, Yusuf Sulistyo Nugroho et al.

Technical Debt occurs when development teams favour short-term operability over long-term stability. Since this places software maintainability at risk, technical debt requires early attention to avoid paying for accumulated interest. Most of the existing work focuses on detecting technical debt using code comments, known as Self-Admitted Technical Debt (SATD). However, there are many cases where technical debt instances are not explicitly acknowledged but deeply hidden in the code. In this paper, we propose a framework that caters for the absence of SATD comments in code. Our Self-Admitted Technical Debt Identification and Description (SATDID) framework determines if technical debt should be self-admitted for an input code fragment. If that is the case, SATDID will automatically generate the appropriate descriptive SATD comment that can be attached to the code. While our approach is applicable in principle to any type of code fragment, we focus in this study on technical debt hidden in conditional statements, one of the most TD-carrying parts of code. We explore and evaluate different implementations of SATDID. The evaluation results demonstrate the applicability and effectiveness of our framework over multiple benchmarks. Compared with the results from the benchmarks, our approach provides at least 21.35%, 59.36%, 31.78%, and 583.33% improvements in terms of Precision, Recall, F-1, and Bleu-4 scores, respectively. In addition, we conduct a human evaluation of the SATD comments generated by SATDID. On 1-5 and 0-5 scales for Acceptability and Understandability, the total means achieved by our approach are 3.128 and 3.172, respectively.

2.0 · AI · Jul 2, 2019

On Conforming and Conflicting Values

Kinzang Chhogyal, Abhaya Nayak, Aditya Ghose et al.

Values are things that are important to us. Actions activate values - they either go against our values or they promote our values. Values themselves can either be conforming or conflicting depending on the action that is taken. In this short paper, we argue that values may be classified as one of two types - conflicting and inherently conflicting values. They are distinguished by the fact that the latter in some sense can be thought of as being independent of actions. This allows us to do two things: i) check whether a set of values is consistent and ii) check whether it is in conflict with other sets of values.

7.5 · AI · May 31, 2019

A Value-based Trust Assessment Model for Multi-agent Systems

Kinzang Chhogyal, Abhaya Nayak, Aditya Ghose et al.

An agent's assessment of its trust in another agent is commonly taken to be a measure of the reliability/predictability of the latter's actions. It is based on the trustor's past observations of the behaviour of the trustee and requires no knowledge of the inner-workings of the trustee. However, in situations that are new or unfamiliar, past observations are of little help in assessing trust. In such cases, knowledge about the trustee can help. A particular type of knowledge is that of values - things that are important to the trustor and the trustee. In this paper, based on the premise that the more values two agents share, the more they should trust one another, we propose a simple approach to trust assessment between agents based on values, taking into account if agents trust cautiously or boldly, and if they depend on others in carrying out a task.

11.9 · SE · Dec 27, 2018

Towards effective AI-powered agile project management

Hoa Khanh Dam, Truyen Tran, John Grundy et al.

The rise of Artificial intelligence (AI) has the potential to significantly transform the practice of project management. Project management has a large socio-technical element with many uncertainties arising from variability in human aspects e.g., customers' needs, developers' performance and team dynamics. AI can assist project managers and team members by automating repetitive, high-volume tasks to enable project analytics for estimation and risk prediction, providing actionable recommendations, and even making decisions. AI is potentially a game changer for project management in helping to accelerate productivity and increase project success rates. In this paper, we propose a framework where AI technologies can be leveraged to offer support for managing agile projects, which have become increasingly popular in the industry.

14.6 · NE · Feb 3, 2018

DeepProcess: Supporting business process execution using a MANN-based recommender system

Asjad Khan, Hung Le, Kien Do et al.

Process-aware recommender systems can provide critical decision support functionality to aid business process execution by recommending what actions to take next. Based on recent advances in the field of deep learning, we present a novel memory-augmented neural network (MANN) based approach for constructing a process-aware recommender system. We propose a novel network architecture, namely Write-Protected Dual Controller Memory-Augmented Neural Network (DCw-MANN), for building prescriptive models. To evaluate the feasibility and usefulness of our approach, we consider three real-world datasets and show that our approach leads to better performance than several baselines for the tasks of suffix recommendation and next task prediction.

27.6 · SE · Feb 2, 2018

Explainable Software Analytics

Hoa Khanh Dam, Truyen Tran, Aditya Ghose

Software analytics has been the subject of considerable recent attention but is yet to receive significant industry traction. One of the key reasons is that software practitioners are reluctant to trust predictions produced by the analytics machinery without understanding the rationale for those predictions. While complex models such as deep learning and ensemble methods improve predictive performance, they have limited explainability. In this paper, we argue that making software analytics models explainable to software practitioners is as important as achieving accurate predictions. Explainability should therefore be a key measure for evaluating software analytics models. We envision that explainability will be a key driver for developing software analytics models that are useful in practice. We outline a research roadmap for this space, building on social science, explainable artificial intelligence and software engineering.

19.5 · SE · Aug 8, 2017

Automatic feature learning for vulnerability prediction

Hoa Khanh Dam, Truyen Tran, Trang Pham et al.

Code flaws or vulnerabilities are prevalent in software systems and can potentially cause a variety of problems including deadlock, information loss, or system failure. A variety of approaches have been developed to try and detect the most likely locations of such code vulnerabilities in large code bases. Most of them rely on manually designing features (e.g. complexity metrics or frequencies of code tokens) that represent the characteristics of the code. However, all suffer from challenges in sufficiently capturing both semantic and syntactic representation of source code, an important capability for building accurate prediction models. In this paper, we describe a new approach, built upon the powerful deep learning Long Short Term Memory model, to automatically learn both semantic and syntactic features in code. Our evaluation on 18 Android applications demonstrates that the prediction power obtained from our learned features is equal or even superior to what is achieved by state of the art vulnerability prediction models: 3% to 58% improvement for within-project prediction and 85% for cross-project prediction.

18.2 · SE · Jul 30, 2016

DeepSoft: A vision for a deep model of software

Hoa Khanh Dam, Truyen Tran, John Grundy et al.

Although software analytics has experienced rapid growth as a research area, it has not yet reached its full potential for wide industrial adoption. Most of the existing work in software analytics still relies heavily on costly manual feature engineering processes, and they mainly address the traditional classification problems, as opposed to predicting future events. We present a vision for DeepSoft, an end-to-end generic framework for modeling software and its development process to predict future risks and recommend interventions. DeepSoft, partly inspired by human memory, is built upon the powerful deep learning-based Long Short Term Memory architecture that is capable of learning long-term temporal dependencies that occur in software evolution. Such deep learned patterns of software can be used to address a range of challenging problems such as code and task recommendation and prediction. DeepSoft provides a new approach for research into modeling of source code, risk prediction and mitigation, developer modeling, and automatically generating code patches from bug reports.

6.9 · SE · Feb 25, 2014

Towards rational and minimal change propagation in model evolution

Hoa Khanh Dam, Aditya Ghose

A critical issue in the evolution of software models is change propagation: given a primary change that is made to a model in order to meet a new or changed requirement, what additional secondary changes are needed to maintain consistency within the model, and between the model and other models in the system? In practice, there are many ways of propagating changes to fix a given inconsistency, and how to justify and automate the selection between such change options remains a critical challenge. In this paper, we propose a number of postulates, inspired by the mature belief revision theory, that a change propagation process should satisfy to be considered rational and minimal. Such postulates enable us to reason about selecting alternative change options, and consequently to develop a machinery that automatically performs this task. We further argue that a possible implementation of such a change propagation process can be considered as a classical state space search in which each state represents a snapshot of the model in the process. This view naturally reflects the cascading nature of change propagation, where each change can require further changes to be made.
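The view of change propagation as a state space search can be illustrated with a toy breadth-first search, where each state is a snapshot of the model and each step is one secondary change. This is a minimal sketch under strong assumptions, not the authors' procedure: the model is reduced to a set of element names, consistency is reduced to hypothetical "A requires B" rules, and "minimal" is approximated as "fewest changes", whereas the paper characterizes rationality and minimality through postulates.

```python
from collections import deque

# Hypothetical consistency rules: if the first element is in the model,
# the second must be present too.
RULES = [("OrderService", "Order"), ("Order", "Customer")]


def violations(model):
    """Return the rules that the given model snapshot breaks."""
    return [(a, b) for a, b in RULES if a in model and b not in model]


def repair_options(model):
    """Yield (description, new_snapshot) for each single change that
    resolves one violation: add the missing element or drop the dependent."""
    for a, b in violations(model):
        yield f"add {b}", model | {b}
        yield f"remove {a}", model - {a}


def propagate(model, protected=frozenset()):
    """Breadth-first search for a shortest sequence of secondary changes
    that restores consistency without removing protected elements."""
    start = frozenset(model)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if not violations(state):
            return path, state
        for description, nxt in repair_options(state):
            if not protected <= nxt or nxt in seen:
                continue  # would undo the primary change, or already visited
            seen.add(nxt)
            queue.append((nxt, path + [description]))
    return None  # no consistent snapshot is reachable


# Primary change: "OrderService" was added and must be kept.
changes, final_model = propagate(
    {"Invoice", "OrderService"}, protected=frozenset({"OrderService"})
)
print(changes)  # ['add Order', 'add Customer']
```

The example shows the cascading effect described in the abstract: adding `Order` to satisfy the first rule triggers the second rule, which in turn requires `Customer`. Without the `protected` set, the shortest repair would simply undo the primary change, which is why a selection criterion between change options is needed.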