Danny Ho

SE
21 papers
557 citations
Novelty 23%
AI Score 21

21 Papers

SE Jul 24, 2015 Code
Building an OSS Quality Estimation Model with CATREG

Jie Xu, Danny Ho, Luiz Fernando Capretz

Open Source Software (OSS) has been a popular form in software development. In this paper, we use statistical approaches to derive OSS quality estimation models. Our objective is to build estimation models for the number of defects with metrics at project levels. First CATREG (Categorical regression with optimal scaling) is used to obtain quantifications of the qualitative variables. Then the independent variables are validated using the stepwise linear regression. The process is repeated to acquire optimal quantifications and final regression formula. This modeling process is performed based on data from the OSS communities and is proved to be practically valuable.

SE Jul 24, 2015 Code
Exploratory Analysis of Quality Practices in Open Source Domain

Jie Xu, Luiz Fernando Capretz, Danny Ho

Software quality assurance has been a heated topic for several decades, but relatively few analyses were performed on open source software (OSS). As OSS has become very popular in our daily life, many researchers have been keen on the quality practices in this area. Although quality management presents distinct patterns compared with those in closed-source software development, some widely used OSS products have been implemented. Therefore, quality assurance of OSS projects has attracted increased research focus. In this paper, a survey is conducted to reveal the general quality practices in open source communities. Exploratory analysis has been carried out to disclose those quality-related activities. The results are compared with those from closed-source environments and the distinguishing features of quality assurance in OSS projects have been confirmed. Moreover, this study suggests potential directions for OSS developers to follow.

ST Jan 26, 2022
Machine Learning for Stock Prediction Based on Fundamental Analysis

Yuxuan Huang, Luiz Fernando Capretz, Danny Ho

Application of machine learning for stock prediction has attracted a lot of attention in recent years. A large amount of research has been conducted in this area and multiple existing results have shown that machine learning methods could be successfully used for stock prediction using stocks' historical data. Most of these existing approaches have focused on short-term prediction using stocks' historical prices and technical indicators. In this paper, we prepared 22 years' worth of quarterly stock financial data and investigated three machine learning algorithms: Feed-forward Neural Network (FNN), Random Forest (RF) and Adaptive Neural Fuzzy Inference System (ANFIS) for stock prediction based on fundamental analysis. In addition, we applied RF-based feature selection and bootstrap aggregation in order to improve model performance and aggregate predictions from different models. Our results show that the RF model achieves the best prediction results, and feature selection is able to improve the test performance of FNN and ANFIS. Moreover, the aggregated model outperforms all baseline models as well as the benchmark DJIA index by an acceptable margin for the test period. Our findings demonstrate that machine learning models could be used to aid fundamental analysts with decision-making regarding stock investment.

SE Oct 11, 2021
Automatic Recall of Software Lessons Learned for Software Project Managers

Tamer Mohamed Abdellatif, Luiz Fernando Capretz, Danny Ho

Lessons learned (LL) records constitute the software organization memory of successes and failures. LL are recorded within the organization repository for future reference to optimize planning, gain experience, and elevate market competitiveness. However, manually searching this repository is a daunting task, so it is often disregarded. This can lead to the repetition of previous mistakes or even missing potential opportunities. This, in turn, can negatively affect the profitability and competitiveness of organizations. We aim to present a novel solution that provides an automatic process to recall relevant LL and to push those LL to project managers. This will dramatically save the time and effort of manually searching the unstructured LL repositories and thus encourage LL exploitation. We exploit existing project artifacts to build the LL search queries on-the-fly in order to bypass the tedious manual searching. An empirical case study is conducted to build the automatic LL recall solution and evaluate its effectiveness. The study employs three of the most popular information retrieval models to construct the solution. Furthermore, a real-world dataset of 212 LL records from 30 different software projects is used for validation. The well-known accuracy metrics top-k and MAP are used as well. Our case study results confirm the effectiveness of the automatic LL recall solution. Also, the results prove the success of using existing project artifacts to dynamically build the search query string. This is supported by a discerning accuracy of about 70% achieved in the case of top-k. The automatic LL recall solution is valid with high accuracy. It will eliminate the effort needed to manually search the LL repository. Therefore, this will positively encourage project managers to reuse the available LL knowledge, which will avoid old pitfalls and unleash hidden business opportunities.
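
The top-k and MAP metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes each query yields a ranked list of LL record IDs plus a set of IDs judged relevant; the function names and the sample IDs are hypothetical, and the paper's exact metric definitions may differ.

```python
from typing import List, Sequence, Set, Tuple

# One "run" = (ranked result IDs for a query, set of IDs judged relevant).
Run = Tuple[Sequence[str], Set[str]]


def top_k_accuracy(runs: List[Run], k: int) -> float:
    """Fraction of queries with at least one relevant record in the first k results."""
    hits = sum(any(doc in relevant for doc in ranked[:k]) for ranked, relevant in runs)
    return hits / len(runs)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@rank taken at each rank where a relevant record appears."""
    if not relevant:
        return 0.0
    found = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            found += 1
            total += found / rank
    return total / len(relevant)


def mean_average_precision(runs: List[Run]) -> float:
    """MAP: average precision averaged over all queries."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical record IDs, for illustration only.
    runs = [
        (["LL3", "LL7", "LL1"], {"LL1", "LL3"}),
        (["LL2", "LL9", "LL4"], {"LL4"}),
    ]
    print(top_k_accuracy(runs, k=1))
    print(mean_average_precision(runs))
```

Top-k only asks whether any relevant record surfaces near the top, while MAP also rewards ranking relevant records earlier, which is why the two are commonly reported together.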

SE May 22, 2020
Updating Weight Values for Function Point Counting

Wei Xia, Danny Ho, Luiz Fernando Capretz et al.

While software development productivity has grown rapidly, the weight values assigned to count standard Function Point (FP) created at IBM twenty-five years ago have never been updated. This obsolescence raises critical questions about the validity of the weight values; it also creates other problems such as ambiguous classification, crisp boundary, as well as subjective and locally defined weight values. All of these challenges reveal the need to calibrate FP in order to reflect both the specific software application context and the trend of today's software development techniques more accurately. We have created an FP calibration model that incorporates the learning ability of neural networks as well as the capability of capturing human knowledge using fuzzy logic. The empirical validation using ISBSG Data Repository (release 8) shows an average improvement of 22% in the accuracy of software effort estimations with the new calibration.
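
For context on the weight values this abstract refers to, the sketch below computes an unadjusted function point count with the standard complexity weight table. The function name and the sample counts are hypothetical, and the neuro-fuzzy calibration itself is not reproduced here; the `weights` parameter only marks where a calibrated table would be substituted.

```python
from typing import Dict, Tuple

# Standard function point complexity weights per component type.
# EI = external input, EO = external output, EQ = external inquiry,
# ILF = internal logical file, EIF = external interface file.
STANDARD_WEIGHTS: Dict[str, Dict[str, int]] = {
    "EI": {"low": 3, "average": 4, "high": 6},
    "EO": {"low": 4, "average": 5, "high": 7},
    "EQ": {"low": 3, "average": 4, "high": 6},
    "ILF": {"low": 7, "average": 10, "high": 15},
    "EIF": {"low": 5, "average": 7, "high": 10},
}


def unadjusted_function_points(
    counts: Dict[Tuple[str, str], int],
    weights: Dict[str, Dict[str, float]] = STANDARD_WEIGHTS,
) -> float:
    """Sum count * weight over every (component type, complexity level) pair."""
    return sum(weights[kind][level] * n for (kind, level), n in counts.items())


if __name__ == "__main__":
    # Hypothetical component counts for a small application.
    counts = {("EI", "low"): 5, ("EO", "average"): 4, ("ILF", "high"): 2}
    print(unadjusted_function_points(counts))  # 5*3 + 4*5 + 2*15 = 65
```

Because each component falls into exactly one of three levels, two components on either side of a classification threshold receive very different weights; this is the crisp boundary problem the abstract mentions.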

ST Jun 12, 2019
Neural Network Models for Stock Selection Based on Fundamental Analysis

Yuxuan Huang, Luiz Fernando Capretz, Danny Ho

Application of neural network architectures for financial prediction has been actively studied in recent years. This paper presents a comparative study that investigates and compares feed-forward neural network (FNN) and adaptive neural fuzzy inference system (ANFIS) on stock prediction using fundamental financial ratios. The study is designed to evaluate the performance of each architecture based on the relative return of the selected portfolios with respect to the benchmark stock index. The results show that both architectures possess the ability to separate winners and losers from a sample universe of stocks, and the selected portfolios outperform the benchmark. Our study argues that FNN shows superior performance over ANFIS.

RO Mar 23, 2019
HouseExpo: A Large-scale 2D Indoor Layout Dataset for Learning-based Algorithms on Mobile Robots

Tingguang Li, Danny Ho, Chenming Li et al.

As one of the most promising areas, mobile robots have drawn much attention in recent years. Current work in this field is often evaluated in a few manually designed scenarios, due to the lack of a common experimental platform. Meanwhile, with the recent development of deep learning techniques, some researchers attempt to apply learning-based methods to mobile robot tasks, which require a substantial amount of data. To satisfy the underlying demand, in this paper we build HouseExpo, a large-scale indoor layout dataset containing 35,126 2D floor plans including 252,550 rooms in total. In addition, we develop Pseudo-SLAM, a lightweight and efficient simulation platform to accelerate the data generation procedure, thereby speeding up the training process. In our experiments, we build models to tackle obstacle avoidance and autonomous exploration from a learning perspective in simulation as well as real-world experiments to verify the effectiveness of our simulator and dataset. All the data and code are available online and we hope HouseExpo and Pseudo-SLAM can feed the need for data and benefit the whole community.

SE Dec 4, 2016
Enhancing Use Case Points Estimation Method Using Soft Computing Techniques

Ali Bou Nassif, Luiz Fernando Capretz, Danny Ho

Software estimation is a crucial task in software engineering. Software estimation encompasses cost, effort, schedule, and size. The importance of software estimation becomes critical in the early stages of the software life cycle when the details of software have not been revealed yet. Several commercial and non-commercial tools exist to estimate software in the early stages. Most software effort estimation methods require software size as one of the important metric inputs and consequently, software size estimation in the early stages becomes essential. One of the approaches that has been used for about two decades in the early size and effort estimation is called use case points. Use case points method relies on the use case diagram to estimate the size and effort of software projects. Although the use case points method has been widely used, it has some limitations that might adversely affect the accuracy of estimation. This paper presents some techniques using fuzzy logic and neural networks to improve the accuracy of the use case points method. Results showed that an improvement up to 22% can be obtained using the proposed approach.
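
As background for the use case points method discussed in this abstract, the following sketch computes the classic (unenhanced) use case points value. It assumes the commonly published weights and adjustment formulas; the function name and the sample inputs are hypothetical, and the fuzzy logic and neural network enhancements proposed in the paper are not shown.

```python
from typing import Dict

# Commonly published weights for the classic use case points method.
ACTOR_WEIGHTS = {"simple": 1, "average": 2, "complex": 3}
USE_CASE_WEIGHTS = {"simple": 5, "average": 10, "complex": 15}


def use_case_points(
    actors: Dict[str, int],
    use_cases: Dict[str, int],
    technical_factor: float,
    environmental_factor: float,
) -> float:
    """Classic UCP = (UAW + UUCW) * TCF * ECF.

    technical_factor and environmental_factor are the weighted sums of the
    technical and environmental factor ratings, respectively.
    """
    uaw = sum(ACTOR_WEIGHTS[c] * n for c, n in actors.items())
    uucw = sum(USE_CASE_WEIGHTS[c] * n for c, n in use_cases.items())
    tcf = 0.6 + 0.01 * technical_factor
    ecf = 1.4 - 0.03 * environmental_factor
    return (uaw + uucw) * tcf * ecf


if __name__ == "__main__":
    # Hypothetical project: 3 actors, 6 use cases.
    ucp = use_case_points(
        actors={"simple": 2, "complex": 1},
        use_cases={"average": 4, "complex": 2},
        technical_factor=40,
        environmental_factor=10,
    )
    print(round(ucp, 2))  # (5 + 70) * 1.0 * 1.1, about 82.5
```

The abrupt jump between the three use case classes (5, 10, 15) is one of the limitations that a fuzzy treatment of complexity is typically meant to soften.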

SE Nov 29, 2016
Neural Network Models for Software Development Effort Estimation: A Comparative Study

Ali Bou Nassif, Mohammad Azzeh, Luiz Fernando Capretz et al.

Software development effort estimation (SDEE) is one of the main tasks in software project management. It is crucial for a project manager to efficiently predict the effort or cost of a software project in a bidding process, since overestimation will lead to bidding loss and underestimation will cause the company to lose money. Several SDEE models exist; machine learning models, especially neural network models, are among the most prominent in the field. In this study, four different neural network models: Multilayer Perceptron, General Regression Neural Network, Radial Basis Function Neural Network, and Cascade Correlation Neural Network are compared with each other based on: (1) predictive accuracy centered on the Mean Absolute Error criterion, (2) whether such a model tends to overestimate or underestimate, and (3) how each model classifies the importance of its inputs. Industrial datasets from the International Software Benchmarking Standards Group (ISBSG) are used to train and validate the four models. The main ISBSG dataset was filtered and then divided into five datasets based on the productivity value of each project. Results show that the four models tend to overestimate in 80 percent of the datasets, and the significance of the model inputs varies based on the selected model. Furthermore, the Cascade Correlation Neural Network outperforms the other three models in the majority of the datasets constructed on the Mean Absolute Residual criterion.

SE Dec 1, 2015
A Hybrid Intelligent Model for Software Cost Estimation

Wei Lin Du, Luiz Fernando Capretz, Ali Bou Nassif et al.

Accurate software development effort estimation is critical to the success of software projects. Although many techniques and algorithmic models have been developed and implemented by practitioners, accurate software development effort prediction is still a challenging endeavor in the field of software engineering, especially in handling uncertain and imprecise inputs and collinear characteristics. In this paper, a hybrid intelligent model combining a neural network model integrated with fuzzy model (neuro-fuzzy model) has been used to improve the accuracy of estimating software cost. The performance of the proposed model is assessed by designing and conducting evaluation with published project and industrial data. Results have shown that the proposed model demonstrates the ability of improving the estimation accuracy by 18% based on the Mean Magnitude of Relative Error (MMRE) criterion.

SE Nov 12, 2015
Software Analytics to Software Domains: A Systematic Literature Review

Tamer Mohamed Abdellatif, Luiz Fernando Capretz, Danny Ho

Software Analytics (SA) is a new branch of big data analytics that has recently emerged (2011). What distinguishes SA from direct software analysis is that it links data mined from many different software artifacts to obtain valuable insights. These insights are useful for the decision-making process throughout the different phases of the software lifecycle. Since SA is currently a hot and promising topic, we have conducted a systematic literature review, presented in this paper, to identify gaps in knowledge and open research areas in SA. Because many researchers are still confused about the true potential of SA, we had to filter out available research papers to obtain the most SA-relevant work for our review. This filtration yielded 19 studies out of 135. We have based our systematic review on four main factors: which software practitioners SA targets, which domains are covered by SA, which artifacts are extracted by SA, and whether these artifacts are linked or not. The results of our review have shown that much of the available SA research only serves the needs of developers. Also, much of the available research uses only one artifact which, in turn, means fewer links between artifacts and fewer insights. This shows that the available SA research work is still embryonic leaving plenty of room for future research in the SA field.

SE Aug 28, 2015
A Comparison Between Decision Trees and Decision Tree Forest Models for Software Development Effort Estimation

Ali Bou Nassif, Mohammad Azzeh, Luiz Fernando Capretz et al.

Accurate software effort estimation has been a challenge for many software practitioners and project managers. Underestimation leads to disruption in the project's estimated cost and delivery. On the other hand, overestimation causes outbidding and financial losses in business. Many software estimation models exist; however, none have been proven to be the best in all situations. In this paper, a decision tree forest (DTF) model is compared to a traditional decision tree (DT) model, as well as a multiple linear regression model (MLR). The evaluation was conducted using ISBSG and Desharnais industrial datasets. Results show that the DTF model is competitive and can be used as an alternative in software effort prediction.

SE Aug 25, 2015
A Neuro-Fuzzy Method to Improving Backfiring Conversion Ratios

Justin Wong, Danny Ho, Luiz Fernando Capretz

Software project estimation is a crucial aspect in delivering software on time and on budget. Software size is an important metric in determining the effort, cost, and productivity. Today, source lines of code and function points are the most used sizing metrics. Backfiring is a well-known technique for converting between function points and source lines of code. However, when backfiring is used, there is a high margin of error. This study introduces a method to improve the accuracy of backfiring. Intelligent systems have been used in software prediction models to improve performance over traditional techniques. For this reason, a hybrid Neuro-Fuzzy approach is used because it takes advantage of neural networks' learning and fuzzy logic's human-like reasoning. This paper describes an improved backfiring technique which uses Neuro-Fuzzy and compares the new method against the default conversion ratios currently used by software practitioners.
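
Plain backfiring, as described in this abstract, amounts to multiplying or dividing by a language-specific conversion ratio. The sketch below shows that baseline only; the ratio values are illustrative placeholders rather than figures from the paper, the function names are hypothetical, and the Neuro-Fuzzy refinement is not reproduced.

```python
from typing import Dict

# SLOC per function point, by language. These numbers are placeholders for
# illustration; practitioners look the ratios up in published backfiring tables.
ILLUSTRATIVE_RATIOS: Dict[str, float] = {"C": 128.0, "Java": 53.0}


def fp_to_sloc(function_points: float, language: str,
               ratios: Dict[str, float] = ILLUSTRATIVE_RATIOS) -> float:
    """Estimate source lines of code from a function point count."""
    return function_points * ratios[language]


def sloc_to_fp(sloc: float, language: str,
               ratios: Dict[str, float] = ILLUSTRATIVE_RATIOS) -> float:
    """Estimate function points from a source line count (the reverse direction)."""
    return sloc / ratios[language]


if __name__ == "__main__":
    print(fp_to_sloc(100, "Java"))
    print(sloc_to_fp(12800, "C"))
```

A single fixed ratio per language ignores project-specific variation, which is one source of the high margin of error the abstract points to.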

SE Jul 31, 2015
Neuro-Fuzzy Algorithmic (NFA) Models and Tools for Estimation

Danny Ho, Luiz Fernando Capretz, Xishi Huang et al.

Accurate estimation, such as cost estimation, quality estimation and risk analysis, is a major issue in management. We propose a patent-pending soft computing framework to tackle this challenging problem. Our generic framework is independent of the nature and type of estimation. It consists of neural network, fuzzy logic, and an algorithmic estimation model. We made use of the Constructive Cost Model (COCOMO), Analysis of Variance (ANOVA), and Function Point Analysis as the algorithmic models and validated the accuracy of the Neuro-Fuzzy Algorithmic (NFA) Model in software cost estimation using industrial project data. Our model produces more accurate estimation than using an algorithmic model alone. We also discuss the prototypes of our tools that implement the NFA Model. We conclude with our roadmap and direction to enrich the model in tackling different estimation challenges.

SE Jul 31, 2015
An Intelligent Approach to Software Cost Prediction

Xishi Huang, Luiz Fernando Capretz, Danny Ho et al.

Good software cost prediction is important for effective project management such as budgeting, project planning and control. In this paper, we present an intelligent approach to software cost prediction. By integrating the neuro-fuzzy technique with the well-accepted COCOMO model, our approach can make the best use of both expert knowledge and historical project data. Its major advantages include learning ability, good interpretability, and robustness to imprecise and uncertain inputs. The validation using industry project data shows that the model greatly improves prediction accuracy in comparison with the COCOMO model.
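
For readers unfamiliar with the COCOMO baseline this abstract builds on, here is a minimal sketch of the Basic COCOMO effort equation, effort = a * KLOC^b, using the commonly published coefficients for the three project modes. The abstract does not say which COCOMO variant was used, so treating it as Basic COCOMO is an assumption; the function name is hypothetical and the neuro-fuzzy layer is not shown.

```python
# Basic COCOMO coefficients (a, b) per project mode; effort is in person-months.
BASIC_COCOMO = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_cocomo_effort(kloc: float, mode: str = "organic") -> float:
    """Effort in person-months for a project of `kloc` thousand lines of code."""
    a, b = BASIC_COCOMO[mode]
    return a * kloc ** b


if __name__ == "__main__":
    for mode in BASIC_COCOMO:
        print(mode, round(basic_cocomo_effort(10, mode), 1))
```

The fixed coefficients are where expert knowledge is encoded; a learning component, as the abstract describes, would adjust the estimate using historical project data instead of relying on these constants alone.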

SE Jul 31, 2015
A Neuro-Fuzzy Model with SEER-SEM for Software Effort Estimation

Wei Lin Du, Danny Ho, Luiz Fernando Capretz

Software effort estimation is a critical part of software engineering. Although many techniques and algorithmic models have been developed and implemented by practitioners, accurate software effort prediction is still a challenging endeavor. In order to address this issue, a novel soft computing framework was previously developed. Our study utilizes this novel framework to develop an approach combining the neuro-fuzzy technique with the System Evaluation and Estimation of Resource - Software Estimation Model (SEER-SEM). Moreover, our study assesses the performance of the proposed model by designing and conducting evaluation with published industrial project data. After analyzing the performance of our model in comparison to the SEER-SEM effort estimation model alone, the proposed model demonstrates the ability of improving the estimation accuracy, especially in its ability to reduce the large Mean Relative Error (MRE). Furthermore, the results of this research indicate that the general neuro-fuzzy framework can work with various algorithmic models for improving the performance of software effort estimation.

SE Jul 31, 2015
Calibrating Function Points Using Neuro-Fuzzy Technique

Wei Xia, Danny Ho, Luiz Fernando Capretz

The concepts of calibrating Function Points are discussed, whose aims are to fit specific software application, to reflect software industry trend, and to improve cost estimation. Neuro-Fuzzy is a technique which incorporates the learning ability from neural network and the ability to capture human knowledge from fuzzy logic. The empirical validation using ISBSG data repository Release 8 shows a 22% improvement in software effort estimation after calibration using Neuro-Fuzzy technique.

SE Jul 24, 2015
A Neuro-Fuzzy Model for Function Point Calibration

Wei Xia, Danny Ho, Luiz Fernando Capretz

The need to update the calibration of Function Point (FP) complexity weights is discussed, whose aims are to fit specific software application, to reflect software industry trend, and to improve cost estimation. Neuro-Fuzzy is a technique that incorporates the learning ability from neural network and the ability to capture human knowledge from fuzzy logic. The empirical validation using ISBSG data repository Release 8 shows a 22% improvement in software effort estimation after calibration using Neuro-Fuzzy technique.

SE Jul 24, 2015
An Empirical Study on the Procedure to Derive Software Quality Estimation Models

Jie Xu, Danny Ho, Luiz Fernando Capretz

Software quality assurance has been a heated topic for several decades. If factors that influence software quality can be identified, they may provide more insight for better software development management. More precise quality assurance can be achieved by employing resources according to accurate quality estimation at the early stages of a project. In this paper, a general procedure is proposed to derive software quality estimation models and various techniques are presented to accomplish the tasks in respective steps. Several statistical techniques together with machine learning method are utilized to verify the effectiveness of software metrics. Moreover, a neuro-fuzzy approach is adopted to improve the accuracy of the estimation model. This procedure is carried out based on data from the ISBSG repository to present its empirical value.

SE Jul 24, 2015
Improving Software Effort Estimation Using Neuro-Fuzzy Model with SEER-SEM

Wei Lin Du, Danny Ho, Luiz Fernando Capretz

The aims of our research are to evaluate the prediction performance of the proposed neuro-fuzzy model with System Evaluation and Estimation of Resource Software Estimation Model (SEER-SEM) in software estimation practices and to apply the proposed architecture that combines the neuro-fuzzy technique with different algorithmic models. In this paper, an approach combining the neuro-fuzzy technique and the SEER-SEM effort estimation algorithm is described. This proposed model possesses positive characteristics such as learning ability, decreased sensitivity, effective generalization, and knowledge integration for introducing the neuro-fuzzy technique. Moreover, continuous rating values and linguistic values can be inputs of the proposed model for avoiding the large estimation deviation among similar projects. The performance of the proposed model is assessed by designing and conducting evaluation with published projects and industrial data. The evaluation results indicate that estimation with our proposed neuro-fuzzy model containing SEER-SEM is improved in comparison with the estimation results that only use the SEER-SEM algorithm. At the same time, the results of this research also demonstrate that the general neuro-fuzzy framework can function with various algorithmic models for improving the performance of software effort estimation.

SE May 6, 2014
Analyzing the Non-Functional Requirements in the Desharnais Dataset for Software Effort Estimation

Ali Bou Nassif, Luiz Fernando Capretz, Danny Ho

Studying the quality requirements (aka Non-Functional Requirements (NFR)) of a system is crucial in Requirements Engineering. Many software projects fail because of neglecting or failing to incorporate the NFR during the software development life cycle. This paper focuses on analyzing the importance of the quality requirements attributes in software effort estimation models based on the Desharnais dataset. The Desharnais dataset is a collection of eighty-one software projects with twelve attributes developed by a Canadian software house. The analysis includes studying the influence of each of the quality requirements attributes, as well as the influence of all quality requirements attributes combined when calculating software effort using regression and Artificial Neural Network (ANN) models. The evaluation criteria used in this investigation include the Mean of the Magnitude of Relative Error (MMRE), the Prediction Level (PRED), Root Mean Squared Error (RMSE), Mean Error and the Coefficient of determination (R²). Results show that the quality attribute Language is the most statistically significant when calculating software effort. Moreover, if all quality requirements attributes are eliminated in the training stage and software effort is predicted based on software size only, the value of the error (MMRE) is doubled.
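
The evaluation criteria listed in this abstract (and used in several of the effort estimation papers above) can be sketched with their usual textbook definitions. The function names and the sample effort values are hypothetical, and the 0.25 threshold for PRED is a common convention rather than something stated in the abstract.

```python
import math
from typing import Sequence


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Magnitude of Relative Error: mean of |actual - predicted| / actual."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def pred(actual: Sequence[float], predicted: Sequence[float], level: float = 0.25) -> float:
    """PRED(level): fraction of projects whose relative error is at most `level`."""
    within = sum(abs(a - p) / a <= level for a, p in zip(actual, predicted))
    return within / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical effort values in person-hours.
    actual = [100.0, 200.0, 400.0]
    predicted = [110.0, 150.0, 400.0]
    print(mmre(actual, predicted), pred(actual, predicted), rmse(actual, predicted))
    print(r_squared(actual, predicted))
```

MMRE and PRED are relative measures, so small projects weigh as much as large ones, whereas RMSE is dominated by the largest absolute errors; reporting both kinds gives a fuller picture of model accuracy.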