Daniel Neagu

3 papers

249 citations

Novelty 32%

AI Score 20

Ranked #191,080 of 201,326 authors (top 95%); #41,311 in LG (top 97%)

3 Papers

SE, Mar 13, 2017

Software stage-effort estimation based on association rule mining and fuzzy set theory

Mohammad Azzeh, Peter I Cowling, Daniel Neagu

Relying on early effort estimation to predict the required number of resources is often not sufficient and can lead to under- or overestimation. It is widely acknowledged that the software development process should be refined regularly and that software predictions made at an early stage of development are still little more than guesses. Even good predictions are not sufficient given inherent uncertainty and risks. Stage-effort estimation allows the project manager to re-allocate the correct number of resources, re-schedule the project, and control project progress so that it finishes on time and within budget. In this paper we propose an approach that uses prior effort records to predict stage effort. The proposed model combines concepts from fuzzy set theory and association rule mining. The results were good in terms of prediction accuracy and show potential to deliver good stage-effort estimation.

SI, Oct 18, 2015

Social Media Analysis for Product Safety using Text Mining and Sentiment Analysis

Haruna Isah, Daniel Neagu, Paul Trundle

The growing incidence of counterfeiting and its associated economic and health consequences necessitate the development of active surveillance systems capable of producing timely and reliable information for all stakeholders in the anti-counterfeiting fight. User-generated content from social media platforms can provide early clues about product allergies, adverse events and product counterfeiting. This paper reports a work in progress with contributions including: the development of a framework for gathering and analyzing the views and experiences of users of drug and cosmetic products using machine learning, text mining and sentiment analysis; the application of the proposed framework to Facebook comments and data from Twitter for brand analysis; and a description of how to develop a product safety lexicon and training data for modeling a machine learning classifier for drug and cosmetic product sentiment prediction. The initial brand and product comparison results signify the usefulness of text mining and sentiment analysis on social media data, while the use of a machine learning classifier for predicting sentiment orientation provides a useful tool for users, product manufacturers, and regulatory and enforcement agencies to monitor brand or product sentiment trends in order to act in the event of a sudden or significant rise in negative sentiment.

LG, Dec 4, 2013

Interpreting random forest classification models using a feature contribution method

Anna Palczewska, Jan Palczewski, Richard Marchese Robinson et al.

Model interpretation is one of the key aspects of the model evaluation process. The explanation of the relationship between model variables and outputs is relatively easy for statistical models, such as linear regressions, thanks to the availability of model parameters and their statistical significance. For "black box" models, such as random forest, this information is hidden inside the model structure. This work presents an approach for computing feature contributions for random forest classification models. It allows for the determination of the influence of each variable on the model prediction for an individual instance. By analysing feature contributions for a training dataset, the most significant variables can be determined, and their typical contributions towards predictions made for individual classes, i.e., class-specific feature contribution "patterns", can be discovered. These patterns represent a standard behaviour of the model and allow for an additional assessment of the model's reliability for new data. Interpretation of feature contributions for two UCI benchmark datasets shows the potential of the proposed methodology. The robustness of the results is demonstrated through an extensive analysis of feature contributions calculated for a large number of generated random forest models.
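
The abstract above does not spell out how the contributions are computed, so the following is only a rough sketch of one way per-instance feature contributions could be obtained for a binary classifier. It assumes that each tree node stores the fraction of positive-class training instances that reach it, and that the change in this fraction at every split along an instance's decision path is credited to the split feature. The names `Node`, `tree_contributions` and `forest_contributions`, and the two hand-built trees, are hypothetical and used purely for illustration; this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node for a binary classification tree."""
    p_pos: float                    # fraction of positive-class training instances at this node
    feature: Optional[int] = None   # index of the split feature; None marks a leaf
    threshold: float = 0.0          # go left when x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_contributions(root: Node, x: List[float], n_features: int) -> Tuple[float, List[float]]:
    """Walk the decision path of x and credit each change in p_pos to the split feature."""
    contrib = [0.0] * n_features
    node = root
    while node.feature is not None:
        child = node.left if x[node.feature] <= node.threshold else node.right
        contrib[node.feature] += child.p_pos - node.p_pos
        node = child
    # root.p_pos acts as the baseline; baseline + sum(contrib) equals the leaf's p_pos.
    return root.p_pos, contrib


def forest_contributions(trees: List[Node], x: List[float], n_features: int) -> Tuple[float, List[float]]:
    """Average the baseline and the per-feature contributions over all trees."""
    baseline = 0.0
    total = [0.0] * n_features
    for tree in trees:
        b, c = tree_contributions(tree, x, n_features)
        baseline += b
        total = [t + v for t, v in zip(total, c)]
    n = len(trees)
    return baseline / n, [t / n for t in total]


# Two tiny hand-built trees (illustrative values only, not taken from any dataset).
tree_a = Node(0.5, feature=0, threshold=1.0,
              left=Node(0.2),
              right=Node(0.8, feature=1, threshold=0.0,
                         left=Node(0.6), right=Node(1.0)))
tree_b = Node(0.5, feature=1, threshold=0.5,
              left=Node(0.25), right=Node(0.75))

if __name__ == "__main__":
    baseline, contrib = forest_contributions([tree_a, tree_b], [2.0, 1.0], n_features=2)
    print("baseline:", baseline)
    print("contributions:", contrib)
    print("forest score:", baseline + sum(contrib))
```

Under these assumptions the baseline plus the summed contributions reproduces the forest's averaged positive-class score for the instance, which is what makes the per-feature values readable as a decomposition of a single prediction.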