Fernando Nogueira

LG | Sep 21, 2016 | Code

Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning

Guillaume Lemaitre, Fernando Nogueira, Christos K. Aridas

Imbalanced-learn is an open-source Python toolbox that provides a wide range of methods to cope with the problem of imbalanced datasets, which is frequently encountered in machine learning and pattern recognition. The implemented state-of-the-art methods fall into four groups: (i) under-sampling, (ii) over-sampling, (iii) combination of over- and under-sampling, and (iv) ensemble learning methods. The toolbox depends only on numpy, scipy, and scikit-learn and is distributed under the MIT license. Furthermore, it is fully compatible with scikit-learn and is part of the scikit-learn-contrib supported projects. Documentation, unit tests, and integration tests are provided to ease usage and contribution. The toolbox is publicly available on GitHub: https://github.com/scikit-learn-contrib/imbalanced-learn.
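To make groups (i) and (ii) concrete, here is a minimal sketch of random under-sampling and random over-sampling written with the Python standard library only. It is not the toolbox's implementation; the function names `random_under_sample` and `random_over_sample`, the `seed` parameter, and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Collect the samples of each class label, preserving first-seen label order."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    return by_class


def random_under_sample(X, y, seed=0):
    """Drop samples at random until every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sampling without replacement keeps `target` distinct rows per class.
        for features in rng.sample(rows, target):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def random_over_sample(X, y, seed=0):
    """Duplicate samples at random until every class is as large as the biggest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Keep every original row, then add random duplicates to reach `target`.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy dataset: 10 samples of class 0 and 2 samples of class 1.
X = [[i] for i in range(12)]
y = [0] * 10 + [1] * 2

X_under, y_under = random_under_sample(X, y)
X_over, y_over = random_over_sample(X, y)
print(Counter(y), Counter(y_under), Counter(y_over))
```

Under-sampling discards majority-class data, while over-sampling repeats minority-class data; a combined method from group (iii) would apply both ideas in sequence. In the toolbox itself, comparable resamplers are exposed as scikit-learn-style estimators (for example `RandomUnderSampler` and `RandomOverSampler`), so the sketch above is only meant to show the underlying idea.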

61.5 | SE | Apr 30

One Size Fits All? An Empirical Comparison of ADR Templates regarding Comprehension, Usability, and Ease of Adoption

Fernando Nogueira, Nabson Silva, Tayana Conte

Context: Documenting Architectural Design Decisions (ADDs) is a critical factor in the software lifecycle, essential for efficient system maintenance, developer onboarding, and preventing knowledge vaporization. Although various templates for Architectural Decision Records (ADRs) have been proposed, there is a lack of empirical evidence comparing them. Goal: To address this gap, this paper aims to identify which of five ADR templates best supports comprehension, usability, and ease of adoption: Tyree/Akerman's template, Nygard's ADR, arc42, Y-statements, and MADR. Method: We compared these templates using the DESMET FA method in a two-step evaluation. First, the two primary authors evaluated the five templates through the DESMET FA, based on their software architecture expertise. The two top-performing templates were then used as treatments in a controlled experiment conducted with undergraduate students. Results: In the preliminary screening by experts, the top-performing templates were those of Nygard and MADR. In the subsequent controlled experiment, Nygard's template outperformed MADR in terms of the Overall Score. Qualitative analysis of participant feedback revealed the factors influencing template preference. The findings indicate that Nygard's template supports concise and objective documentation, while MADR facilitates the recording of structural details and specific architectural requirements. Conclusion: This paper provides an evidence-based strategy for ADR template adoption by offering an empirical comparison of these templates. The findings present a decision-making guide that assists practitioners and researchers in selecting ADR templates aligned with project constraints, aiming to minimize documentation overhead and increase architectural knowledge retention.
