37.3SEMar 20
The Nature of Technical Debt in Research SoftwareNeil A. Ernst, Ahmed Musa Awon, Swapnil Hingmire et al.
Research software (also called scientific software) is essential for advancing scientific endeavours. Research software encapsulates complex algorithms and domain-specific knowledge and is a fundamental component of all science. A pervasive challenge in developing research software is technical debt, which can adversely affect reliability, maintainability, and scientific validity. Research software often relies on the initiative of the scientific community for maintenance, requiring diverse expertise in both scientific and software engineering domains. The extent and nature of technical debt in research software are little studied, in particular, what forms it takes, and what the science teams developing this software think about their technical debt. In this paper we describe our multi-method study examining technical debt in research software. We begin by examining instances of self-reported technical debt in research code, examining 28k code comments across nine research software projects. Then, building on our findings, we interview research software engineers and scientists about how this technical debt manifests itself in their experience, and what costs it has for research software and research outputs more generally. We identify nine types of self-admitted technical debt unique to research software, and four themes impacting this technical debt.
CLJul 25, 2025
Objectifying the Subjective: Cognitive Biases in Topic InterpretationsSwapnil Hingmire, Ze Shi Li, Shiyu et al.
Interpretation of topics is crucial for their downstream applications. State-of-the-art evaluation measures of topic quality such as coherence and word intrusion do not measure how much a topic facilitates the exploration of a corpus. To design evaluation measures grounded on a task, and a population of users, we do user studies to understand how users interpret topics. We propose constructs of topic quality and ask users to assess them in the context of a topic and provide rationale behind evaluations. We use reflexive thematic analysis to identify themes of topic interpretations from rationales. Users interpret topics based on availability and representativeness heuristics rather than probability. We propose a theory of topic interpretation based on the anchoring-and-adjustment heuristic: users anchor on salient words and make semantic adjustments to arrive at an interpretation. Topic interpretation can be viewed as making a judgment under uncertainty by an ecologically rational user, and hence cognitive biases aware user models and evaluation frameworks are needed.
CLJan 7, 2022
A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph GenerationIrene Li, Thomas George, Alexander Fabbri et al.
Effective human learning depends on a wide selection of educational materials that align with the learner's current understanding of the topic. While the Internet has revolutionized human learning or education, a substantial resource accessibility barrier still exists. Namely, the excess of online information can make it challenging to navigate and discover high-quality learning materials. In this paper, we propose the educational resource discovery (ERD) pipeline that automates web resource discovery for novel domains. The pipeline consists of three main steps: data collection, feature extraction, and resource classification. We start with a known source domain and conduct resource discovery on two unseen target domains via transfer learning. We first collect frequent queries from a set of seed documents and search on the web to obtain candidate resources, such as lecture slides and introductory blog posts. Then we introduce a novel pretrained information retrieval deep neural network model, query-document masked language modeling (QD-MLM), to extract deep features of these candidate resources. We apply a tree-based classifier to decide whether the candidate is a positive learning resource. The pipeline achieves F1 scores of 0.94 and 0.82 when evaluated on two similar but novel target domains. Finally, we demonstrate how this pipeline can benefit an application: leading paragraph generation for surveys. This is the first study that considers various web resources for survey generation, to the best of our knowledge. We also release a corpus of 39,728 manually labeled web resources and 659 queries from NLP, Computer Vision (CV), and Statistics (STATS).
CLDec 16, 2021
CLICKER: A Computational LInguistics Classification Scheme for Educational ResourcesSwapnil Hingmire, Irene Li, Rena Kawamura et al.
A classification scheme of a scientific subject gives an overview of its body of knowledge. It can also be used to facilitate access to research articles and other materials related to the subject. For example, the ACM Computing Classification System (CCS) is used in the ACM Digital Library search interface and also for indexing computer science papers. We observed that a comprehensive classification system like CCS or Mathematics Subject Classification (MSC) does not exist for Computational Linguistics (CL) and Natural Language Processing (NLP). We propose a classification scheme -- CLICKER for CL/NLP based on the analysis of online lectures from 77 university courses on this subject. The currently proposed taxonomy includes 334 topics and focuses on educational aspects of CL/NLP; it is based primarily, but not exclusively, on lecture notes from NLP courses. We discuss how such a taxonomy can help in various real-world applications, including tutoring platforms, resource retrieval, resource recommendation, prerequisite chain learning, and survey generation.
CLApr 22, 2020
R-VGAE: Relational-variational Graph Autoencoder for Unsupervised Prerequisite Chain LearningIrene Li, Alexander Fabbri, Swapnil Hingmire et al.
The task of concept prerequisite chain learning is to automatically determine the existence of prerequisite relationships among concept pairs. In this paper, we frame learning prerequisite relationships among concepts as an unsupervised task with no access to labeled concept pairs during training. We propose a model called the Relational-Variational Graph AutoEncoder (R-VGAE) to predict concept relations within a graph consisting of concept and resource nodes. Results show that our unsupervised approach outperforms graph-based semi-supervised methods and other baseline methods by up to 9.77% and 10.47% in terms of prerequisite relation prediction accuracy and F1 score. Our method is notably the first graph-based model that attempts to make use of deep learning representations for the task of unsupervised prerequisite learning. We also expand an existing corpus which totals 1,717 English Natural Language Processing (NLP)-related lecture slide files and manual concept pair annotations over 322 topics.
CLDec 4, 2017
Topics and Label Propagation: Best of Both Worlds for Weakly Supervised Text ClassificationSachin Pawar, Nitin Ramrakhiyani, Swapnil Hingmire et al.
We propose a Label Propagation based algorithm for weakly supervised text classification. We construct a graph where each document is represented by a node and edge weights represent similarities among the documents. Additionally, we discover underlying topics using Latent Dirichlet Allocation (LDA) and enrich the document graph by including the topics in the form of additional nodes. The edge weights between a topic and a text document represent level of "affinity" between them. Our approach does not require document level labelling, instead it expects manual labels only for topic nodes. This significantly minimizes the level of supervision needed as only a few topics are observed to be enough for achieving sufficiently high accuracy. The Label Propagation Algorithm is employed on this enriched graph to propagate labels among the nodes. Our approach combines the advantages of Label Propagation (through document-document similarities) and Topic Modelling (for minimal but smart supervision). We demonstrate the effectiveness of our approach on various datasets and compare with state-of-the-art weakly supervised text classification approaches.