SE Dec 15, 2021
Long-Term Productivity Based on Science, not Preference
Spencer Smith, Jacques Carette
This position paper argues that decisions on processes, tools, techniques and software artifacts (such as user manuals, unit tests, design documents and code) for scientific software development should be driven by science, not by personal preference. Decisions should not be based on anecdotal evidence, gut instinct or the path of least resistance. Moreover, decisions should vary depending on the users and the context. In most cases of interest, this means that a longer term view should be adopted. We need to use a scientific approach based on unambiguous definitions, empirical evidence, hypothesis testing and rigorous processes. By developing an understanding of where input hours are spent, what most contributes to user satisfaction, and how to leverage knowledge produced, we can determine what interventions have the greatest value relative to the invested effort. We will be able to recommend software production processes that justify their value because the long-term output benefits are high compared to the required input resources. A preliminary definition of productivity is presented, along with ideas on how to potentially measure this quality. We briefly explore the idea of improving productivity via an approach where all artifacts are generated from codified knowledge.
SE Oct 22, 2021
Methodology for Assessing the State of the Practice for Domain X
Spencer Smith, Jacques Carette, Peter Michalski et al.
To improve software development methods and tools for research software, we first need to understand the current state of the practice. Therefore, we have developed a methodology for assessing the state of the software development practices for a given research software domain. For each domain we wish to answer questions such as: i) What artifacts (documents, code, test cases, etc.) are present? ii) What tools are used? iii) What principles, processes and methodologies are used? iv) What are the pain points for developers? v) What actions are used to improve qualities like maintainability and reproducibility? To answer these questions, our methodology prescribes the following steps: i) Identify the domain; ii) Identify a list of candidate software packages; iii) Filter the list to a length of about 30 packages; iv) Gather source code and documentation for each package; v) Collect repository-related data on each software package, such as number of stars, number of open issues and number of lines of code; vi) Fill in the measurement template (108 questions assessing 9 qualities, including installability, usability and visibility); vii) Interview developers (the interview consists of 20 questions and takes about an hour); viii) Rank the software using the Analytic Hierarchy Process (AHP); and, ix) Analyze the data to answer the questions posed above. A domain expert should be engaged throughout the process, to ensure that implicit information about the domain is properly represented and to assist with conducting an analysis of the commonalities and variabilities between the 30 selected packages. Using our methodology, spreadsheet templates and AHP tool, we estimate (based on our experience with using the process) the time to complete an assessment for a given domain at 173 person-hours.
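The AHP ranking step (viii) can be illustrated with a minimal sketch. This is not the authors' AHP tool: it assumes a single quality, a hypothetical reciprocal pairwise-comparison matrix for three made-up packages, and the row geometric-mean approximation of the AHP priority vector (the principal-eigenvector method is the other common choice). The names `ahp_priorities` and `rank_packages` are introduced here for illustration only.

```python
import math


def ahp_priorities(pairwise):
    """Approximate AHP priority weights via the row geometric-mean method.

    pairwise[i][j] states how strongly item i is preferred over item j;
    a consistent matrix has pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    if any(len(row) != n for row in pairwise):
        raise ValueError("pairwise matrix must be square")
    # Geometric mean of each row, then normalize so the weights sum to 1.
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def rank_packages(names, pairwise):
    """Return (name, weight) pairs sorted from highest to lowest weight."""
    weights = ahp_priorities(pairwise)
    return sorted(zip(names, weights), key=lambda nw: nw[1], reverse=True)


# Hypothetical data: three packages compared on one quality (e.g. installability).
names = ["pkg_a", "pkg_b", "pkg_c"]
pairwise = [
    [1.0, 3.0, 5.0],      # pkg_a vs. (pkg_a, pkg_b, pkg_c)
    [1 / 3, 1.0, 2.0],    # pkg_b vs. ...
    [1 / 5, 1 / 2, 1.0],  # pkg_c vs. ...
]
ranking = rank_packages(names, pairwise)
for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
```

In a full assessment, one such matrix would presumably be built per quality and the per-quality weights combined into an overall score; how the comparisons are derived from the measurement template is not specified in the abstract, so that step is left out here.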
SE Sep 29, 2020
Long-term Productivity for Long-term Impact
Spencer Smith, Jacques Carette
We present a new conceptual definition of 'productivity' for sustainably developing research software. Existing definitions are flawed because they are biased toward the short term, thus devaluing long-term impact, which we consider to be the principal goal. Taking a long-term view of productivity helps fix that problem. We view the outputs of the development process as knowledge and user satisfaction. User satisfaction is used as a proxy for effective quality. The explicit emphasis on all knowledge produced, rather than just the operationalizable knowledge (code), implies that human-reusable knowledge, i.e. documentation, should also be greatly valued when producing research software.
SE Dec 31, 2019
Building Confidence in Scientific Computing Software Via Assurance Cases
Spencer Smith, Mojdeh Sayari Nejad, Alan Wassyng
Assurance cases provide an organized and explicit argument for correctness. They can dramatically improve the certification of Scientific Computing Software (SCS). Assurance cases have already been effectively used for safety cases for real-time systems. Their advantages for SCS include engaging domain experts, producing only necessary documentation, and providing evidence that can be verified/replicated. This paper illustrates assurance cases for SCS through the correctness case for 3dfim+, an existing Medical Imaging Application (MIA) for analyzing activity in the brain. This example was partly chosen because of recent concerns about the validity of fMRI (Functional Magnetic Resonance Imaging) studies. The example justifies the value of assurance cases for SCS, since the existing documentation is shown to have ambiguities and omissions, such as an incompletely defined ranking function and missing details on the coordinate system. A serious concern for 3dfim+ is identified: running the software does not produce any warning about the necessity of using data that matches the parametric statistical model employed for the correlation calculations. Raising the bar for SCS in general, and MIA in particular, is both feasible and necessary: when software impacts safety, an assurance case methodology (or an equivalently rigorous confidence-building methodology) should be employed.
SE Jun 18, 2019
Debunking the Myth that Upfront Requirements are Infeasible for Scientific Computing Software
Spencer Smith, Malavika Srinivasan, Sumanth Shankar
Many in the Scientific Computing Software (SCS) community believe that upfront requirements are impossible, or at least infeasible. This paper shows requirements are feasible with the following: i) an appropriate perspective ("faking" the final documentation as if requirements were correct and complete from the start, and gathering requirements as if for a family of programs); ii) the aid of the right principles (abstraction, separation of concerns, anticipation of change, and generality); iii) employing SCS-specific templates (for Software Requirements and Module Interface Specification); iv) using a design process that enables change (information hiding); and, v) the aid of modern tools (version control, issue tracking, checking, generation and automation tools). Not only are upfront requirements feasible, they provide significant benefits, including facilitating communication, early identification of errors, better design decisions and enabling replicability. The topics listed above are explained, justified and illustrated via an example of software developed by a small team of software and mechanical engineers for modelling the solidification of a metal alloy.
SE Feb 20, 2018
Statistical Software for Psychology: Comparing Development Practices Between CRAN and Other Communities
Spencer Smith, Yue Sun, Jacques Carette
Different communities rely heavily on software, but use quite different software development practices. Objective: We wanted to measure the state of the practice in the area of statistical software for psychology to understand how it compares to best practices. Method: We compared and ranked 30 software tools with respect to adherence to best software engineering practices on items that could be measured by end-users. Results: We found that R packages use quite good practices; that while commercial packages were quite usable, many aspects of their development are too opaque to be measured; and that research projects vary a lot in their practices. Conclusion: We recommend that more organizations adopt practices similar to those used by CRAN to facilitate success, even for small teams. We also recommend close coupling of source code and documentation, to improve verifiability.