Jacques Carette

SE
h-index 33
8 papers
18 citations
Novelty 14%
AI Score 18

8 Papers

SC Apr 30, 2010
Symbolic Domain Decomposition

Jacques Carette, Alan P. Sexton, Volker Sorge et al.

Decomposing the domain of a function into parts has many uses in mathematics. A domain may naturally be a union of pieces, a function may be defined by cases, or different boundary conditions may hold on different regions. For any particular problem the domain can be given explicitly, but when dealing with a family of problems given in terms of symbolic parameters, matters become more difficult. This article shows how hybrid sets, that is multisets allowing negative multiplicity, may be used to express symbolic domain decompositions in an efficient, elegant and uniform way, simplifying both computation and reasoning. We apply this theory to the arithmetic of piecewise functions and symbolic matrices and show how certain operations may be reduced from exponential to linear complexity.
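
As a rough illustration of the idea of hybrid sets (not the authors' implementation), the sketch below models a hybrid set as a mapping from elements to integer multiplicities. The names `HybridSet` and `interval` are hypothetical, and integer ranges stand in for real intervals. With oriented intervals, joining `[a, b)` and `[b, c)` gives `[a, c)` regardless of how `a`, `b` and `c` are ordered, which is the kind of case-free reasoning the abstract describes for symbolic parameters.

```python
class HybridSet:
    """Multiset whose multiplicities may be any integer, including negatives."""

    def __init__(self, mult=None):
        # Drop zero multiplicities so that equality compares only real content.
        self.mult = {x: m for x, m in (mult or {}).items() if m != 0}

    def __add__(self, other):
        # Pointwise sum of multiplicities.
        out = dict(self.mult)
        for x, m in other.mult.items():
            out[x] = out.get(x, 0) + m
        return HybridSet(out)

    def __neg__(self):
        return HybridSet({x: -m for x, m in self.mult.items()})

    def __eq__(self, other):
        return isinstance(other, HybridSet) and self.mult == other.mult

    def __repr__(self):
        return f"HybridSet({self.mult})"


def interval(a, b):
    """Oriented integer interval [a, b): multiplicity +1 if a <= b, else -1 on [b, a)."""
    if a <= b:
        return HybridSet({x: 1 for x in range(a, b)})
    return HybridSet({x: -1 for x in range(b, a)})


# Joining [0, 5) with the "backwards" interval [5, 3) cancels the overlap.
joined = interval(0, 5) + interval(5, 3)
print(joined == interval(0, 3))
```

Because the negative multiplicities cancel overlaps, no case split on the relative order of the endpoints is needed.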

SE Feb 9, 2018 Code
State of the Practice for GIS Software

W. Spencer Smith, Adam Lazzarato, Jacques Carette

We present a reproducible method to analyze the state of software development practices in a given scientific domain and apply this method to Geographic Information Systems (GIS). The analysis is based on grading a set of 30 GIS products using a template of 56 questions based on 13 software qualities. The products range in scope and purpose from complete desktop GIS systems, to stand-alone tools, to programming libraries/packages. The final ranking of the products is determined using the Analytic Hierarchy Process (AHP), a multicriteria decision-making method that focuses on relative comparisons between products, rather than directly measuring qualities. The results reveal concerns regarding the correctness, maintainability, transparency and reproducibility of some GIS software. Three recommendations are presented as feedback to the GIS community: 1) Ensure each project has a requirements specification document; 2) Provide a wealth of support methods, such as an IRC (Internet Relay Chat) channel, a Stack Exchange tag for new questions, or opening the issue tracker for support requests, as well as the more traditional email-based methods; and 3) Design product websites for maximum transparency of the development process and, for open source projects, provide a developer's guide.
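
To make the "relative comparisons" behind AHP concrete, here is a minimal sketch that turns a pairwise comparison matrix into priority weights. It uses the row geometric mean approximation rather than the principal eigenvector that AHP is usually defined with (the two agree when the matrix is perfectly consistent). The function name `ahp_priorities` and the three-product matrix are illustrative assumptions, not data or tooling from the paper.

```python
import math


def ahp_priorities(pairwise):
    """Approximate AHP priority weights using the row geometric mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical comparison of three products on a single quality.
# pairwise[i][j] says how strongly product i is preferred over product j,
# and pairwise[j][i] is its reciprocal.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights = ahp_priorities(pairwise)

# Rank product indices from most to least preferred.
ranking = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
print(weights, ranking)
```

In a full assessment, one such matrix would be built per quality and the resulting weights combined across qualities; that aggregation step is omitted here.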

SE May 20, 2024
State of the Practice for Medical Imaging Software

W. Spencer Smith, Ao Dong, Jacques Carette et al.

We selected 29 medical imaging projects from 48 candidates, assessed 10 software qualities by answering 108 questions for each software project, and interviewed 8 of the 29 development teams. Based on the quantitative data, we ranked the MI software with the Analytic Hierarchy Process (AHP). The four top-ranked software products are 3D Slicer, ImageJ, Fiji, and OHIF Viewer. Generally, MI software is in a healthy state as shown by the following: we observed 88% of the documentation artifacts recommended by research software development guidelines, 100% of MI projects use version control tools, and developers appear to use the common quasi-agile research software development process. However, the current state of the practice deviates from the existing guidelines because of the rarity of some recommended artifacts, low usage of continuous integration (17% of the projects), low use of unit testing (about 50% of projects), and room for improvement with documentation (six of nine developers felt their documentation was not clear enough). From interviewing the developers, we identified five pain points and two qualities of potential concern: lack of development time, lack of funding, technology hurdles, ensuring correctness, usability, maintainability, and reproducibility. The interviewees proposed strategies to improve the state of the practice, to address the identified pain points, and to improve software quality. Combining their ideas with ours, we have the following list of recommendations: increase documentation, increase testing by enriching datasets, increase continuous integration usage, move to web applications, employ linters, use peer reviews, design for change, add assurance cases, and incorporate a "Generate All Things" approach.

SE Dec 15, 2021
Long-Term Productivity Based on Science, not Preference

Spencer Smith, Jacques Carette

This position paper argues that decisions on processes, tools, techniques and software artifacts (such as user manuals, unit tests, design documents and code) for scientific software development should be driven by science, not by personal preference. Decisions should not be based on anecdotal evidence, gut instinct or the path of least resistance. Moreover, decisions should vary depending on the users and the context. In most cases of interest, this means that a longer term view should be adopted. We need to use a scientific approach based on unambiguous definitions, empirical evidence, hypothesis testing and rigorous processes. By developing an understanding of where input hours are spent, what most contributes to user satisfaction, and how to leverage knowledge produced, we can determine what interventions have the greatest value relative to the invested effort. We will be able to recommend software production processes that justify their value because the long-term output benefits are high compared to the required input resources. A preliminary definition of productivity is presented, along with ideas on how to potentially measure this quality. We briefly explore the idea of improving productivity via an approach where all artifacts are generated from codified knowledge.

SE Oct 22, 2021
Methodology for Assessing the State of the Practice for Domain X

Spencer Smith, Jacques Carette, Peter Michalski et al.

To improve software development methods and tools for research software, we first need to understand the current state of the practice. Therefore, we have developed a methodology for assessing the state of the software development practices for a given research software domain. For each domain we wish to answer questions such as: i) What artifacts (documents, code, test cases, etc.) are present? ii) What tools are used? iii) What principles, processes and methodologies are used? iv) What are the pain points for developers? v) What actions are used to improve qualities like maintainability and reproducibility? To answer these questions, our methodology prescribes the following steps: i) Identify the domain; ii) Identify a list of candidate software packages; iii) Filter the list to a length of about 30 packages; iv) Gather source code and documentation for each package; v) Collect repository-related data on each software package, like number of stars, number of open issues, number of lines of code; vi) Fill in the measurement template (the template consists of 108 questions to assess 9 qualities, including installability, usability and visibility); vii) Interview developers (the interview consists of 20 questions and takes about an hour); viii) Rank the software using the Analytic Hierarchy Process (AHP); and, ix) Analyze the data to answer the questions posed above. A domain expert should be engaged throughout the process, to ensure that implicit information about the domain is properly represented and to assist with conducting an analysis of the commonalities and variabilities between the 30 selected packages. Using our methodology, spreadsheet templates and AHP tool, we estimate (based on our experience with using the process) the time to complete an assessment for a given domain at 173 person hours.

SE Sep 29, 2020
Long-term Productivity for Long-term Impact

Spencer Smith, Jacques Carette

We present a new conceptual definition of 'productivity' for sustainably developing research software. Existing definitions are flawed as they are biased toward the short term, thus devaluing long-term impact, which we consider to be the principal goal. Taking a long-term view of productivity helps fix that problem. We view the outputs of the development process as knowledge and user satisfaction. User satisfaction is used as a proxy for effective quality. The explicit emphasis on all knowledge produced, rather than just the operationalizable knowledge (code), implies that human-reusable knowledge, i.e. documentation, should also be greatly valued when producing research software.

MS Apr 23, 2019
Big Math and the One-Brain Barrier: A Position Paper and Architecture Proposal

Jacques Carette, William M. Farmer, Michael Kohlhase et al.

Over the last decades, a class of important mathematical results has required an ever increasing amount of human effort to carry out. For some, the help of computers is now indispensable. We analyze the implications of this trend towards "big mathematics", its relation to human cognition, and how machine support for big math can be organized. The central contribution of this position paper is an information model for "doing mathematics", which posits that humans very efficiently integrate four aspects: inference, computation, tabulation, and narration around a well-organized core of mathematical knowledge. The challenge for mathematical software systems is that these four aspects need to be integrated as well. We briefly survey the state of the art.

SE Feb 20, 2018
Statistical Software for Psychology: Comparing Development Practices Between CRAN and Other Communities

Spencer Smith, Yue Sun, Jacques Carette

Different communities rely heavily on software, but use quite different software development practices. Objective: We wanted to measure the state of the practice in the area of statistical software for psychology to understand how it compares to best practices. Method: We compared and ranked 30 software tools with respect to adherence to best software engineering practices on items that could be measured by end-users. Results: We found that R packages use quite good practices, that while commercial packages were quite usable, many aspects of their development are too opaque to be measured, and that research projects vary a lot in their practices. Conclusion: We recommend that more organizations adopt practices similar to those used by CRAN to facilitate success, even for small teams. We also recommend close coupling of source code and documentation, to improve verifiability.