SE, Apr 26
Characterizing the Usefulness of Code Review Comments in Scientific Software for Software Quality and Scientific Rigor
Sharif Ahmed, Nasir U. Eisty
Context: Innovation thrives on scientific software, with useful code review feedback enhancing its correctness and impact. However, unlike in general-purpose commercial and open-source software, the usefulness of code review feedback (CR comments) in scientific software remains largely unstudied. Objective: This paper aims to characterize the usefulness of CR comments in scientific open-source software (Sci-OSS), leveraging existing research on useful CR comments. Method: To achieve this objective, we mine successful Sci-OSS projects from GitHub, analyze their CR comments using usefulness-related features, and compare our findings with prior research on general-purpose commercial and open-source CR comments. Results: The investigation of the usefulness of CR comments in Sci-OSS confirms many characteristics that prior research identified in general-purpose software. For example, subjective or negative CR comments remain not useful in Sci-OSS. We also find that negative emoji reactions to CR comments correlate only very weakly with not-useful comments, whereas positive emojis show mixed correlations. Importantly, 6-33% of the CR comments in our mined Sci-OSS repositories are not useful. Conclusions: Our investigation into Sci-OSS extends findings from research on CR comment usefulness in general-purpose software, benefiting developers, scientists, and researchers in the Sci-OSS community.
LG, Sep 11, 2024
Three-Dimensional, Multimodal Synchrotron Data for Machine Learning Applications
Calum Green, Sharif Ahmed, Shashidhara Marathe et al.
Machine learning techniques are being increasingly applied in medical and physical sciences across a variety of imaging modalities; however, an important issue when developing these tools is the availability of good-quality training data. Here we present a unique, multimodal synchrotron dataset of a bespoke zinc-doped Zeolite 13X sample that can be used to develop advanced deep learning and data fusion pipelines. Multi-resolution micro X-ray computed tomography was performed on a zinc-doped Zeolite 13X fragment to characterise its pores and features, before spatially resolved X-ray diffraction computed tomography was carried out to characterise the homogeneous distribution of sodium and zinc phases. Zinc absorption was controlled to create a simple, spatially isolated, two-phase material. Both raw and processed data are available as a series of Zenodo entries. Altogether we present a spatially resolved, three-dimensional, multimodal, multi-resolution dataset that can be used for the development of machine learning techniques, including super-resolution, multimodal data fusion, and 3D reconstruction algorithms.