SEApr 13
What's DAT? Three Case Studies of Measuring Software Development Productivity at Meta With Diff Authoring TimeMoritz Beller, Amanda Park, Karim Nakad et al.
This paper introduces Diff Authoring Time (DAT), a powerful, yet conceptually simple approach to measuring software development productivity that enables rigorous experimentation. DAT is a time based metric, which assess how long engineers take to develop changes, using a privacy-aware telemetry system integrated with version control, the IDE, and the OS. We validate DAT through observational studies, surveys, visualizations, and descriptive statistics. At Meta, DAT has powered experiments and case studies on more than 20 projects. Here, we highlight (1) an experiment on introducing mock types (a 14% DAT improvement), (2) the development of automatic memoization in the React compiler (33% improvement), and (3) an estimate of thousands of DAT hours saved annually through code sharing (> 50% improvement). DAT offers a precise, yet high-coverage measure for development productivity, aiding business decisions. It enhances development efficiency by aligning the internal development workflow with the experiment-driven culture of external product development. On the research front, DAT has enabled us to perform rigorous experimentation on long-standing software engineering questions such as "do types make development more efficient?"
AIDec 6, 2018
Project Rosetta: A Childhood Social, Emotional, and Behavioral Developmental OntologyAlyson Maslowski, Halim Abbas, Kelley Abrams et al.
There is a wide array of existing instruments used to assess childhood behavior and development for the evaluation of social, emotional and behavioral disorders. Many of these instruments either focus on one diagnostic category or encompass a broad set of childhood behaviors. We built an extensive ontology of the questions associated with key features that have diagnostic relevance for child behavioral conditions, such as Autism Spectrum Disorder (ASD), attention-deficit/hyperactivity disorder (ADHD), and anxiety, by incorporating a subset of existing child behavioral instruments and categorizing each question into clinical domains. Each existing question and set of question responses were then mapped to a new unique Rosetta question and set of answer codes encompassing the semantic meaning and identified concept(s) of as many existing questions as possible. This resulted in 1274 existing instrument questions mapping to 209 Rosetta questions creating a minimal set of questions that are comprehensive of each topic and subtopic. This resulting ontology can be used to create more concise instruments across various ages and conditions, as well as create more robust overlapping datasets for both clinical and research use.
CYMar 15, 2017
Machine learning approach for early detection of autism by combining questionnaire and home video screeningHalim Abbas, Ford Garberson, Eric Glover et al.
Existing screening tools for early detection of autism are expensive, cumbersome, time-intensive, and sometimes fall short in predictive value. In this work, we apply Machine Learning (ML) to gold standard clinical data obtained across thousands of children at risk for autism spectrum disorders to create a low-cost, quick, and easy to apply autism screening tool that performs as well or better than most widely used standardized instruments. This new tool combines two screening methods into a single assessment, one based on short, structured parent-report questionnaires and the other on tagging key behaviors from short, semi-structured home videos of children. To overcome the scarcity, sparsity, and imbalance of training data, we apply creative feature selection, feature engineering, and novel feature encoding techniques. We allow for inconclusive determination where appropriate in order to boost screening accuracy when conclusive. We demonstrate a significant accuracy improvement over standard screening tools in a clinical study sample of 162 children.