David Karger

HC
h-index2
10papers
532citations
Novelty44%
AI Score40

10 Papers

IRAug 9, 2023
Conceptualizing Machine Learning for Dynamic Information Retrieval of Electronic Health Record Notes

Sharon Jiang, Shannon Shen, Monica Agrawal et al. · mit

The large amount of time clinicians spend sifting through patient notes and documenting in electronic health records (EHRs) is a leading cause of clinician burnout. By proactively and dynamically retrieving relevant notes during the documentation process, we can reduce the effort required to find relevant patient history. In this work, we conceptualize the use of EHR audit logs for machine learning as a source of supervision of note relevance in a specific clinical context, at a particular point in time. Our evaluation focuses on the dynamic retrieval in the emergency department, a high acuity setting with unique patterns of information retrieval and note writing. We show that our methods can achieve an AUC of 0.963 for predicting which notes will be read in an individual note writing session. We additionally conduct a user study with several clinicians and find that our framework can help clinicians retrieve relevant information more efficiently. Demonstrating that our framework and methods can perform well in this demanding setting is a promising proof of concept that they will translate to other clinical settings and data modalities (e.g., labs, medications, imaging).

SEMar 12
The Perfection Paradox: From Architect to Curator in AI-Assisted API Design

Mak Ahmad, Andrew Macvean, JJ Geewax et al.

Enterprise API design is often bottlenecked by the tension between rapid feature delivery and the rigorous maintenance of usability standards. We present an industrial case study evaluating an AI-assisted design workflow trained on API Improvement Proposals (AIPs). Through a controlled study with 16 industry experts, we compared AI-generated API specifications against human-authored ones. While quantitative results indicated AI superiority in 10 of 11 usability dimensions and an 87% reduction in authoring time, qualitative analysis revealed a paradox: experts frequently misidentified AI work as human (19% accuracy) yet described the designs as unsettlingly "perfect." We characterize this as a "Perfection Paradox" -- where hyper-consistency signals a lack of pragmatic human judgment. We discuss the implications of this perfection paradox, proposing a shift in the human designer's role from the "drafter" of specifications to the "curator" of AI-generated patterns.

HCMay 19, 2025
How Adding Metacognitive Requirements in Support of AI Feedback in Practice Exams Transforms Student Learning Behaviors

Mak Ahmad, Prerna Ravi, David Karger et al.

Providing personalized, detailed feedback at scale in large undergraduate STEM courses remains a persistent challenge. We present an empirically evaluated practice exam system that integrates AI generated feedback with targeted textbook references, deployed in a large introductory biology course. Our system encourages metacognitive behavior by asking students to explain their answers and declare their confidence. It uses OpenAI's GPT-4o to generate personalized feedback based on this information, while directing them to relevant textbook sections. Through interaction logs from consenting participants across three midterms (541, 342, and 413 students respectively), totaling 28,313 question-student interactions across 146 learning objectives, along with 279 surveys and 23 interviews, we examined the system's impact on learning outcomes and engagement. Across all midterms, feedback types showed no statistically significant performance differences, though some trends suggested potential benefits. The most substantial impact came from the required confidence ratings and explanations, which students reported transferring to their actual exam strategies. About 40 percent of students engaged with textbook references when prompted by feedback -- far higher than traditional reading rates. Survey data revealed high satisfaction (mean rating 4.1 of 5), with 82.1 percent reporting increased confidence on practiced midterm topics, and 73.4 percent indicating they could recall and apply specific concepts. Our findings suggest that embedding structured reflection requirements may be more impactful than sophisticated feedback mechanisms.

CYJan 27, 2021
Designing for Engaging with News using Moral Framing towards Bridging Ideological Divides

Jessica Wang, Amy Zhang, David Karger

Society is showing signs of strong ideological polarization. When pushed to seek perspectives different from their own, people often reject diverse ideas or find them unfathomable. Work has shown that framing controversial issues using the values of the audience can improve understanding of opposing views. In this paper, we present our work designing systems for addressing ideological division through educating U.S. news consumers to engage using a framework of fundamental human values known as Moral Foundations. We design and implement a series of new features that encourage users to challenge their understanding of opposing views, including annotation of moral frames in news articles, discussion of those frames via inline comments, and recommendations based on relevant moral frames. We describe two versions of features -- the first covering a suite of ways to interact with moral framing in news, and the second tailored towards collaborative annotation and discussion. We conduct a field evaluation of each design iteration with 71 participants in total over a period of 6-8 days, finding evidence suggesting users learned to re-frame their discourse in moral values of the opposing side. Our work provides several design considerations for building systems to engage with moral framing.

HCSep 16, 2020
A System for Interleaving Discussion and Summarization in Online Collaboration

Sunny Tian, Amy X. Zhang, David Karger

In many instances of online collaboration, ideation and deliberation about what to write happen separately from the synthesis of the deliberation into a cohesive document. However, this may result in a final document that has little connection to the discussion that came before. In this work, we present interleaved discussion and summarization, a process where discussion and summarization are woven together in a single space, and collaborators can switch back and forth between discussing ideas and summarizing discussion until it results in a final document that incorporates and references all discussion points. We implement this process into a tool called Wikum+ that allows groups working together on a project to create living summaries-artifacts that can grow as new collaborators, ideas, and feedback arise and shrink as collaborators come to consensus. We conducted studies where groups of six people each collaboratively wrote a proposal using Wikum+ and a proposal using a messaging platform along with Google Docs. We found that Wikum+'s integration of discussion and summarization helped users be more organized, allowing for light-weight coordination and iterative improvements throughout the collaboration process. A second study demonstrated that in larger groups, Wikum+ is more inclusive of all participants and more comprehensive in the final document compared to traditional tools.

LGJul 29, 2020
Fast, Structured Clinical Documentation via Contextual Autocomplete

Divya Gopinath, Monica Agrawal, Luke Murray et al.

We present a system that uses a learned autocompletion mechanism to facilitate rapid creation of semi-structured clinical documentation. We dynamically suggest relevant clinical concepts as a doctor drafts a note by leveraging features from both unstructured and structured medical data. By constraining our architecture to shallow neural networks, we are able to make these suggestions in real time. Furthermore, as our algorithm is used to write a note, we can automatically annotate the documentation with clean labels of clinical concepts drawn from medical vocabularies, making notes more structured and readable for physicians, patients, and future algorithms. To our knowledge, this system is the only machine learning-based documentation utility for clinical notes deployed in a live hospital setting, and it reduces keystroke burden of clinical concepts by 67% in real environments.

LGMar 21, 2020
ARDA: Automatic Relational Data Augmentation for Machine Learning

Nadiia Chepurko, Ryan Marcus, Emanuel Zgraggen et al.

Automatic machine learning (\AML) is a family of techniques to automate the process of training predictive models, aiming to both improve performance and make machine learning more accessible. While many recent works have focused on aspects of the machine learning pipeline like model selection, hyperparameter tuning, and feature selection, relatively few works have focused on automatic data augmentation. Automatic data augmentation involves finding new features relevant to the user's predictive task with minimal ``human-in-the-loop'' involvement. We present \system, an end-to-end system that takes as input a dataset and a data repository, and outputs an augmented data set such that training a predictive model on this augmented dataset results in improved performance. Our system has two distinct components: (1) a framework to search and join data with the input data, based on various attributes of the input, and (2) an efficient feature selection algorithm that prunes out noisy or irrelevant features from the resulting join. We perform an extensive empirical evaluation of different system components and benchmark our feature selection algorithm on real-world datasets.

HCJan 8, 2020
Dark Patterns after the GDPR: Scraping Consent Pop-ups and Demonstrating their Influence

Midas Nouwens, Ilaria Liccardi, Michael Veale et al.

New consent management platforms (CMPs) have been introduced to the web to conform with the EU's General Data Protection Regulation, particularly its requirements for consent when companies collect and process users' personal data. This work analyses how the most prevalent CMP designs affect people's consent choices. We scraped the designs of the five most popular CMPs on the top 10,000 websites in the UK (n=680). We found that dark patterns and implied consent are ubiquitous; only 11.8% meet the minimal requirements that we set based on European law. Second, we conducted a field experiment with 40 participants to investigate how the eight most common designs affect consent choices. We found that notification style (banner or barrier) has no effect; removing the opt-out button from the first page increases consent by 22--23 percentage points; and providing more granular controls on the first page decreases consent by 8--20 percentage points. This study provides an empirical basis for the necessary regulatory action to enforce the GDPR, in particular the possibility of focusing on the centralised, third-party CMP services as an effective way to increase compliance.

DMMay 24, 2017
Matroids Hitting Sets and Unsupervised Dependency Grammar Induction

Nicholas Harvey, Vahab Mirrokni, David Karger et al.

This paper formulates a novel problem on graphs: find the minimal subset of edges in a fully connected graph, such that the resulting graph contains all spanning trees for a set of specifed sub-graphs. This formulation is motivated by an un-supervised grammar induction problem from computational linguistics. We present a reduction to some known problems and algorithms from graph theory, provide computational complexity results, and describe an approximation algorithm.

HCSep 23, 2014
Attendee-Sourcing: Exploring The Design Space of Community-Informed Conference Scheduling

Anant Bhardwaj, Juho Kim, Steven Dow et al.

Constructing a good conference schedule for a large multi-track conference needs to take into account the preferences and constraints of organizers, authors, and attendees. Creating a schedule which has fewer conflicts for authors and attendees, and thematically coherent sessions is a challenging task. Cobi introduced an alternative approach to conference scheduling by engaging the community to play an active role in the planning process. The current Cobi pipeline consists of committee-sourcing and author-sourcing to plan a conference schedule. We further explore the design space of community-sourcing by introducing attendee-sourcing -- a process that collects input from conference attendees and encodes them as preferences and constraints for creating sessions and schedule. For CHI 2014, a large multi-track conference in human-computer interaction with more than 3,000 attendees and 1,000 authors, we collected attendees' preferences by making available all the accepted papers at the conference on a paper recommendation tool we built called Confer, for a period of 45 days before announcing the conference program (sessions and schedule). We compare the preferences marked on Confer with the preferences collected from Cobi's author-sourcing approach. We show that attendee-sourcing can provide insights beyond what can be discovered by author-sourcing. For CHI 2014, the results show value in the method and attendees' participation. It produces data that provides more alternatives in scheduling and complements data collected from other methods for creating coherent sessions and reducing conflicts.