NCOct 18, 2023
Getting aligned on representational alignmentIlia Sucholutsky, Lukas Muttenthaler, Adrian Weller et al. · berkeley, cambridge
Biological and artificial information processing systems form representations of the world that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the similarity between the representations formed by these diverse systems? Do similarities in representations then translate into similar behavior? If so, then how can a system's representations be modified to better match those of another system? These questions pertaining to the study of representational alignment are at the heart of some of the most promising research areas in contemporary cognitive science, neuroscience, and machine learning. In this Perspective, we survey the exciting recent developments in representational alignment research in the fields of cognitive science, neuroscience, and machine learning. Despite their overlapping interests, there is limited knowledge transfer between these fields, so work in one field ends up duplicated in another, and useful innovations are not shared effectively. To improve communication, we propose a unifying framework that can serve as a common language for research on representational alignment, and map several streams of existing work across fields within our framework. We also lay out open problems in representational alignment where progress can benefit all three of these fields. We hope that this paper will catalyze cross-disciplinary collaboration and accelerate progress for all communities studying and developing information processing systems.
AINov 1, 2023
Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communitiesSenjuti Dutta, Sid Mittal, Sherol Chen et al.
The prevalence and impact of toxic discussions online have made content moderation crucial.Automated systems can play a vital role in identifying toxicity, and reducing the reliance on human moderation.Nevertheless, identifying toxic comments for diverse communities continues to present challenges that are addressed in this paper.The two-part goal of this study is to(1)identify intuitive variances from annotator disagreement using quantitative analysis and (2)model the subjectivity of these viewpoints.To achieve our goal, we published a new dataset\footnote{\url{https://github.com/XXX}} with expert annotators' annotations and used two other public datasets to identify the subjectivity of toxicity.Then leveraging the Large Language Model(LLM),we evaluate the model's ability to mimic diverse viewpoints on toxicity by varying size of the training data and utilizing same set of annotators as the test set used during model training and a separate set of annotators as the test set.We conclude that subjectivity is evident across all annotator groups, demonstrating the shortcomings of majority-rule voting. Moving forward, subjective annotations should serve as ground truth labels for training models for domains like toxicity in diverse communities.
AIJul 13, 2023
Leveraging Contextual Counterfactuals Toward Belief CalibrationQiuyi, Zhang, Michael S. Lee et al.
Beliefs and values are increasingly being incorporated into our AI systems through alignment processes, such as carefully curating data collection principles or regularizing the loss function used for training. However, the meta-alignment problem is that these human beliefs are diverse and not aligned across populations; furthermore, the implicit strength of each belief may not be well calibrated even among humans, especially when trying to generalize across contexts. Specifically, in high regret situations, we observe that contextual counterfactuals and recourse costs are particularly important in updating a decision maker's beliefs and the strengths to which such beliefs are held. Therefore, we argue that including counterfactuals is key to an accurate calibration of beliefs during alignment. To do this, we first segment belief diversity into two categories: subjectivity (across individuals within a population) and epistemic uncertainty (within an individual across different contexts). By leveraging our notion of epistemic uncertainty, we introduce `the belief calibration cycle' framework to more holistically calibrate this diversity of beliefs with context-driven counterfactual reasoning by using a multi-objective optimization. We empirically apply our framework for finding a Pareto frontier of clustered optimal belief strengths that generalize across different contexts, demonstrating its efficacy on a toy dataset for credit decisions.
HCMar 21, 2024
A Design Space for Intelligent and Interactive Writing AssistantsMina Lee, Katy Ilonka Gero, John Joon Young Chung et al. · allen-ai, deepmind
In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.
HCFeb 27, 2024
Understanding Subjectivity through the Lens of Motivational Context in Model-Generated Image SatisfactionSenjuti Dutta, Sherol Chen, Sunny Mak et al.
Image generation models are poised to become ubiquitous in a range of applications. These models are often fine-tuned and evaluated using human quality judgments that assume a universal standard, failing to consider the subjectivity of such tasks. To investigate how to quantify subjectivity, and the scale of its impact, we measure how assessments differ among human annotators across different use cases. Simulating the effects of ordinarily latent elements of annotators subjectivity, we contrive a set of motivations (t-shirt graphics, presentation visuals, and phone background images) to contextualize a set of crowdsourcing tasks. Our results show that human evaluations of images vary within individual contexts and across combinations of contexts. Three key factors affecting this subjectivity are image appearance, image alignment with text, and representation of objects mentioned in the text. Our study highlights the importance of taking individual users and contexts into account, both when building and evaluating generative models
LGJun 24, 2024
Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image GenerationKatherine M. Collins, Najoung Kim, Yonatan Bitton et al.
Human feedback plays a critical role in learning and refining reward models for text-to-image generation, but the optimal form the feedback should take for learning an accurate reward function has not been conclusively established. This paper investigates the effectiveness of fine-grained feedback which captures nuanced distinctions in image quality and prompt-alignment, compared to traditional coarse-grained feedback (for example, thumbs up/down or ranking between a set of options). While fine-grained feedback holds promise, particularly for systems catering to diverse societal preferences, we show that demonstrating its superiority to coarse-grained feedback is not automatic. Through experiments on real and synthetic preference data, we surface the complexities of building effective models due to the interplay of model choice, feedback type, and the alignment between human judgment and computational interpretation. We identify key challenges in eliciting and utilizing fine-grained feedback, prompting a reassessment of its assumed benefits and practicality. Our findings -- e.g., that fine-grained feedback can lead to worse models for a fixed budget, in some settings; however, in controlled settings with known attributes, fine grained rewards can indeed be more helpful -- call for careful consideration of feedback attributes and potentially beckon novel modeling approaches to appropriately unlock the potential value of fine-grained feedback in-the-wild.