Chelsea Boccagno

h-index: 21
2 papers

2 Papers

74.1 · HC · Apr 7
Breaking Negative Cycles: A Reflection-To-Action System For Adaptive Change

Minsol Michelle Kim, Daniel M. Low, David Lafond et al.

Breaking negative mental health cycles, including rumination and recurring regrets, requires reflection that translates awareness into behavioral change. Grounded in the Transtheoretical Model (TTM) and Gross's Emotion Regulation (ER) Process Model, we examine how Technologies Supporting Self-Reflection (TSR) bridge reflection and action. In a 15-day in-the-wild study (N = 20), participants used a voice-based journaling system to capture regrets and wishes and engaged in WhatIf-Planning, a novel structured reflection module integrating counterfactual thinking with if-then planning. Participants were randomized to either a free-form condition or a Gross-guided condition, which maps the five processes of Gross's ER model into explicit journaling prompts. We contribute: (1) a unified reflection-to-action TSR system that operationalizes the Preparation stage of TTM to bridge Contemplation and Action, and (2) triangulated empirical evidence from an in-the-wild journaling study that first operationalizes Gross's Process Model, revealing effects on coping flexibility and emotion regulation in daily life. Results show significant pre-post improvements in coping flexibility, indicating adaptive self-regulation across conditions, with the Gross-guided group generating more counterfactual alternatives, articulating concrete if-then action plans, and implementing more plans for self-driven change.

CV · Oct 31, 2024
Using Multimodal Deep Neural Networks to Disentangle Language from Visual Aesthetics

Colin Conwell, Christopher Hamblin, Chelsea Boccagno et al. · MIT

When we experience a visual stimulus as beautiful, how much of that experience derives from perceptual computations we cannot describe versus conceptual knowledge we can readily translate into natural language? Disentangling perception from language in visually-evoked affective and aesthetic experiences through behavioral paradigms or neuroimaging is often empirically intractable. Here, we circumnavigate this challenge by using linear decoding over the learned representations of unimodal vision, unimodal language, and multimodal (language-aligned) deep neural network (DNN) models to predict human beauty ratings of naturalistic images. We show that unimodal vision models (e.g. SimCLR) account for the vast majority of explainable variance in these ratings. Language-aligned vision models (e.g. SLIP) yield small gains relative to unimodal vision. Unimodal language models (e.g. GPT2) conditioned on visual embeddings to generate captions (via CLIPCap) yield no further gains. Caption embeddings alone yield less accurate predictions than image and caption embeddings combined (concatenated). Taken together, these results suggest that whatever words we may eventually find to describe our experience of beauty, the ineffable computations of feedforward perception may provide sufficient foundation for that experience.