Sanja Šćepanović

HC
h-index: 29
14 papers
116 citations
Novelty: 33%
AI Score: 50

14 Papers

HC · Jul 24, 2024
Co-designing an AI Impact Assessment Report Template with AI Practitioners and AI Compliance Experts

Edyta Bogucka, Marios Constantinides, Sanja Šćepanović et al.

In the evolving landscape of AI regulation, it is crucial for companies to conduct impact assessments and document their compliance through comprehensive reports. However, current reports lack grounding in regulations and often focus on specific aspects like privacy in relation to AI systems, without addressing the real-world uses of these systems. Moreover, there is no systematic effort to design and evaluate these reports with both AI practitioners and AI compliance experts. To address this gap, we conducted an iterative co-design process with 14 AI practitioners and 6 AI compliance experts and proposed a template for impact assessment reports grounded in the EU AI Act, NIST's AI Risk Management Framework, and ISO 42001 AI Management System. We evaluated the template by producing an impact assessment report for an AI-based meeting companion at a major tech company. A user study with 8 AI practitioners from the same company and 5 AI compliance experts from industry and academia revealed that our template effectively provides necessary information for impact assessments and documents the broad impacts of AI systems. Participants envisioned using the template not only at the pre-deployment stage for compliance but also as a tool to guide the design stage of AI uses.

CY · Jul 9, 2023
Dream Content Discovery from Reddit with an Unsupervised Mixed-Method Approach

Anubhab Das, Sanja Šćepanović, Luca Maria Aiello et al.

Dreaming is a fundamental but not fully understood part of human experience that can shed light on our thought patterns. Traditional dream analysis practices, while popular and aided by over 130 unique scales and rating systems, have limitations. Mostly based on retrospective surveys or lab studies, they struggle to be applied on a large scale or to show the importance and connections between different dream themes. To overcome these issues, we developed a new, data-driven mixed-method approach for identifying topics in free-form dream reports through natural language processing. We tested this method on 44,213 dream reports from Reddit's r/Dreams subreddit, where we found 217 topics, grouped into 22 larger themes: the most extensive collection of dream topics to date. We validated our topics by comparing them to the widely used Hall and van de Castle scale. Going beyond traditional scales, our method can find unique patterns in different dream types (like nightmares or recurring dreams), understand topic importance and connections, and observe changes in collective dream experiences over time and around major events, like the COVID-19 pandemic and the recent Russo-Ukrainian war. We envision that the applications of our method will provide valuable insights into the intricate nature of dreaming.

57.3 · HC · May 20
The Quiet Path from Seemingly Minor Design Errors to Workplace AI Incidents

Julia De Miguel Velázquez, Sanja Šćepanović, Andrés Gvirtz et al.

Recent human-computer interaction (HCI) research has revealed a widespread misalignment between how developers design workplace artificial intelligence (AI) systems and what workers actually need from them. Yet, little research has examined the effects of this gap, or how it may cause harm. We analyzed 1,524 reports of incidents in which AI systems were used to perform 171 occupational tasks across 12 industry sectors. Using a Large Language Model (LLM)-as-an-expert approach, we extracted the main traits of the AI systems involved in those incidents using an established framework of twelve traits. We then compared them with the traits that 202 workers highly familiar with those tasks would have preferred. We found that as many as 83% of workplace incidents stem from worker-AI misalignments. In most cases, workers wanted systems that are precise, insightful, or personal, but instead received systems that are basic, simple, or general. Over the years, fast AI caused a considerable number of incidents, yet these declined, and imaginative AI, with the mass introduction of generative AI, started to cause incidents. We also compared the traits causing the incidents with the traits that 197 developers building AI systems for those tasks would have preferred. If the traits causing the incidents were the same as those designed by developers, then developers may be responsible for those incidents. We found that 74% of task misalignments could be attributed to developers who tended to overfocus on efficiency and speed, especially for systems performing tasks in people-facing occupations such as those in the human resources sector. Our results call for design interventions that better align AI development with workers' needs, as without such corrections, workplace AI incidents are likely to persist, causing the invisible erosion of worker agency and organizational productivity.

56.6 · HC · Apr 30
When and How AI Should Assist Brainstorming for AI Impact Assessment

Jarod Govers, Sanja Šćepanović, Daniele Quercia

A key task in AI practice is to assess potential impacts to prevent harm. Current AI tools assisting AI impact assessment have not been designed or evaluated for collaborative team brainstorming, and they do not capture the range of views in diverse teams. We studied how AI can support team brainstorming during AI impact assessment and made three contributions. First, we adapted two structured methods from strategic foresight and co-designed AI interventions for them in five in-person workshops with 28 participants in total. Second, we evaluated the interventions in ten in-person workshops with 54 participants, finding that AI improved impact assessment quality and brainstorming perceptions for a general-purpose AI use (a chatbot companion) but not for a specialised one (a kidney allocation application). Third, our findings result in broader design guidance for AI assistance in brainstorming: AI should offer only hints, not solutions, during early ideation, initiating interaction only when participants face fixation or saturation; it should facilitate structuring ideas during convergence; it should leverage expertise to refine ideas; and overall, it should support the tedious process tasks of brainstorming rather than the ideation that teams prefer to do themselves.

61.2 · CY · Apr 27
Why AI Harms Can't Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality

Edyta Bogucka, Sanja Šćepanović, Daniele Quercia

AI risk assessment is the primary tool for identifying harms caused by AI systems. These include intersectional harms, which arise from the interaction between identity categories (e.g., class and skin tone) and which do not occur, or occur differently, when those categories are considered separately. Yet existing AI risk assessments are still built around isolated identity categories, and when intersections are considered, they focus almost exclusively on race and gender. Drawing on a large-scale analysis of documented AI incidents, we show that AI harms do not occur one identity category at a time. Using a structured rubric applied with a Large Language Model (LLM), we analyze 5,300 reports from 1,200 documented incidents in the AI Incident Database, the most curated source of incident data. From these reports, we identify 1,513 harmed subjects and their associated identity categories, achieving 98% accuracy. At the level of individual categories, we find that age and political identity appear in documented AI harms at rates comparable to race and gender. At the level of intersecting categories, harm is amplified up to three times at specific intersections: adolescent girls, lower-class people of color, and upper-class political elites. We argue that intersectionality should be a core component of AI risk assessment to more accurately capture how harms are produced and distributed across social groups.

LG · Jan 3, 2025
How Your Location Relates to Health: Variable Importance and Interpretable Machine Learning for Environmental and Sociodemographic Data

Ishaan Maitra, Raymond Lin, Eric Chen et al.

Health outcomes depend on complex environmental and sociodemographic factors whose effects change over location and time. Only recently has fine-grained spatial and temporal data become available to study these effects, namely the MEDSAT dataset of English health, environmental, and sociodemographic information. Leveraging this new resource, we use a variety of variable importance techniques to robustly identify the most informative predictors across multiple health outcomes. We then develop an interpretable machine learning framework based on Generalized Additive Models (GAMs) and Multiscale Geographically Weighted Regression (MGWR) to analyze both local and global spatial dependencies of each variable on various health outcomes. Our findings identify NO2 as a global predictor for asthma, hypertension, and anxiety, alongside other outcome-specific predictors related to occupation, marriage, and vegetation. Regional analyses reveal local variations with air pollution and solar radiation, with notable shifts during COVID. This comprehensive approach provides actionable insights for addressing health disparities, and advocates for the integration of interpretable machine learning in public health.

CY · Aug 21, 2025
The AI Model Risk Catalog: What Developers and Researchers Miss About Real-World AI Harms

Pooja S. B. Rao, Sanja Šćepanović, Dinesh Babu Jayagopi et al.

We analyzed nearly 460,000 AI model cards from Hugging Face to examine how developers report risks. From these, we extracted around 3,000 unique risk mentions and built the AI Model Risk Catalog. We compared these with risks identified by researchers in the MIT Risk Repository and with real-world incidents from the AI Incident Database. Developers focused on technical issues like bias and safety, while researchers emphasized broader social impacts. Both groups paid little attention to fraud and manipulation, which are common harms arising from how people interact with AI. Our findings show the need for clearer, structured risk reporting that helps developers think about human-interaction and systemic risks early in the design process. The catalog and paper appendix are available at: https://social-dynamics.net/ai-risks/catalog.

CY · Aug 18, 2025
Vitamin N: Benefits of Different Forms of Public Greenery for Urban Health

Sanja Šćepanović, Sagar Joglekar, Stephen Law et al.

Urban greenery is often linked to better health, yet findings from past research have been inconsistent. One reason is that official greenery metrics measure the amount or nearness of greenery but ignore how often people may actually see or use it in daily life. To address this gap, we introduced a new classification that separates on-road greenery, which people see while walking through streets, from off-road greenery, which requires planned visits. We did so by combining aerial imagery of Greater London and greenery data from OpenStreetMap with quantified greenery from over 100,000 Google Street View images and accessibility estimates based on 160,000 road segments. We linked these measures to 7.45 billion medical prescriptions issued by the National Health Service and processed through our methodology. These prescriptions cover five conditions: diabetes, hypertension, asthma, depression, and anxiety, as well as opioid use. As hypothesized, we found that on-road greenery was more strongly linked to better health than four widely used official measures. For example, hypertension prescriptions dropped by 3.68% in wards with on-road greenery above the median citywide level compared to those below it. If all below-median wards reached the citywide median in on-road greenery, prescription costs could fall by up to £3.15 million each year. These results suggest that greenery seen in daily life may be more relevant than public yet secluded greenery, and that official metrics commonly used in the literature have important limitations.

SI · Feb 2, 2022
Epidemic Dreams: Dreaming about health during the COVID-19 pandemic

Sanja Šćepanović, Luca Maria Aiello, Deirdre Barrett et al.

The continuity hypothesis of dreams suggests that the content of dreams is continuous with the dreamer's waking experiences. Given the unprecedented nature of the experiences during COVID-19, we studied the continuity hypothesis in the context of the pandemic. We implemented a deep-learning algorithm that can extract mentions of medical conditions from text and applied it to two datasets collected during the pandemic: 2,888 dream reports (dreaming life experiences), and 57M tweets mentioning the pandemic (waking life experiences). The health expressions common to both sets were typical COVID-19 symptoms (e.g., cough, fever, and anxiety), suggesting that dreams reflected people's real-world experiences. The health expressions that distinguished the two sets reflected differences in thought processes: expressions in waking life reflected a linear and logical thought process and, as such, described realistic symptoms or related disorders (e.g., nasal pain, SARS, H1N1); those in dreaming life reflected a thought process closer to the visual and emotional spheres and, as such, described either conditions unrelated to the virus (e.g., maggots, deformities, snakebites), or conditions of surreal nature (e.g., teeth falling out, body crumbling into sand). Our results confirm that dream reports represent an understudied yet valuable source of people's health experiences in the real world.

HC · Sep 13, 2021
ComFeel: Productivity is a Matter of the Senses Too

Marios Constantinides, Sanja Šćepanović, Daniele Quercia et al.

Indoor environmental quality has been found to impact employees' productivity in the long run, yet its short-term, meeting-level impact is unclear. We studied the relationship between the sensorial pleasantness of a meeting room and the meeting's productivity. By administering a 28-item questionnaire to 363 online participants, we indeed found that three factors captured 62% of people's experience of meetings: (a) productivity; (b) psychological safety; and (c) room pleasantness. To measure room pleasantness, we developed and deployed ComFeel, an indoor environmental sensing infrastructure, which captures light, temperature, and gas resistance readings through miniaturized and unobtrusive devices we built and named 'Geckos'. Across 29 real-world meetings, using ComFeel, we collected 1,373 minutes of readings. For each of these meetings, we also collected whether each participant felt the meeting to have been productive, the setting to be psychologically safe, and the meeting room to be pleasant. As one expects, we found that, on average, the probability of a meeting being productive increased by 35% for each standard deviation increase in the psychological safety participants experienced. Importantly, that probability increased by as much as 25% for each increase in room pleasantness, confirming the significant short-term impact of the indoor environment on meetings' productivity.

CV · Jan 28, 2021
Jane Jacobs in the Sky: Predicting Urban Vitality with Open Satellite Data

Sanja Šćepanović, Sagar Joglekar, Stephen Law et al.

The presence of people in an urban area throughout the day -- often called 'urban vitality' -- is one of the qualities world-class cities aspire to the most, yet it is one of the hardest to achieve. Back in the 1970s, Jane Jacobs theorized urban vitality and found that there are four conditions required for the promotion of life in cities: diversity of land use, small block sizes, the mix of economic activities, and concentration of people. To build proxies for those four conditions and ultimately test Jane Jacobs's theory at scale, researchers have had to collect both private and public data from a variety of sources, and that took decades. Here we propose the use of one single source of data, which happens to be publicly available: Sentinel-2 satellite imagery. In particular, since the first two conditions (diversity of land use and small block sizes) are visible to the naked eye from satellite imagery, we tested whether we could automatically extract them with a state-of-the-art deep-learning framework and whether, in the end, the extracted features could predict vitality. In six Italian cities for which we had call data records, we found that our framework is able to explain on average 55% of the variance in urban vitality extracted from those records.

HC · Oct 13, 2020
Humane Visual AI: Telling the Stories Behind a Medical Condition

Wonyoung So, Edyta P. Bogucka, Sanja Šćepanović et al.

A biological understanding is key for managing medical conditions, yet psychological and social aspects matter too. The main problem is that these two aspects are hard to quantify and inherently difficult to communicate. To quantify psychological aspects, this work mined around half a million Reddit posts in the sub-communities specialised in 14 medical conditions, and it did so with a new deep-learning framework. In so doing, it was able to associate mentions of medical conditions with those of emotions. To then quantify social aspects, this work designed a probabilistic approach that mines open prescription data from the National Health Service in England to compute the prevalence of drug prescriptions, and to relate such a prevalence to census data. To finally visually communicate each medical condition's biological, psychological, and social aspects through storytelling, we designed a narrative-style layered Martini Glass visualization. In a user study involving 52 participants, after interacting with our visualization, a considerable number of them changed their mind on previously held opinions: 10% gave more importance to the psychological aspects of medical conditions, and 27% were more favourable to the use of social media data in healthcare, suggesting the importance of persuasive elements in interactive visualizations.

IV · Dec 11, 2019
Wide-Area Land Cover Mapping with Sentinel-1 Imagery using Deep Learning Semantic Segmentation Models

Sanja Šćepanović, Oleg Antropov, Pekka Laurila et al.

Land cover mapping is essential to monitoring the environment and understanding the effects of human activities on it. The automatic approaches to land cover mapping (i.e., image segmentation) have mostly used traditional machine learning that requires heuristic feature design. On natural images, deep learning has outperformed traditional machine learning approaches for image segmentation. On remote sensing images, recent studies demonstrate successful applications of specific deep learning models to small-scale land cover mapping tasks (e.g., to classify wetland complexes). However, it is not readily clear which of the existing models are the best candidates for which remote sensing task. In this study, we answer that question for mapping the fundamental land cover classes using satellite radar data. We took Sentinel-1 C-band SAR images available at no cost to users as representative data. The CORINE land cover map was used as a reference, and the models were trained to distinguish between the 5 major CORINE classes. We selected seven among the state-of-the-art semantic segmentation models so that they cover a diverse set of approaches: U-Net, DeepLabV3+, PSPNet, BiSeNet, SegNet, FC-DenseNet, and FRRN-B. The models were pre-trained on the ImageNet dataset and further fine-tuned in this study. All the models demonstrated solid performance with overall accuracy between 87.9% and 93.1%, and with good to very good agreement (kappa statistic between 0.75 and 0.86). The two best models were FC-DenseNet and SegNet, with the latter having a much smaller inference time. Overall, our results indicate that the semantic segmentation models are suitable for efficient wide-area mapping using satellite SAR imagery and also provide baseline accuracy against which the newly proposed models should be evaluated.
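
For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how overall accuracy and Cohen's kappa are commonly computed from a confusion matrix. It is not the authors' evaluation code: the function name `accuracy_and_kappa` and the toy matrix are assumptions made for illustration only.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j (hypothetical layout for this sketch).
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)

    # Undefined (division by zero) when expected agreement equals 1.
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Toy two-class example, not data from the paper.
toy = [[45, 5],
       [10, 40]]
acc, kappa = accuracy_and_kappa(toy)
```

Kappa discounts the agreement that the class marginals alone would produce, which is why it is reported alongside overall accuracy when class frequencies are unbalanced.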

SOC-PH · Jun 27, 2016
Semantic homophily in online communication: evidence from Twitter

Sanja Šćepanović, Igor Mishkovski, Bruno Gonçalves et al.

People are observed to assortatively connect on a set of traits. This phenomenon, termed assortative mixing or sometimes homophily, can be quantified through the assortativity coefficient in social networks. Uncovering the exact causes of strong assortative mixing found in social networks has been a research challenge. Among the main suggested causes from sociology are the tendency of similar individuals to connect (often itself referred to as homophily) and the social influence among already connected individuals. An important question for both research and practice, which we tackle here, is understanding the exact mechanisms of interplay between these tendencies and the underlying social network structure. Namely, in addition to the mentioned assortativity coefficient, there are several other static and temporal network properties and substructures that can be linked to the tendencies of homophily and social influence in the social network, and we herein investigate those. Concretely, we tackle a computer-mediated communication network (based on Twitter mentions) and a particular type of assortative mixing that can be inferred from the semantic features of communication content, which we term semantic homophily. Our work, to the best of our knowledge, is the first to offer an in-depth analysis of semantic homophily in a communication network and of the interplay between the two. We quantify diverse levels of semantic homophily, identify the semantic aspects that drive the observed homophily, provide insights into its temporal evolution and, finally, present its intricate interplay with the communication network on Twitter. By analyzing these mechanisms, we improve the understanding of which semantic aspects shape human computer-mediated communication and how they do so.
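
As a rough illustration of the assortativity coefficient mentioned in this abstract, the sketch below computes it for a numeric node attribute on an undirected network, as the Pearson correlation of the attribute values found at the two ends of each edge. This is the generic textbook definition, not the paper's semantic homophily measure; the function name `attribute_assortativity`, the toy edges, and the per-user scores are assumptions for illustration.

```python
def attribute_assortativity(edges, value):
    """Pearson correlation of a numeric attribute across the ends of edges.

    edges: iterable of (u, v) pairs of an undirected network.
    value: dict mapping each node to a number (hypothetical attribute,
           e.g. a per-user score derived from message content).
    """
    xs, ys = [], []
    for u, v in edges:
        # Count every undirected edge in both directions so the
        # measure is symmetric in its two ends.
        xs += [value[u], value[v]]
        ys += [value[v], value[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same values, so one mean suffices
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n

    # Undefined (division by zero) when all edge ends share one value.
    return cov / var


# Toy network, not data from the paper: like connects with like.
scores = {"a": 1.0, "b": 1.0, "c": 0.0, "d": 0.0}
r_like = attribute_assortativity([("a", "b"), ("c", "d")], scores)
r_unlike = attribute_assortativity([("a", "c"), ("b", "d")], scores)
```

Values near 1 indicate that connected nodes tend to have similar attribute values, values near -1 indicate that they tend to differ, and values near 0 indicate no linear relationship.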