Michel Cukier

h-index3

4papers

4citations

Novelty24%

AI Score38

Ranked #106,554 of 205,806 authors (top 52%)#19,257 in CL (top 59%)

4 Papers

70.4CRMay 27Code

S3C2 Summit 2025-07: Government Secure Supply Chain Summit

Sivana Hamer, Pat Morrison, William Enck et al.

Software supply chains, while providing immense economic and software development value, are only as strong as their weakest link. Over the past several years, there has been an exponential increase in cyberattacks specifically targeting vulnerable links in critical software supply chains. The attacks disrupt day-to-day functioning and threaten the security of nearly everyone on the internet, from billion-dollar companies and government agencies to hobbyist open-source developers. The evolving threat of software supply chain attacks has garnered interest from both the software industry and governments worldwide in improving software supply chain security. On Thursday, July 9th, 2025, 3 researchers from the NSF-backed Secure Software Supply Chain Center (S3C2) conducted a Secure Software Supply Chain Summit with a diverse set of 12 participants from 6 US government agencies. The goals of the Summit were: (1) to enable sharing between participants from different industries regarding practical experiences and challenges with software supply chain security; (2) to help form new collaborations; and (3) to learn about the challenges facing participants to inform our future research directions. The summit consisted of discussions of six topics relevant to the government agencies represented, including software bill of materials (SBOMs); compliance; malicious commits; build infrastructure; culture; and large language models (LLMs) and security. For each topic of discussion, we presented participants with a list of questions to spark conversation and an overview of the discussions of two industry summit held in the past year. In this report, we provide a summary of the summit. The initial discussion questions for each topic are provided in the appendi

23.0CLApr 17Code

Can LLMs Understand the Impact of Trauma? Costs and Benefits of LLMs Coding the Interviews of Firearm Violence Survivors

Jessica H. Zhu, Shayla Stringfield, Vahe Zaprosyan et al.

Firearm violence is a pressing public health issue, yet research into survivors' lived experiences remains underfunded and difficult to scale. Qualitative research, including in-depth interviews, is a valuable tool for understanding the personal and societal consequences of community firearm violence and designing effective interventions. However, manually analyzing these narratives through thematic analysis and inductive coding is time-consuming and labor-intensive. Recent advancements in large language models (LLMs) have opened the door to automating this process, though concerns remain about whether these models can accurately and ethically capture the experiences of vulnerable populations. In this study, we assess the use of open-source LLMs to inductively code interviews with 21 Black men who have survived community firearm violence. Our results demonstrate that while some configurations of LLMs can identify important codes, overall relevance remains low and is highly sensitive to data processing. Furthermore, LLM guardrails lead to substantial narrative erasure. These findings highlight both the potential and limitations of LLM-assisted qualitative coding and underscore the ethical challenges of applying AI in research involving marginalized communities.

AIFeb 14, 2024

Nutrition Facts, Drug Facts, and Model Facts: Putting AI Ethics into Practice in Gun Violence Research

Jessica Zhu, Michel Cukier, Joseph Richardson

Objective: Firearm injury research necessitates using data from often-exploited vulnerable populations of Black and Brown Americans. In order to minimize distrust, this study provides a framework for establishing AI trust and transparency with the general population. Methods: We propose a Model Facts template that is easily extendable and decomposes accuracy and demographics into standardized and minimally complex values. This framework allows general users to assess the validity and biases of a model without diving into technical model documentation. Examples: We apply the Model Facts template on two previously published models, a violence risk identification model and a suicide risk prediction model. We demonstrate the ease of accessing the appropriate information when the data is structured appropriately. Discussion: The Model Facts template is limited in its current form to human based data and biases. Like nutrition facts, it also will require some educational resources for users to grasp its full utility. Human computer interaction experiments should be conducted to ensure that the interaction between user interface and model interface is as desired. Conclusion: The Model Facts label is the first framework dedicated to establishing trust with end users and general population consumers. Implementation of Model Facts into firearm injury research will provide public health practitioners and those impacted by firearm injury greater faith in the tools the research provides.

CLJun 16, 2024

DocNet: Semantic Structure in Inductive Bias Detection Models

Jessica Zhu, Iain Cruickshank, Michel Cukier

News will be biased so long as people have opinions. As social media becomes the primary entry point for news and partisan differences increase, it is increasingly important for informed citizens to be able to recognize bias. If people are aware of the biases of the news they consume, they will be able to take action to avoid polarizing echo chambers. In this paper, we explore an often overlooked aspect of bias detection in media: the semantic structure of news articles. We present DocNet, a novel, inductive, and low-resource document embedding and political bias detection model. We also demonstrate that the semantic structure of news articles from opposing political sides, as represented in document-level graph embeddings, have significant similarities. DocNet bypasses the need for pre-trained language models, reducing resource dependency while achieving comparable performance. It can be used to advance political bias detection in low-resource environments. Our code and data are made available at: https://anonymous.4open.science/r/DocNet/