HC Apr 7, 2022
GreaseVision: Rewriting the Rules of the Interface
Siddhartha Datta, Konrad Kollnig, Nigel Shadbolt
Digital harms can manifest across any interface. Key problems in addressing these harms include the high individuality of harms and the fast-changing nature of digital systems. As a result, we still lack a systematic approach to studying harms and producing interventions for end-users. We put forward GreaseVision, a new framework that enables end-users to collaboratively develop interventions against harms in software using a no-code approach and recent advances in few-shot machine learning. The framework and tool allow individual end-users to study their usage history and create personalized interventions. Our contribution also enables researchers to study the distribution of harms and interventions at scale.
56.9 CY Apr 17
Can the GPC standard eliminate consent banners in the EU?
Sebastian Zimmeck, Harshvardhan J. Pandit, Frederik Zuiderveen Borgesius et al.
In the EU, the General Data Protection Regulation and the ePrivacy Directive mandate consent for the use of personal data for the purpose of behavioural advertising and tracking technologies. However, the ubiquity of consent banners has led to widespread consent fatigue and questions about the effectiveness of these mechanisms in protecting data subjects' data. To simplify digital laws and make the EU more competitive, the EU Commission recently proposed the Digital Omnibus, introducing a new Article 88b GDPR to express data subjects' choices in a technical way. While the Digital Omnibus is under legislative negotiation, California residents and residents of other US states can already exercise their rights via Global Privacy Control (GPC), a privacy signal to automatically broadcast a legally binding opt-out request to websites. In light of the Digital Omnibus, we evaluate to what extent GPC can be adapted to the EU legal framework to reduce consent banners, mitigate consent fatigue, and improve data protection for EU users. GPC is based on a technical specification, currently being standardised at the World Wide Web Consortium. By sending a GPC signal, data subjects can express their refusal or withdrawal of consent under the GDPR to the use of their personal data for cross-context ad targeting and, in some cases, their objection under the GDPR against the use of their data for such purposes. Our evaluation identifies friction between the GPC specification and current EU data protection law. In the longer term, it would be possible for the EU legislator to amend EU laws, as proposed in the current Digital Omnibus, in such a way that internet users can use automated signals to express choices about personal data use and online tracking. In the shorter term, websites and companies that conduct online tracking can already honour GPC.
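As a rough illustration of what honouring the signal could look like on the server side, the sketch below checks for the `Sec-GPC: 1` request header defined in the GPC specification. The function names and the returned labels (`"refused"`, `"ask_user"`) are hypothetical and not taken from the paper; how a site must legally react to the signal is exactly the question the abstract discusses.

```python
from typing import Mapping


def gpc_signal_present(headers: Mapping[str, str]) -> bool:
    """Return True if the request headers carry a GPC signal (Sec-GPC: 1)."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "sec-gpc":
            return value.strip() == "1"
    return False


def decide_tracking(headers: Mapping[str, str]) -> str:
    """Hypothetical policy: treat GPC as a refusal, otherwise ask the user.

    The labels are illustrative assumptions, not legal advice.
    """
    if gpc_signal_present(headers):
        return "refused"   # do not track; no banner needed for this purpose
    return "ask_user"      # no signal: fall back to a consent request


if __name__ == "__main__":
    print(decide_tracking({"Sec-GPC": "1"}))
    print(decide_tracking({"User-Agent": "example"}))
```

In a browser, the same preference is exposed to scripts as `navigator.globalPrivacyControl`.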
MM Nov 15, 2025
Can LLMs Create Legally Relevant Summaries and Analyses of Videos?
Lyra Hoeben-Kuil, Gijs van Dijck, Jaromir Savelka et al.
Understanding the legally relevant factual basis of an event and conveying it through text is a key skill of legal professionals. This skill is important for preparing forms (e.g., insurance claims) or other legal documents (e.g., court claims), but often presents a challenge for laypeople. Current AI approaches aim to bridge this gap, but mostly rely on the user to articulate what has happened in text, which may be challenging for many. Here, we investigate the capability of large language models (LLMs) to understand and summarize events occurring in videos. We ask an LLM to summarize and draft legal letters, based on 120 YouTube videos showing legal issues in various domains. Overall, 71.7% of the summaries were rated as of high or medium quality, which is a promising result, opening the door to a number of applications in areas such as access to justice.
37.8 CR Apr 8
Understanding Data Collection, Brokerage, and Spam in the Lead Marketing Ecosystem
Yash Vekaria, Nurullah Demir, Konrad Kollnig et al.
The lead marketing ecosystem enables collection, sale, and use of personal data submitted via web forms to deliver personalized quotes in high-value verticals such as insurance. Despite its scale and the sensitivity of the collected data, this ecosystem remains largely unexplored by the research community. We present the first empirical study of privacy and spam risks in lead marketing, developing an end-to-end measurement framework to trace data flows from data collection to consumer contact. Our setup instruments over 100 health-related lead-generation websites and monitors 200 controlled phone numbers and email addresses to understand downstream marketing practices. We observe sharing of highly personal and sensitive health information with more than 70 distinct third parties on these lead-generation websites. By purchasing our own and other organic leads from three major lead platforms, we uncover deceptive brokerage practices, where consumer data is sold to unvetted buyers and often augmented or fabricated with attributes such as health status and weight. We received a total of over 8,000 telemarketing phone calls, 600 text messages, and 200 emails, where calls often began within seconds of form submission. Many campaigns relied on VoIP-based neighbor spoofing and high-frequency dialing, at times rendering phones unusable. Our experiments with phone and email opt-outs suggest that phone-based opt-outs help the most, although all were ineffective at completely stopping marketing communications. Analysis of 7,432 Better Business Bureau (BBB) complaints and reviews corroborates these findings from the consumer perspective. Overall, our results reveal a highly interconnected and non-compliant lead marketing ecosystem that aggressively monetizes sensitive consumer data.
62.8 CY Mar 11
Is your AI Model Accurate Enough? The Difficult Choices Behind Rigorous AI Development and the EU AI Act
Lucas G. Uberti-Bona Marin, Bram Rijsbosch, Kristof Meding et al.
Technical and legal debates frequently suggest that "accuracy" is an objective, measurable, and purely technical property. We challenge this view, showing that evaluating AI performance fundamentally depends on context-dependent normative decisions. These techno-normative choices are crucial for rigorous AI deployment, as they determine which errors are prioritised, how risks are distributed, and how trade-offs between competing objectives are resolved. This paper provides a legal-technical analysis of the choices that shape how accuracy is defined, measured, and assessed, using the 2024 European Union AI Act (which mandates an "appropriate level of accuracy" for high-risk systems) as a primary case study. We identify and analyse four choices central to any robust performance evaluation: (1) selecting metrics, (2) balancing multiple metrics, (3) measuring metrics against representative data, and (4) determining acceptance thresholds. For each choice, we study its relationship to the AI Act's accuracy requirement and associated documentation obligations, show how its technical implementation embeds implicit or explicit assumptions about acceptable risks, errors, and trade-offs, and discuss the implications for the practical implementation of the AI Act through examples and related technical standards. By making the techno-normative dimensions of accuracy explicit, this paper contributes to broader interdisciplinary debates on AI governance and regulation, and offers specific guidance for regulators, auditors, and developers tasked with translating (legal) safety requirements into technical practice.
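To make the first and fourth choices concrete, the following is a minimal sketch, not the paper's method, of how the choice of metric and threshold changes the verdict on the same model. The data, the 0.90 threshold, and the function names are made up for illustration.

```python
def evaluate(y_true, y_pred):
    """Compute accuracy, precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Convention assumed here: a ratio with an empty denominator is 0.0.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def passes(metrics, metric_name, threshold):
    """Acceptance test: which metric and threshold to use is a normative choice."""
    return metrics[metric_name] >= threshold


if __name__ == "__main__":
    # Illustrative imbalanced data: 5 positives, 95 negatives.
    y_true = [1] * 5 + [0] * 95
    # A trivial model that always predicts the negative class.
    y_pred = [0] * 100
    m = evaluate(y_true, y_pred)
    print(m)
    print(passes(m, "accuracy", 0.90))  # accepted if judged on accuracy
    print(passes(m, "recall", 0.90))    # rejected if judged on recall
```

The same model is accepted under one metric and rejected under another, so the selection itself encodes which errors matter.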
46.4 HC May 7
Exploring the "Banality" of Deception in Generative AI
Ishitaa Narwane, Johanna Gunawan, Konrad Kollnig
Current approaches to addressing deceptive design largely focus on visible interface manipulations, commonly referred to as "dark patterns". With the rise of generative AI, deception is becoming more difficult to spot and easier to live with, as it is quietly embedded in default settings, automated suggestions, and conversational interactions rather than discrete interface elements. These subtle, normalised forms of influence, which Simone Natale frames as "banal deception", shape everyday digital use and blur the line between AI-enabled assistance and manipulation. This position paper explores banality as a lens through which to reason through deception in generative AI experiences, especially with chatbots. We explore what Natale describes as users' own involvement in their deception, and argue that this perspective could lead to future work for introducing friction to safeguard users from deception in generative AI interactions, such as empowering users through raising awareness, providing them with intervention tools, and regulatory or enforcement improvements. We present these concepts as points for discussion for the deceptive design scholarly community.
LG Jun 20, 2023
Exploring Antitrust and Platform Power in Generative AI
Konrad Kollnig, Qian Li
The concentration of power in a few digital technology companies has become a subject of increasing interest in both academic and non-academic discussions. One of the most noteworthy contributions to the debate is Lina Khan's "Amazon's Antitrust Paradox". In this work, Khan contends that Amazon has systematically exerted its dominance in online retail to eliminate competitors and subsequently charge above-market prices. This work contributed to Khan's appointment as the chair of the US Federal Trade Commission (FTC), one of the most influential antitrust organisations. Today, several ongoing antitrust lawsuits in the US and Europe involve major technology companies like Apple, Google/Alphabet, and Facebook/Meta. In the realm of generative AI, we are once again witnessing the same companies taking the lead in technological advancements, leaving little room for others to compete. This article examines the market dominance of these corporations in the technology stack behind generative AI from an antitrust law perspective.
NA Oct 21, 2017
Constrained Optimisation of Rational Functions for Accelerating Subspace Iteration
Konrad Kollnig
Earlier this decade, the so-called FEAST algorithm was released for computing the eigenvalues of a matrix in a given interval. Previously, rational filter functions have been examined as a parameter of FEAST. In this thesis, we expand on existing work with the following contributions: (i) Obtaining well-performing rational filter functions via standard minimisation algorithms, (ii) Obtaining constrained rational filter functions efficiently, and (iii) Improving existing rational filter functions algorithmically. Using our new rational filter functions, FEAST requires up to one quarter fewer iterations on average compared to state-of-the-art rational filter functions.
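For orientation, the role of a rational filter in FEAST-style subspace iteration can be written down as follows. This is the standard textbook form for a Hermitian matrix $A$ and search interval $[a, b]$, not notation taken from the thesis; the symbols $q$, $w_j$, $z_j$ and $m$ are assumptions, and sign conventions for the poles vary between references.

```latex
% Rational filter with q poles z_j and weights w_j, approximating the
% indicator function of the search interval [a, b]:
r(z) = \sum_{j=1}^{q} \frac{w_j}{z_j - z} \;\approx\; \chi_{[a,b]}(z)

% Applied to the matrix, each term costs one shifted linear solve:
r(A) = \sum_{j=1}^{q} w_j \, (z_j I - A)^{-1}

% r(A) has the eigenvectors of A and eigenvalues r(\lambda_i). Ordering them as
% |r(\lambda_1)| \ge |r(\lambda_2)| \ge \dots, subspace iteration with an
% m-dimensional subspace converges for the i-th eigenpair at a rate of roughly
\frac{|r(\lambda_{m+1})|}{|r(\lambda_i)|}, \qquad i \le m .
```

A filter that is close to 1 inside the interval and decays quickly outside it makes this ratio small, which is why the choice of $w_j$ and $z_j$ affects the iteration count.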
CY Aug 26, 2025
Are Companies Taking AI Risks Seriously? A Systematic Analysis of Companies' AI Risk Disclosures in SEC 10-K forms
Lucas G. Uberti-Bona Marin, Bram Rijsbosch, Gerasimos Spanakis et al.
As Artificial Intelligence becomes increasingly central to corporate strategies, concerns over its risks are growing too. In response, regulators are pushing for greater transparency in how companies identify, report and mitigate AI-related risks. In the US, the Securities and Exchange Commission (SEC) has repeatedly warned companies to provide their investors with more accurate disclosures of AI-related risks; recent enforcement and litigation against companies' misleading AI claims reinforce these warnings. In the EU, new laws, such as the AI Act and Digital Services Act, have introduced additional rules on AI risk reporting and mitigation. Given these developments, it is essential to examine if and how companies report AI-related risks to the public. This study presents the first large-scale systematic analysis of AI risk disclosures in SEC 10-K filings, which require public companies to report material risks to their company. We analyse over 30,000 filings from more than 7,000 companies over the past five years, combining quantitative and qualitative analysis. Our findings reveal a sharp increase in the companies that mention AI risk, up from 4% in 2020 to over 43% in the most recent 2024 filings. While legal and competitive AI risks are the most frequently mentioned, we also find growing attention to societal AI risks, such as cyberattacks, fraud, and technical limitations of AI systems. However, many disclosures remain generic or lack details on mitigation strategies, echoing concerns raised recently by the SEC about the quality of AI-related risk reporting. To support future research, we publicly release a web-based tool for easily extracting and analysing keyword-based disclosures across SEC filings.
CY Mar 23, 2025
Adoption of Watermarking for Generative AI Systems in Practice and Implications under the new EU AI Act
Bram Rijsbosch, Gijs van Dijck, Konrad Kollnig
AI-generated images have become so good in recent years that individuals often can no longer distinguish them from "real" images. This development, combined with the rapid spread of AI-generated content online, creates a series of societal risks. Watermarking, a technique that involves embedding information within images and other content to indicate their AI-generated nature, has emerged as a primary mechanism to address the risks posed by AI-generated content. Indeed, watermarking and AI labelling measures are now becoming a legal requirement in many jurisdictions, including under the 2024 European Union AI Act. Despite the widespread use of AI image generation systems, the practical implications and the current status of implementation of these measures remain largely unexamined. The present paper therefore provides both an empirical and a legal analysis of these measures. In our legal analysis, we identify four categories of generative AI deployment scenarios and outline how the legal obligations could apply in each category. In our empirical analysis, we find that only a minority of AI image generators currently implement adequate watermarking (38%) and deep fake labelling (18%) practices. In response, we suggest a range of avenues for improving the implementation of these legally mandated techniques, and publicly share our tooling for the detection of watermarks in images.
HC Dec 20, 2021
Mind-proofing Your Phone: Navigating the Digital Minefield with GreaseTerminator
Siddhartha Datta, Konrad Kollnig, Nigel Shadbolt
Digital harms are widespread in the mobile ecosystem. As these devices gain ever more prominence in our daily lives, so too does the potential for malicious attacks against individuals. The last line of defense against a range of digital harms (including digital distraction, political polarisation through hate speech, and children being exposed to damaging material) is the user interface. This work introduces GreaseTerminator to enable researchers to develop, deploy, and test interventions against these harms with end-users. We demonstrate the ease of intervention development and deployment, as well as the broad range of harms potentially covered with GreaseTerminator in five in-depth case studies.
CR Nov 15, 2021
Tracking in apps' privacy policies
Konrad Kollnig
Data protection law, including the General Data Protection Regulation (GDPR), usually requires a privacy policy before data can be collected from individuals. We analysed 15,145 privacy policies from 26,910 mobile apps in May 2019 (about one year after the GDPR came into force), finding that merely opening the policy webpages shares data with third parties for 48.5% of policies, potentially violating the GDPR. We compare this data sharing across countries, payment models (free, in-app-purchases, paid) and platforms (Google Play Store, Apple App Store). We further contacted 52 developers of apps that did not provide a privacy policy and asked them about their data practices. Despite being legally required to answer such queries, 12 developers (23%) failed to respond.
CR Sep 28, 2021
Are iPhones Really Better for Privacy? Comparative Study of iOS and Android Apps
Konrad Kollnig, Anastasia Shuba, Reuben Binns et al.
While many studies have looked at privacy properties of the Android and Google Play app ecosystem, much less is known about iOS and the Apple App Store, the most widely used ecosystem in the US. At the same time, there is increasing competition around privacy between these smartphone operating system providers. In this paper, we present a study of 24k Android and iOS apps from 2020 along several dimensions relating to user privacy. We find that third-party tracking and the sharing of unique user identifiers were widespread in apps from both ecosystems, even in apps aimed at children. In the children's category, iOS apps tended to use less advertising-related tracking than their Android counterparts, but could more often access children's location. Across all studied apps, our study highlights widespread potential violations of US, EU and UK privacy law, including 1) the use of third-party tracking without user consent, 2) the lack of parental consent before sharing personally identifiable information (PII) with third parties in children's apps, 3) the non-data-minimising configuration of tracking libraries, 4) the sending of personal data to countries without an adequate level of data protection, and 5) the continued absence of transparency around tracking, partly due to design decisions by Apple and Google. Overall, we find that neither platform is clearly better than the other for privacy across the dimensions we studied.
HC Feb 23, 2021
I Want My App That Way: Reclaiming Sovereignty Over Personal Devices
Konrad Kollnig, Siddhartha Datta, Max Van Kleek
Dark patterns in mobile apps take advantage of cognitive biases of end-users and can have detrimental effects on people's lives. Despite growing research in identifying remedies for dark patterns and established solutions for desktop browsers, there exists no established methodology to reduce dark patterns in mobile apps. Our work introduces GreaseDroid, a community-driven app modification framework enabling non-expert users to disable dark patterns in apps selectively.