Stefan Bechtold

h-index: 13

Papers: 4

Citations: 798

Novelty: 45%

AI Score: 44

Ranked #47,653 of 194,257 authors (top 25%) · #9,506 in CL (top 31%)

4 Papers

6.1 · CL · Jul 23, 2024 · Code

Lawma: The Power of Specialization for Legal Annotation

Ricardo Dominguez-Olmedo, Vedant Nanda, Rediet Abebe et al.

Annotation and classification of legal text are central components of empirical legal research. Traditionally, these tasks are often delegated to trained research assistants. Motivated by the advances in language modeling, empirical legal scholars are increasingly turning to prompting commercial models, hoping that it will alleviate the significant cost of human annotation. Despite growing use, our understanding of how to best utilize large language models for legal annotation remains limited. To bridge this gap, we introduce CaselawQA, a benchmark comprising 260 legal annotation tasks, nearly all new to the machine learning community. We demonstrate that commercial models, such as GPT-4.5 and Claude 3.7 Sonnet, achieve non-trivial yet highly variable accuracy, generally falling short of the performance required for legal work. We then demonstrate that small, lightly fine-tuned models outperform commercial models. A few hundred to a thousand labeled examples are usually enough to achieve higher accuracy. Our work points to a viable alternative to the predominant practice of prompting commercial models. For concrete legal annotation tasks with some available labeled data, researchers are likely better off using a fine-tuned open-source model.

14.8 · CY · Jul 16

Do Generative AI Assistants Respect robots.txt? Tracing Web Access Beyond Visible Answers

Gabriel Lopez-Fonseca, David Rodriguez, Stefan Bechtold et al.

AI assistants increasingly retrieve web content at inference time to provide fresh and grounded answers, yet it remains unclear whether these search-augmented capabilities respect website-owner restrictions expressed through robots.txt. We present a controlled empirical study of ten widely used AI assistants with advertised web-search capabilities. For each assistant, we first identify a configuration that actually produces observable web-browsing behavior and record the user-agent exposed during retrieval. We then evaluate compliance with controlled robots.txt rules across four complementary conditions: allowed for all user-agents, disallowed for all user-agents, allowed only for the assistant-specific user-agent, and disallowed only for that user-agent. Using server-side logs and secret codes embedded in target pages, we distinguish actual page access from user-visible answer correctness across 200 trials. Our results show substantial variation across assistants. Some systems followed the expected allowed/disallowed access pattern, whereas others accessed restricted resources without requesting robots.txt or used generic user-agents that complicated attribution. We also find that retrieval behavior and answer correctness can diverge: assistants may access pages without surfacing the retrieved content, or fail to access even allowed resources. These findings raise broader legal and governance concerns about whether AI-assisted web access adequately respects content owners' rights and restrictions. Furthermore, our observations provide valuable insight into the growing erosion of traditional web governance protocols, highlighting the urgent need for updated, enforceable standards that guarantee publisher autonomy in the age of search-augmented AI assistants.
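The four robots.txt conditions described in this abstract can be illustrated with a minimal sketch using Python's standard `urllib.robotparser`. It is not the authors' test harness. The user-agent string `ExampleAssistantBot`, the target URL, the rule texts, and the outcome labels are all hypothetical and chosen only for illustration. The sketch computes the access decision a compliant crawler would reach under each condition, then compares it with an observed server-side access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical names: neither the user-agent nor the URL comes from the paper.
ASSISTANT_UA = "ExampleAssistantBot"
TARGET_URL = "https://example.com/secret-page.html"

# One robots.txt body per condition named in the abstract.
CONDITIONS = {
    "allow_all": "User-agent: *\nAllow: /\n",
    "disallow_all": "User-agent: *\nDisallow: /\n",
    "allow_only_assistant": (
        f"User-agent: {ASSISTANT_UA}\nAllow: /\n\n"
        "User-agent: *\nDisallow: /\n"
    ),
    "disallow_only_assistant": (
        f"User-agent: {ASSISTANT_UA}\nDisallow: /\n\n"
        "User-agent: *\nAllow: /\n"
    ),
}


def expected_access(robots_txt: str, user_agent: str, url: str = TARGET_URL) -> bool:
    """Return True if a compliant crawler with this user-agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def classify(condition: str, accessed: bool, user_agent: str = ASSISTANT_UA) -> str:
    """Compare an observed server-side access with the compliant expectation.

    The labels are illustrative, not the paper's taxonomy.
    """
    allowed = expected_access(CONDITIONS[condition], user_agent)
    if accessed and not allowed:
        return "accessed restricted page"
    if not accessed and allowed:
        return "did not access allowed page"
    return "consistent with robots.txt"


if __name__ == "__main__":
    for name, rules in CONDITIONS.items():
        print(name, expected_access(rules, ASSISTANT_UA))
```

In a setup like the one the abstract describes, the `accessed` flag would come from server-side logs rather than from the assistant's visible answer, since the two can diverge.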

5.9 · CR · Jun 4

Credential Disclosure in (EU) Digital Identity Wallets: Privacy Risks and Practical Mitigations

Sheila Zingg, Daniele Lain, Yoshimichi Nakatsuka et al.

The European Union will introduce the EUDI Wallet by late 2026, which allows users to hold digital credentials (i.e., representations of physical official identity documents) on their devices. This will allow users to securely and privately disclose identity attributes to websites. Although such a system has many benefits, it also introduces risks caused by poor credential disclosure decisions. In this paper, we (i) conduct a large-scale survey on credential disclosure with users and experts and (ii) evaluate the effectiveness and feasibility of our Credential Assistant that displays expert recommendations and user opinions. Our results show that users are likely to overshare (e.g., ~20% of users disclosed their official ID to news websites). This indicates that users struggle to protect their privacy, which will impact the usability of the EUDI Wallet and lead to privacy violations, identity theft, and other abuses of leaked credentials. Finally, we show that our Credential Assistant significantly reduces users' credential disclosure mistakes from ~15% to ~7%. However, it does not fully eliminate poor credential disclosure decisions, indicating that stronger interventions may be necessary, especially for sensitive attributes.

7.2 · CY · Mar 20

Setting the Course, but Forgetting to Steer: Analyzing Compliance with GDPR's Right of Access to Data by Instagram, TikTok, and YouTube

Sai Keerthana Karnam, Abhisek Dash, Antariksh Das et al.

The GDPR's Right of Access aims to empower users with control over their personal data via Data Download Packages (DDPs). However, their effectiveness is often compromised by inconsistent platform implementations, questionable data reliability, and poor user comprehensibility. This paper conducts a comprehensive audit of DDPs from three social media platforms (TikTok, Instagram, and YouTube) to systematically assess these critical drawbacks. Despite offering similar services, we find that these platforms demonstrate significant inconsistencies in implementing the Right of Access, evident in varying levels of shared data. Critically, the failure to disclose processing purposes, retention periods, and other third-party data recipients serves as a further indicator of non-compliance. Our reliability evaluations, using bots and user-donated data, reveal that while TikTok's DDPs offer more consistent and complete data, others exhibit notable shortcomings. Similarly, our assessment of comprehensibility, based on surveys with 400 participants, indicates that current DDPs substantially fall short of GDPR's standards. To improve the comprehensibility, we propose and demonstrate a two-layered approach by (1) enhancing the data representation itself using stakeholder interpretations; and (2) incorporating a user-friendly extension (Know Your Data) for intuitive data visualization where users can control the level of transparency they prefer. Our findings underscore the need for clearer and non-conflicting regulatory guidance, stricter enforcement, and platform commitment to realize the goal of GDPR's Right of Access.