Sander Noels

CL
h-index37
5papers
134citations
Novelty44%
AI Score39

5 Papers

CLMar 12Code
VIGIL: An Extensible System for Real-Time Detection and Mitigation of Cognitive Bias Triggers

Bo Kang, Sander Noels, Tijl De Bie

The rise of generative AI is posing increasing risks to online information integrity and civic discourse. Most concretely, such risks can materialise in the form of mis- and disinformation. As a mitigation, media-literacy and transparency tools have been developed to address factuality of information and the reliability and ideological leaning of information sources. However, a subtler but possibly no less harmful threat to civic discourse is to use of persuasion or manipulation by exploiting human cognitive biases and related cognitive limitations. To the best of our knowledge, no tools exist to directly detect and mitigate the presence of triggers of such cognitive biases in online information. We present VIGIL (VIrtual GuardIan angeL), the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference from fully offline to cloud. VIGIL is built to be extensible with third-party plugins, with several plugins that are rigorously validated against NLP benchmarks are already included. It is open-sourced at https://github.com/aida-ugent/vigil.

CLOct 24, 2024
Large Language Models Reflect the Ideology of their Creators

Maarten Buyl, Alexander Rogiers, Sander Noels et al.

Large language models (LLMs) are trained on vast amounts of data to generate natural language, enabling them to perform tasks like text summarization and question answering. These models have become popular in artificial intelligence (AI) assistants like ChatGPT and already play an influential role in how humans access information. However, the behavior of LLMs varies depending on their design, training, and use. In this paper, we prompt a diverse panel of popular LLMs to describe a large number of prominent personalities with political relevance, in all six official languages of the United Nations. By identifying and analyzing moral assessments reflected in their responses, we find normative differences between LLMs from different geopolitical regions, as well as between the responses of the same LLM when prompted in different languages. Among only models in the United States, we find that popularly hypothesized disparities in political views are reflected in significant normative differences related to progressive values. Among Chinese models, we characterize a division between internationally- and domestically-focused models. Our results show that the ideological stance of an LLM appears to reflect the worldview of its creators. This poses the risk of political instrumentalization and raises concerns around technological and regulatory efforts with the stated aim of making LLMs ideologically 'unbiased'.

CLNov 11, 2024
Persuasion with Large Language Models: a Survey

Alexander Rogiers, Sander Noels, Maarten Buyl et al.

The rapid rise of Large Language Models (LLMs) has created new disruptive possibilities for persuasive communication, by enabling fully-automated personalized and interactive content generation at an unprecedented scale. In this paper, we survey the research field of LLM-based persuasion that has emerged as a result. We begin by exploring the different modes in which LLM Systems are used to influence human attitudes and behaviors. In areas such as politics, marketing, public health, e-commerce, and charitable giving, such LLM Systems have already achieved human-level or even super-human persuasiveness. We identify key factors influencing their effectiveness, such as the manner of personalization and whether the content is labelled as AI-generated. We also summarize the experimental designs that have been used to evaluate progress. Our survey suggests that the current and future potential of LLM-based persuasion poses profound ethical and societal risks, including the spread of misinformation, the magnification of biases, and the invasion of privacy. These risks underscore the urgent need for ethical guidelines and updated regulatory frameworks to avoid the widespread deployment of irresponsible and harmful LLM Systems.

CLApr 4, 2025
What Large Language Models Do Not Talk About: An Empirical Study of Moderation and Censorship Practices

Sander Noels, Guillaume Bied, Maarten Buyl et al.

Large Language Models (LLMs) are increasingly deployed as gateways to information, yet their content moderation practices remain underexplored. This work investigates the extent to which LLMs refuse to answer or omit information when prompted on political topics. To do so, we distinguish between hard censorship (i.e., generated refusals, error messages, or canned denial responses) and soft censorship (i.e., selective omission or downplaying of key elements), which we identify in LLMs' responses when asked to provide information on a broad range of political figures. Our analysis covers 14 state-of-the-art models from Western countries, China, and Russia, prompted in all six official United Nations (UN) languages. Our analysis suggests that although censorship is observed across the board, it is predominantly tailored to an LLM provider's domestic audience and typically manifests as either hard censorship or soft censorship (though rarely both concurrently). These findings underscore the need for ideological and geographic diversity among publicly available LLMs, and greater transparency in LLM moderation strategies to facilitate informed user choices. All data are made freely available.

CEApr 19, 2024
TopoLedgerBERT: Topological Learning of Ledger Description Embeddings using Siamese BERT-Networks

Sander Noels, Sébastien Viaene, Tijl De Bie

This paper addresses a long-standing problem in the field of accounting: mapping company-specific ledger accounts to a standardized chart of accounts. We propose a novel solution, TopoLedgerBERT, a unique sentence embedding method devised specifically for ledger account mapping. This model integrates hierarchical information from the charts of accounts into the sentence embedding process, aiming to accurately capture both the semantic similarity and the hierarchical structure of the ledger accounts. In addition, we introduce a data augmentation strategy that enriches the training data and, as a result, increases the performance of our proposed model. Compared to benchmark methods, TopoLedgerBERT demonstrates superior performance in terms of accuracy and mean reciprocal rank.