José M. Such

h-index49

21papers

620citations

Novelty35%

AI Score46

Ranked #36,622 of 194,257 authors (top 19%)#744 in CR (top 11%)

21 Papers

7.7LGFeb 21, 2023

MalProtect: Stateful Defense Against Adversarial Query Attacks in ML-based Malware Detection

Aqib Rashid, Jose Such

ML models are known to be vulnerable to adversarial query attacks. In these attacks, queries are iteratively perturbed towards a particular class without any knowledge of the target model besides its output. The prevalence of remotely-hosted ML classification models and Machine-Learning-as-a-Service platforms means that query attacks pose a real threat to the security of these systems. To deal with this, stateful defenses have been proposed to detect query attacks and prevent the generation of adversarial examples by monitoring and analyzing the sequence of queries received by the system. Several stateful defenses have been proposed in recent years. However, these defenses rely solely on similarity or out-of-distribution detection methods that may be effective in other domains. In the malware detection domain, the methods to generate adversarial examples are inherently different, and therefore we find that such detection mechanisms are significantly less effective. Hence, in this paper, we present MalProtect, which is a stateful defense against query attacks in the malware detection domain. MalProtect uses several threat indicators to detect attacks. Our results show that it reduces the evasion rate of adversarial query attacks by 80+\% in Android and Windows malware, across a range of attacker scenarios. In the first evaluation of its kind, we show that MalProtect outperforms prior stateful defenses, especially under the peak adversarial threat.

2.0LGFeb 1, 2023

Effectiveness of Moving Target Defenses for Adversarial Attacks in ML-based Malware Detection

Aqib Rashid, Jose Such

Several moving target defenses (MTDs) to counter adversarial ML attacks have been proposed in recent years. MTDs claim to increase the difficulty for the attacker in conducting attacks by regularly changing certain elements of the defense, such as cycling through configurations. To examine these claims, we study for the first time the effectiveness of several recent MTDs for adversarial ML attacks applied to the malware detection domain. Under different threat models, we show that transferability and query attack strategies can achieve high levels of evasion against these defenses through existing and novel attack strategies across Android and Windows. We also show that fingerprinting and reconnaissance are possible and demonstrate how attackers may obtain critical defense hyperparameters as well as information about how predictions are produced. Based on our findings, we present key recommendations for future work on the development of effective MTDs for adversarial attacks in ML-based malware detection.

5.9CYJun 13, 2025

Malicious LLM-Based Conversational AI Makes Users Reveal Personal Information

Xiao Zhan, Juan Carlos Carrillo, William Seymour et al.

LLM-based Conversational AIs (CAIs), also known as GenAI chatbots, like ChatGPT, are increasingly used across various domains, but they pose privacy risks, as users may disclose personal information during their conversations with CAIs. Recent research has demonstrated that LLM-based CAIs could be used for malicious purposes. However, a novel and particularly concerning type of malicious LLM application remains unexplored: an LLM-based CAI that is deliberately designed to extract personal information from users. In this paper, we report on the malicious LLM-based CAIs that we created based on system prompts that used different strategies to encourage disclosures of personal information from users. We systematically investigate CAIs' ability to extract personal information from users during conversations by conducting a randomized-controlled trial with 502 participants. We assess the effectiveness of different malicious and benign CAIs to extract personal information from participants, and we analyze participants' perceptions after their interactions with the CAIs. Our findings reveal that malicious CAIs extract significantly more personal information than benign CAIs, with strategies based on the social nature of privacy being the most effective while minimizing perceived risks. This study underscores the privacy threats posed by this novel type of malicious LLM-based CAIs and provides actionable recommendations to guide future research and practice.

8.1HCMar 10

Privacy and Safety Experiences and Concerns of U.S. Women Using Generative AI for Seeking Sexual and Reproductive Health Information

Ina Kaleva, Xiao Zhan, Ruba Abu-Salma et al.

The rapid adoption of generative AI (GenAI) chatbots has reshaped access to sexual and reproductive health (SRH) information, particularly following the overturning of Roe v. Wade, as individuals assigned female at birth increasingly turn to online sources. However, existing research remains largely model-centered, paying limited attention to user privacy and safety. We conducted semi-structured interviews with 18 U.S.-based participants from both restrictive and non-restrictive states who had used GenAI chatbots to seek SRH information. Adoption was influenced by perceived utility, usability, credibility, accessibility, and anthropomorphism, and many participants disclosed sensitive personal SRH details. Participants identified multiple privacy risks, including excessive data collection, government surveillance, profiling, model training, and data commodification. While most participants accepted these risks in exchange for perceived utility, abortion-related queries elicited heightened safety concerns. Few participants employed protective strategies beyond minimizing disclosures or deleting data. Based on these findings, we offer design and policy recommendations, such as health-specific features and stronger moderation practices, to enhance privacy and safety in GenAI-supported SRH information seeking.

14.3HCJan 23

Privacy in Human-AI Romantic Relationships: Concerns, Boundaries, and Agency

Rongjun Ma, Shijing He, Jose Luis Martin-Navarro et al.

An increasing number of LLM-based applications are being developed to facilitate romantic relationships with AI partners, yet the safety and privacy risks in these partnerships remain largely underexplored. In this work, we investigate privacy in human-AI romantic relationships through an interview study (N=17), examining participants' experiences and privacy perceptions across the three stages of exploration, intimacy, and dissolution, alongside an analysis of the platforms they used. We found that these relationships took varied forms, from one-to-one to one-to-many, and were shaped by multiple actors, including creators, platforms, and moderators. AI partners were perceived as having agency, actively negotiating privacy boundaries with participants and sometimes encouraging disclosure of personal details. As intimacy deepened, these boundaries became more permeable, though some participants expressed concerns such as conversation exposure and sought to preserve anonymity. Overall, AI platform affordances and diverse relational dynamics expand the privacy landscape, underscoring the need to rethink how privacy is constructed in human-AI romantic relationships.

10.9CLFeb 3, 2025

Towards Safer Chatbots: A Framework for Policy Compliance Evaluation of Custom GPTs

David Rodriguez, William Seymour, Jose M. Del Alamo et al.

Large Language Models (LLMs) have gained unprecedented prominence, achieving widespread adoption across diverse domains and integrating deeply into society. The capability to fine-tune general-purpose LLMs, such as Generative Pre-trained Transformers (GPT), for specific tasks has facilitated the emergence of numerous Custom GPTs. These tailored models are increasingly made available through dedicated marketplaces, such as OpenAI's GPT Store. However, their black-box nature introduces significant safety and compliance risks. In this work, we present a scalable framework for the automated evaluation of Custom GPTs against OpenAI's usage policies, which define the permissible behaviors of these systems. Our framework integrates three core components: (1) automated discovery and data collection of models from the GPT store, (2) a red-teaming prompt generator tailored to specific policy categories and the characteristics of each target GPT, and (3) an LLM-as-a-judge technique to analyze each prompt-response pair for potential policy violations. We validate our framework with a manually annotated ground truth, and evaluate it through a large-scale study with 782 Custom GPTs across three categories: Romantic, Cybersecurity, and Academic GPTs. Our manual annotation process achieved an F1 score of 0.975 in identifying policy violations, confirming the reliability of the framework's assessments. The results reveal that 58.7% of the analyzed models exhibit indications of non-compliance, exposing weaknesses in the GPT store's review and approval processes. Furthermore, our findings indicate that a model's popularity does not correlate with compliance, and non-compliance issues largely stem from behaviors inherited from base models rather than user-driven customizations. We believe this approach is extendable to other chatbot platforms and policy domains, improving LLM-based systems safety.

2.1AIDec 18, 2023

Moral Uncertainty and the Problem of Fanaticism

Jazon Szabo, Jose Such, Natalia Criado et al.

While there is universal agreement that agents ought to act ethically, there is no agreement as to what constitutes ethical behaviour. To address this problem, recent philosophical approaches to `moral uncertainty' propose aggregation of multiple ethical theories to guide agent behaviour. However, one of the foundational proposals for aggregation - Maximising Expected Choiceworthiness (MEC) - has been criticised as being vulnerable to fanaticism; the problem of an ethical theory dominating agent behaviour despite low credence (confidence) in said theory. Fanaticism thus undermines the `democratic' motivation for accommodating multiple ethical perspectives. The problem of fanaticism has not yet been mathematically defined. Representing moral uncertainty as an instance of social welfare aggregation, this paper contributes to the field of moral uncertainty by 1) formalising the problem of fanaticism as a property of social welfare functionals and 2) providing non-fanatical alternatives to MEC, i.e. Highest k-trimmed Mean and Highest Median.

1.2SIApr 2, 2024

A Holistic Indicator of Polarization to Measure Online Sexism

Vahid Ghafouri, Jose Such, Guillermo Suarez-Tangil

The online trend of the manosphere and feminist discourse on social networks requires a holistic measure of the level of sexism in an online community. This indicator is important for policymakers and moderators of online communities (e.g., subreddits) and computational social scientists, either to revise moderation strategies based on the degree of sexism or to match and compare the temporal sexism across different platforms and communities with real-time events and infer social scientific insights. In this paper, we build a model that can provide a comparable holistic indicator of toxicity targeted toward male and female identity and male and female individuals. Despite previous supervised NLP methods that require annotation of toxic comments at the target level (e.g. annotating comments that are specifically toxic toward women) to detect targeted toxic comments, our indicator uses supervised NLP to detect the presence of toxicity and unsupervised word embedding association test to detect the target automatically. We apply our model to gender discourse communities (e.g., r/TheRedPill, r/MGTOW, r/FemaleDatingStrategy) to detect the level of toxicity toward genders (i.e., sexism). Our results show that our framework accurately and consistently (93% correlation) measures the level of sexism in a community. We finally discuss how our framework can be generalized in the future to measure qualities other than toxicity (e.g. sentiment, humor) toward general-purpose targets and turn into an indicator of different sorts of polarizations.

4.6LGFeb 15, 2022

StratDef: Strategic Defense Against Adversarial Attacks in ML-based Malware Detection

Aqib Rashid, Jose Such

Over the years, most research towards defenses against adversarial attacks on machine learning models has been in the image recognition domain. The ML-based malware detection domain has received less attention despite its importance. Moreover, most work exploring these defenses has focused on several methods but with no strategy when applying them. In this paper, we introduce StratDef, which is a strategic defense system based on a moving target defense approach. We overcome challenges related to the systematic construction, selection, and strategic use of models to maximize adversarial robustness. StratDef dynamically and strategically chooses the best models to increase the uncertainty for the attacker while minimizing critical aspects in the adversarial ML domain, like attack transferability. We provide the first comprehensive evaluation of defenses against adversarial attacks on machine learning for malware detection, where our threat model explores different levels of threat, attacker knowledge, capabilities, and attack intensities. We show that StratDef performs better than other defenses even when facing the peak adversarial threat. We also show that, of the existing defenses, only a few adversarially-trained models provide substantially better protection than just using vanilla models but are still outperformed by StratDef.

3.8CRMar 12, 2021

Automating the GDPR Compliance Assessment for Cross-border Personal Data Transfers in Android Applications

Danny S. Guamán, Xavier Ferrer, Jose M. del Alamo et al.

The General Data Protection Regulation (GDPR) aims to ensure that all personal data processing activities are fair and transparent for the European Union (EU) citizens, regardless of whether these are carried out within the EU or anywhere else. To this end, it sets strict requirements to transfer personal data outside the EU. However, checking these requirements is a daunting task for supervisory authorities, particularly in the mobile app domain due to the huge number of apps available and their dynamic nature. In this paper, we propose a fully automated method to assess the compliance of mobile apps with the GDPR requirements for cross-border personal data transfers. We have applied the method to the top-free 10,080 apps from the Google Play Store. The results reveal that there is still a very significant gap between what app providers and third-party recipients do in practice and what is intended by the GDPR. A substantial 56% of analysed apps are potentially non-compliant with the GDPR cross-border transfer requirements.

8.8CRMar 3, 2021Code

SkillVet: Automated Traceability Analysis of Amazon Alexa Skills

Jide S Edu, Xavier Ferrer-Aran, Jose M Such et al.

Third-party software, or skills, are essential components in Smart Personal Assistants (SPA). The number of skills has grown rapidly, dominated by a changing environment that has no clear business model. Skills can access personal information and this may pose a risk to users. However, there is little information about how this ecosystem works, let alone the tools that can facilitate its study. In this paper, we present the largest systematic measurement of the Amazon Alexa skill ecosystem to date. We study developers' practices in this ecosystem, including how they collect and justify the need for sensitive information, by designing a methodology to identify over-privileged skills with broken privacy policies. We collect 199,295 Alexa skills and uncover that around 43% of the skills (and 50% of the developers) that request these permissions follow bad privacy practices, including (partially) broken data permissions traceability. In order to perform this kind of analysis at scale, we present SkillVet that leverages machine learning and natural language processing techniques, and generates high-accuracy prediction sets. We report a number of concerning practices including how developers can bypass Alexa's permission system through account linking and conversational skills, and offer recommendations on how to improve transparency, privacy and security. Resulting from the responsible disclosure we have conducted, 13% of the reported issues no longer pose a threat at submission time.

2.9CRDec 7, 2020

The Challenges with Internet of Things for Business

Ievgeniia Kuzminykh, Bogdan Ghita, Jose M. Such

Many companies consider IoT as a central element for increasing competitiveness. Despite the growing number of cyberattacks on IoT devices and the importance of IoT security, no study has yet primarily focused on the impact of IoT security measures on the security challenges. This paper presents a review of the current state of security of IoT in companies that produce IoT products and have begun a transformation towards the digitalization of their products and the associated production processes. The analysis of challenges in IoT security was conducted based on the review of resources and reports on IoT security, while mapping the relevant solutions/measures for strengthening security to the existing challenges. This mapping assists stakeholders in understanding the IoT security initiatives regarding their business needs and issues. Based on the analysis, we conclude that almost all companies have an understanding of basic security measures as encryption, but do not understand threat surface and not aware of advanced methods of protecting data and devices. The analysis shows that most companies do not have internal experts in IoT security and prefer to outsource security operations to security providers.

0.5CLOct 27, 2020Code

Discovering and Interpreting Biased Concepts in Online Communities

Xavier Ferrer-Aran, Tom van Nuenen, Natalia Criado et al.

Language carries implicit human biases, functioning both as a reflection and a perpetuation of stereotypes that people carry with them. Recently, ML-based NLP methods such as word embeddings have been shown to learn such language biases with striking accuracy. This capability of word embeddings has been successfully exploited as a tool to quantify and study human biases. However, previous studies only consider a predefined set of biased concepts to attest (e.g., whether gender is more or less associated with particular jobs), or just discover biased words without helping to understand their meaning at the conceptual level. As such, these approaches can be either unable to find biased concepts that have not been defined in advance, or the biases they find are difficult to interpret and study. This could make existing approaches unsuitable to discover and interpret biases in online communities, as such communities may carry different biases than those in mainstream culture. This paper improves upon, extends, and evaluates our previous data-driven method to automatically discover and help interpret biased concepts encoded in word embeddings. We apply this approach to study the biased concepts present in the language used in online communities and experimentally show the validity and stability of our method

13.0CYAug 11, 2020

Bias and Discrimination in AI: a cross-disciplinary perspective

Xavier Ferrer, Tom van Nuenen, Jose M. Such et al.

With the widespread and pervasive use of Artificial Intelligence (AI) for automated decision-making systems, AI bias is becoming more apparent and problematic. One of its negative consequences is discrimination: the unfair, or unequal treatment of individuals based on certain characteristics. However, the relationship between bias and discrimination is not always clear. In this paper, we survey relevant literature about bias and discrimination in AI from an interdisciplinary perspective that embeds technical, legal, social and ethical dimensions. We show that finding solutions to bias and discrimination in AI requires robust cross-disciplinary collaborations.

2.7CLAug 6, 2020Code

Discovering and Categorising Language Biases in Reddit

Xavier Ferrer, Tom van Nuenen, Jose M. Such et al.

We present a data-driven approach using word embeddings to discover and categorise language biases on the discussion platform Reddit. As spaces for isolated user communities, platforms such as Reddit are increasingly connected to issues of racism, sexism and other forms of discrimination. Hence, there is a need to monitor the language of these groups. One of the most promising AI approaches to trace linguistic biases in large textual datasets involves word embeddings, which transform text into high-dimensional dense vectors and capture semantic relations between words. Yet, previous studies require predefined sets of potential biases to study, e.g., whether gender is more or less associated with particular types of jobs. This makes these approaches unfit to deal with smaller and community-centric datasets such as those on Reddit, which contain smaller vocabularies and slang, as well as biases that may be particular to that community. This paper proposes a data-driven approach to automatically discover language biases encoded in the vocabulary of online discourse communities on Reddit. In our approach, protected attributes are connected to evaluative words found in the data, which are then categorised through a semantic analysis system. We verify the effectiveness of our method by comparing the biases we discover in the Google News dataset with those found in previous literature. We then successfully discover gender bias, religion bias, and ethnic bias in different Reddit communities. We conclude by discussing potential application scenarios and limitations of this data-driven bias discovery method.

4.1AIJul 14, 2020Code

A Normative approach to Attest Digital Discrimination

Natalia Criado, Xavier Ferrer, Jose M. Such

Digital discrimination is a form of discrimination whereby users are automatically treated unfairly, unethically or just differently based on their personal data by a machine learning (ML) system. Examples of digital discrimination include low-income neighbourhood's targeted with high-interest loans or low credit scores, and women being undervalued by 21% in online marketing. Recently, different techniques and tools have been proposed to detect biases that may lead to digital discrimination. These tools often require technical expertise to be executed and for their results to be interpreted. To allow non-technical users to benefit from ML, simpler notions and concepts to represent and reason about digital discrimination are needed. In this paper, we use norms as an abstraction to represent different situations that may lead to digital discrimination. In particular, we formalise non-discrimination norms in the context of ML systems and propose an algorithm to check whether ML systems violate these norms.

8.5AISep 10, 2019

Attesting Biases and Discrimination using Language Semantics

Xavier Ferrer Aran, Jose M. Such, Natalia Criado

AI agents are increasingly deployed and used to make automated decisions that affect our lives on a daily basis. It is imperative to ensure that these systems embed ethical principles and respect human values. We focus on how we can attest to whether AI agents treat users fairly without discriminating against particular individuals or groups through biases in language. In particular, we discuss human unconscious biases, how they are embedded in language, and how AI systems inherit those biases by learning from and processing human language. Then, we outline a roadmap for future research to better understand and attest problematic AI biases derived from language.

15.6CRMar 13, 2019

Smart Home Personal Assistants: A Security and Privacy Review

Jide S. Edu, Jose M. Such, Guillermo Suarez-Tangil

Smart Home Personal Assistants (SPA) are an emerging innovation that is changing the way in which home users interact with the technology. However, there are a number of elements that expose these systems to various risks: i) the open nature of the voice channel they use, ii) the complexity of their architecture, iii) the AI features they rely on, and iv) their use of a wide-range of underlying technologies. This paper presents an in-depth review of the security and privacy issues in SPA, categorizing the most important attack vectors and their countermeasures. Based on this, we discuss open research challenges that can help steer the community to tackle and address current security and privacy issues in SPA. One of our key findings is that even though the attack surface of SPA is conspicuously broad and there has been a significant amount of recent research efforts in this area, research has so far focused on a small part of the attack surface, particularly on issues related to the interaction between the user and the SPA devices. We also point out that further research is needed to tackle issues related to authorization, speech recognition or profiling, to name a few. To the best of our knowledge, this is the first article to conduct such a comprehensive review and characterization of the security and privacy issues and countermeasures of SPA.

6.8CRJan 17, 2019

The Security of Smart Buildings: a Systematic Literature Review

Pierre Ciholas, Aidan Lennie, Parvin Sadigova et al.

Smart Buildings are networks of connected devices and software in charge of automatically managing and controlling several building functions such as HVAC, fire alarms, lighting, shading and more. These systems evolved from mostly electronic and mechanical elements to complex systems relying on IT and wireless technologies and networks. This exposes smart buildings to new risks and threats that need to be enumerated and addressed. Research efforts have been done in several areas related to security in smart buildings but a clear overview of the research field is missing. In this paper, we present the results of a systematic literature review that provides a thorough understanding of the state of the art in research on the security of smart buildings. We found that the field of smart buildings security is growing significantly in complexity due to the many protocols introduced recently and that the research community is already studying. We also found an important lack of empirical evaluations, though evaluations on testbeds and real systems seems to be growing. Finally, we found an almost complete lack of consideration of non-technical aspects, such as social, organisational, and human factors, which are crucial in this type of systems, where ownership and liability is not always clear.

1.2MAMay 15, 2015

Norm Monitoring under Partial Action Observability

Natalia Criado, Jose M. Such

In the context of using norms for controlling multi-agent systems, a vitally important question that has not yet been addressed in the literature is the development of mechanisms for monitoring norm compliance under partial action observability. This paper proposes the reconstruction of unobserved actions to tackle this problem. In particular, we formalise the problem of reconstructing unobserved actions, and propose an information model and algorithms for monitoring norms under partial action observability using two different processes for reconstructing unobserved actions. Our evaluation shows that reconstructing unobserved actions increases significantly the number of norm violations and fulfilments detected.

5.9SIFeb 9, 2015

Implicit Contextual Integrity in Online Social Networks

Natalia Criado, Jose M. Such

Many real incidents demonstrate that users of Online Social Networks need mechanisms that help them manage their interactions by increasing the awareness of the different contexts that coexist in Online Social Networks and preventing them from exchanging inappropriate information in those contexts or disseminating sensitive information from some contexts to others. Contextual integrity is a privacy theory that conceptualises the appropriateness of information sharing based on the contexts in which this information is to be shared. Computational models of Contextual Integrity assume the existence of well-defined contexts, in which individuals enact pre-defined roles and information sharing is governed by an explicit set of norms. However, contexts in Online Social Networks are known to be implicit, unknown a priori and ever changing; users relationships are constantly evolving; and the information sharing norms are implicit. This makes current Contextual Integrity models not suitable for Online Social Networks. In this paper, we propose the first computational model of Implicit Contextual Integrity, presenting an information model and an Information Assistant Agent that uses the information model to learn implicit contexts, relationships and the information sharing norms to help users avoid inappropriate information exchanges and undesired information disseminations. Through an experimental evaluation, we validate the properties of Information Assistant Agents, which are shown to: infer the information sharing norms even if a small proportion of the users follow the norms and in presence of malicious users; help reduce the exchange of inappropriate information and the dissemination of sensitive information with only a partial view of the system and the information received and sent by their users; and minimise the burden to the users in terms of raising unnecessary alerts.