Joseph Aylett-Bullock

h-index11

7papers

84citations

Novelty33%

AI Score38

Ranked #84,072 of 194,257 authors (top 43%)#15,913 in CL (top 52%)

7 Papers

6.4LGApr 3Code

PRISM: LLM-Guided Semantic Clustering for High-Precision Topics

Connor Douglas, Utkucan Balci, Joseph Aylett-Bullock

In this paper, we propose Precision-Informed Semantic Modeling (PRISM), a structured topic modeling framework combining the benefits of rich representations captured by LLMs with the low cost and interpretability of latent semantic clustering methods. PRISM fine-tunes a sentence encoding model using a sparse set of LLM- provided labels on samples drawn from some corpus of interest. We segment this embedding space with thresholded clustering, yielding clusters that separate closely related topics within some narrow domain. Across multiple corpora, PRISM improves topic separability over state-of-the-art local topic models and even over clustering on large, frontier embedding models while requiring only a small number of LLM queries to train. This work contributes to several research streams by providing (i) a student-teacher pipeline to distill sparse LLM supervision into a lightweight model for topic discovery; (ii) an analysis of the efficacy of sampling strategies to improve local geometry for cluster separability; and (iii) an effective approach for web-scale text analysis, enabling researchers and practitioners to track nuanced claims and subtopics online with an interpretable, locally deployable framework.

2.3CYAug 24, 2022

Multi-AI Complex Systems in Humanitarian Response

Joseph Aylett-Bullock, Miguel Luengo-Oroz

AI is being increasingly used to aid response efforts to humanitarian emergencies at multiple levels of decision-making. Such AI systems are generally understood to be stand-alone tools for decision support, with ethical assessments, guidelines and frameworks applied to them through this lens. However, as the prevalence of AI increases in this domain, such systems will begin to encounter each other through information flow networks created by interacting decision-making entities, leading to multi-AI complex systems which are often ill understood. In this paper we describe how these multi-AI systems can arise, even in relatively simple real-world humanitarian response scenarios, and lead to potentially emergent and erratic erroneous behavior. We discuss how we can better work towards more trustworthy multi-AI systems by exploring some of the associated challenges and opportunities, and how we can design better mechanisms to understand and assess such systems. This paper is designed to be a first exposition on this topic in the field of humanitarian response, raising awareness, exploring the possible landscape of this domain, and providing a starting point for future work within the wider community.

0.2CLAug 11, 2021

Ensuring the Inclusive Use of Natural Language Processing in the Global Response to COVID-19

Alexandra Sasha Luccioni, Katherine Hoffmann Pham, Cynthia Sin Nga Lam et al.

Natural language processing (NLP) plays a significant role in tools for the COVID-19 pandemic response, from detecting misinformation on social media to helping to provide accurate clinical information or summarizing scientific research. However, the approaches developed thus far have not benefited all populations, regions or languages equally. We discuss ways in which current and future NLP approaches can be made more inclusive by covering low-resource languages, including alternative modalities, leveraging out-of-the-box tools and forming meaningful partnerships. We suggest several future directions for researchers interested in maximizing the positive societal impacts of NLP.

8.6HEP-PHJun 17, 2021Code

Optimising simulations for diphoton production at hadron colliders using amplitude neural networks

Joseph Aylett-Bullock, Simon Badger, Ryan Moodie

Machine learning technology has the potential to dramatically optimise event generation and simulations. We continue to investigate the use of neural networks to approximate matrix elements for high-multiplicity scattering processes. We focus on the case of loop-induced diphoton production through gluon fusion and develop a realistic simulation method that can be applied to hadron collider observables. Neural networks are trained using the one-loop amplitudes implemented in the NJet C++ library and interfaced to the Sherpa Monte Carlo event generator where we perform a detailed study for $2\to3$ and $2\to4$ scattering problems. We also consider how the trained networks perform when varying the kinematic cuts effecting the phase space and the reliability of the neural network simulations.

1.2CYAug 13, 2020

Considerations, Good Practices, Risks and Pitfalls in Developing AI Solutions Against COVID-19

Alexandra Luccioni, Joseph Bullock, Katherine Hoffmann Pham et al.

The COVID-19 pandemic has been a major challenge to humanity, with 12.7 million confirmed cases as of July 13th, 2020 [1]. In previous work, we described how Artificial Intelligence can be used to tackle the pandemic with applications at the molecular, clinical, and societal scales [2]. In the present follow-up article, we review these three research directions, and assess the level of maturity and feasibility of the approaches used, as well as their potential for operationalization. We also summarize some commonly encountered risks and practical pitfalls, as well as guidelines and best practices for formulating and deploying AI applications at different scales.

6.5CVJan 29, 2020

PulseSatellite: A tool using human-AI feedback loops for satellite image analysis in humanitarian contexts

Tomaz Logar, Joseph Bullock, Edoardo Nemni et al.

Humanitarian response to natural disasters and conflicts can be assisted by satellite image analysis. In a humanitarian context, very specific satellite image analysis tasks must be done accurately and in a timely manner to provide operational support. We present PulseSatellite, a collaborative satellite image analysis tool which leverages neural network models that can be retrained on-the fly and adapted to specific humanitarian contexts and geographies. We present two case studies, in mapping shelters and floods respectively, that illustrate the capabilities of PulseSatellite.

1.5CLJun 5, 2019

Automated Speech Generation from UN General Assembly Statements: Mapping Risks in AI Generated Texts

Joseph Bullock, Miguel Luengo-Oroz

Automated text generation has been applied broadly in many domains such as marketing and robotics, and used to create chatbots, product reviews and write poetry. The ability to synthesize text, however, presents many potential risks, while access to the technology required to build generative models is becoming increasingly easy. This work is aligned with the efforts of the United Nations and other civil society organisations to highlight potential political and societal risks arising through the malicious use of text generation software, and their potential impact on human rights. As a case study, we present the findings of an experiment to generate remarks in the style of political leaders by fine-tuning a pretrained AWD- LSTM model on a dataset of speeches made at the UN General Assembly. This work highlights the ease with which this can be accomplished, as well as the threats of combining these techniques with other technologies.