Liam Burke

CL · Jul 31, 2023

DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

Angus R. Williams, Hannah Rose Kirk, Liam Burke et al. · oxford

Public figures receive a disproportionate amount of abuse on social media, impacting their active participation in public life. Automated systems can identify abuse at scale but labelling training data is expensive, complex and potentially harmful. So, it is desirable that systems are efficient and generalisable, handling both shared and specific aspects of online abuse. We explore the dynamics of cross-group text classification in order to understand how well classifiers trained on one domain or demographic can transfer to others, with a view to building more generalisable abuse classifiers. We fine-tune language models to classify tweets targeted at public figures across DOmains (sport and politics) and DemOgraphics (women and men) using our novel DODO dataset, containing 28,000 labelled entries, split equally across four domain-demographic pairs. We find that (i) small amounts of diverse data are hugely beneficial to generalisation and model adaptation; (ii) models transfer more easily across demographics but models trained on cross-domain data are more generalisable; (iii) some groups contribute more to generalisability than others; and (iv) dataset similarity is a signal of transferability.

86.2 · NA · Mar 30

On the numerical stability of sketched GMRES

Liam Burke, Erin Carson, Yuxin Ma

We perform a backward stability analysis of preconditioned sketched GMRES [Nakatsukasa and Tropp, SIAM J. Matrix Anal. Appl., 2024] for solving linear systems $Ax=b$, and show that the backward stability at iteration $i$ depends on the conditioning of the Krylov basis $B_{1:i}$ as long as the condition number of $A B_{1:i}$ can be bounded by $1/O(u)$, where $u$ is the unit roundoff. Under this condition, we show that sketched GMRES is backward stable as long as the condition number of $B_{1:i}$ is not too large. Under additional assumptions, we then show that the stability of a restarted implementation of sketched GMRES can be independent of the condition number of $B_{1:i}$, and restarted sketched GMRES is backward stable. We also derive sharper bounds that better capture the attainable backward error, especially for cases when the basis $B_{1:i}$ is very ill-conditioned, which has been observed in the literature but not yet explained theoretically. We present numerical experiments to demonstrate the conclusions of our analysis, and also show that adaptively restarting where appropriate allows us to recover backward stability in sketched GMRES.
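To make the quantities in this abstract concrete, here is a minimal NumPy sketch of one unpreconditioned, unrestarted cycle of sketched GMRES. It is an illustration under stated assumptions, not the authors' implementation: the function names `sketched_gmres` and `backward_error`, the dense Gaussian sketch, the normalized power basis used for $B_{1:i}$, and the 2-norm form of the normwise backward error are all choices made here, not taken from the paper.

```python
import numpy as np


def sketched_gmres(A, b, num_iters, sketch_rows, seed=0):
    """One cycle of sketched GMRES from x0 = 0 (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = b.shape[0]

    # Assumption: a dense Gaussian sketch S with sketch_rows << n.
    # Practical codes typically use faster structured or sparse sketches.
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)

    # Assumption: the basis B is a normalized power basis of the Krylov
    # subspace. No orthogonalization is done, so B can be ill-conditioned.
    B = np.empty((n, num_iters))
    B[:, 0] = b / np.linalg.norm(b)
    for j in range(1, num_iters):
        w = A @ B[:, j - 1]
        B[:, j] = w / np.linalg.norm(w)

    # Sketched least-squares problem: min_y || S (A B y - b) ||_2.
    y, *_ = np.linalg.lstsq(S @ (A @ B), S @ b, rcond=None)
    return B @ y, B


def backward_error(A, x, b):
    """Normwise backward error ||b - Ax|| / (||A|| ||x|| + ||b||) in the 2-norm."""
    residual = np.linalg.norm(b - A @ x)
    return residual / (np.linalg.norm(A, 2) * np.linalg.norm(x) + np.linalg.norm(b))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 200
    # Hypothetical test matrix: a small random perturbation of the identity.
    A = np.eye(n) + 0.1 * rng.standard_normal((n, n)) / np.sqrt(n)
    b = rng.standard_normal(n)
    for i in (3, 8):
        x, B = sketched_gmres(A, b, num_iters=i, sketch_rows=80)
        print(i, np.linalg.cond(B), backward_error(A, x, b))
```

Printing `np.linalg.cond(B)` alongside the backward error gives a rough feel for the relationship the abstract describes between the conditioning of $B_{1:i}$ and the attainable backward error. The restarting strategies analysed in the paper are not reproduced here.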

2 Papers