Rudy Arthur

SI
6 papers
135 citations
Novelty 35%
AI Score 40

6 Papers

HC Jul 10, 2024
The Language of Weather: Social Media Reactions to Weather Accounting for Climatic and Linguistic Baselines

James C. Young, Rudy Arthur, Hywel T. P. Williams

This study explores how different weather conditions influence public sentiment on social media, focusing on Twitter data from the UK. By considering climate and linguistic baselines, we improve the accuracy of weather-related sentiment analysis. Our findings show that emotional responses to weather are complex, influenced by combinations of weather variables and regional language differences. The results highlight the importance of context-sensitive methods for better understanding public mood in response to weather, which can enhance impact-based forecasting and risk communication in the context of climate change.

8.1 SI Mar 19
Measuring ESG Risk in Supply Networks

Rudy Arthur, Guillherme Machado

Environmental, Social and Governance (ESG) rating is a way for investors to prioritise investments in companies with good corporate behaviour. However, ESG ratings are vulnerable to greenwashing in a number of ways. In this paper we study the effect that trade with badly rated companies has on a target company's own rating. To do this we introduce a measurement framework, generalising PageRank and Alpha Centrality, which allows tuning of aggregation and path counting approaches to resist greenwashing and reflect the rater's opinions and preferences for harm accumulation. These metrics allow updating of the target's ESG rating, identification of influential neighbours and assessment of vulnerability of the target to bad behaviour in their supply network. We study these metrics on synthetic ESG interaction networks as well as a real inter-company network and the international trade network.
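As a rough illustration of the kind of centrality the abstract builds on, the sketch below implements plain Alpha Centrality, x = alpha * A^T x + e, on a toy supply chain. It is not the paper's generalised framework: the function name, the three-company network, and the "badness" scores in e are all hypothetical, and the iteration only converges when alpha is smaller than the reciprocal of the largest eigenvalue of the adjacency matrix.

```python
def alpha_centrality(adj, e, alpha=0.5, iters=200):
    """Iterate x <- alpha * A^T x + e.

    adj[i][j] is the weight of the edge i -> j (i supplies j).
    e[i] is an exogenous score, here a made-up ESG "badness" value.
    """
    n = len(adj)
    x = list(e)
    for _ in range(iters):
        x = [
            e[j] + alpha * sum(adj[i][j] * x[i] for i in range(n))
            for j in range(n)
        ]
    return x


# Hypothetical chain: company 0 supplies company 1, which supplies company 2.
adj = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
# Only company 0 is badly rated in this toy example.
badness = [1.0, 0.0, 0.0]

risk = alpha_centrality(adj, badness, alpha=0.5)
print(risk)
```

In this toy chain the target (company 2) inherits a quarter of its indirect supplier's score, since each hop is damped by alpha. Lowering alpha makes distant suppliers matter less.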

5.4 SI Apr 21
Community Detection with the Canonical Ensemble

Rudy Arthur

Network community detection is usually considered as an unsupervised learning problem. Given a network, the aim is to partition it using some general purpose algorithm. In this paper we instead treat community detection as a hypothesis testing problem. Given a network, we examine the evidence for specific community structure in the observed network compared to a null model. To do this we define an appropriate test statistic, analogous to a z-score, and several null models derived from maximising entropy under different constraints in the canonical ensemble. We demonstrate the application of this method on real and synthetic data and contrast our method with Bayesian approaches based on the stochastic block model. We show that this method gives definitive answers to concrete questions, which can be more useful to analysts than the output of a generic algorithm.
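The hypothesis-testing idea can be sketched with a deliberately simple null model. The example below assumes every pair of nodes is linked independently with the same probability p (the maximum-entropy model when only the expected number of edges is constrained), and computes a z-score for the number of edges inside a proposed group. The function name, the toy network, and the choice of statistic are illustrative assumptions, not the paper's definitions.

```python
import math
from itertools import combinations


def community_z_score(n, edges, group):
    """z-score of the internal edge count of `group` under an
    independent-edge null model with uniform edge probability."""
    edge_set = {frozenset(e) for e in edges}
    p = len(edge_set) / math.comb(n, 2)  # null edge probability
    pairs = list(combinations(sorted(group), 2))
    observed = sum(1 for pair in pairs if frozenset(pair) in edge_set)
    expected = p * len(pairs)
    variance = len(pairs) * p * (1 - p)  # sum of Bernoulli variances
    return (observed - expected) / math.sqrt(variance)


# Toy network: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

z_triangle = community_z_score(6, edges, {0, 1, 2})  # a real triangle
z_mixed = community_z_score(6, edges, {0, 3, 5})     # nodes from both sides
print(z_triangle, z_mixed)
```

The triangle scores well above zero while the mixed group scores below zero, which is the sense in which a specific hypothesis about community structure can be tested directly rather than left to a generic partitioning algorithm.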

CL Jul 15, 2023
CIDER: Context sensitive sentiment analysis for short-form text

James C. Young, Rudy Arthur, Hywel T. P. Williams

Researchers commonly perform sentiment analysis on large collections of short texts like tweets, Reddit posts or newspaper headlines that are all focused on a specific topic, theme or event. Usually, general-purpose sentiment analysis methods are used. These perform well on average but miss the variation in meaning that happens across different contexts. For example, the word "active" has a very different intention and valence in the phrase "active lifestyle" versus "active volcano". This work presents a new approach, CIDER (Context Informed Dictionary and sEmantic Reasoner), which performs context-sensitive linguistic analysis, where the valence of sentiment-laden terms is inferred from the whole corpus before being used to score the individual texts. In this paper, we detail the CIDER algorithm and demonstrate that it outperforms state-of-the-art generalist unsupervised sentiment analysis techniques on a large collection of tweets about the weather. CIDER is also applicable to alternative (non-sentiment) linguistic scales. A case study on gender in the UK is presented, with the identification of highly gendered and sentiment-laden days. We have made our implementation of CIDER available as a Python package: https://pypi.org/project/ciderpolarity/.

SI Nov 27, 2017
Scaling laws in geo-located Twitter data

Rudy Arthur, Hywel Williams

We observe and report on a systematic relationship between population density and Twitter use. Number of tweets, number of users and population per unit area are related by power laws, with exponents greater than one, that are consistent with each other and across a range of spatial scales. This implies that population density can accurately predict Twitter activity. Furthermore, this trend can be used to identify "anomalous" areas that deviate from the trend. Analysis of geo-tagged and place-tagged tweets shows that geo-tagged tweets are different with respect to user type and content. Our findings have implications for the spatial analysis of Twitter data and for understanding demographic biases in the Twitter user base.
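A power law y = a * x^b is a straight line in log-log space, so one simple way to estimate the exponent is ordinary least squares on the logarithms. The sketch below does this on synthetic, made-up numbers (not data from the paper) and then flags an area by its log residual from the fitted trend. The function names are hypothetical and the paper's own fitting procedure may differ.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of log y = log a + b * log x; returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum(
        (u - mx) ** 2 for u in lx
    )
    a = math.exp(my - b * mx)
    return a, b


def log_residuals(x, y, a, b):
    """Log of observed over predicted; large values mark anomalous areas."""
    return [math.log(v) - math.log(a * u ** b) for u, v in zip(x, y)]


# Synthetic example: tweets per unit area generated with exponent 1.5.
density = [10.0, 100.0, 1000.0, 10000.0]
tweets = [2.0 * d ** 1.5 for d in density]
a, b = fit_power_law(density, tweets)

# A made-up area with ten times more tweets than the trend predicts.
anomaly = log_residuals([500.0], [10 * 2.0 * 500.0 ** 1.5], a, b)[0]
print(a, b, anomaly)
```

An exponent above one, as reported in the abstract, means activity grows faster than proportionally with density, and a residual near log(10) corresponds to an area about ten times more active than the trend predicts.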

HC Nov 13, 2017
Social Sensing of Floods in the UK

Rudy Arthur, Chris A. Boulton, Humphrey Shotton et al.

"Social sensing" is a form of crowd-sourcing that involves systematic analysis of digital communications to detect real-world events. Here we consider the use of social sensing for observing natural hazards. In particular, we present a case study that uses data from a popular social media platform (Twitter) to detect and locate flood events in the UK. In order to improve data quality we apply a number of filters (timezone, simple text filters and a naive Bayes `relevance' filter) to the data. We then use place names in the user profile and message text to infer the location of the tweets. These two steps remove most of the irrelevant tweets and yield orders of magnitude more located tweets than we have by relying on geo-tagged data. We demonstrate that high resolution social sensing of floods is feasible and we can produce high-quality historical and real-time maps of floods using Twitter.