Simon D Angus

2 Papers

CL Aug 19, 2024
Paired Completion: Flexible Quantification of Issue-framing at Scale with LLMs

Simon D Angus, Lachlan O'Neill

Detecting issue framing in text (how different perspectives approach the same topic) is valuable for social science and policy analysis, yet challenging for automated methods due to subtle linguistic differences. We introduce 'paired completion', a novel approach using LLM next-token log probabilities to detect contrasting frames using minimal examples. Through extensive evaluation across synthetic datasets and a human-labeled corpus, we demonstrate that paired completion is a cost-efficient, low-bias alternative to both prompt-based and embedding-based methods, offering a scalable solution for analyzing issue framing in large text collections, especially suited to low-resource settings.
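
The abstract does not spell out the scoring rule, so the following is only a minimal sketch of one way a log-probability comparison between two frames could work, not the authors' implementation. It assumes a hypothetical callable `logprob_fn(prompt, completion)` that returns the per-token log probabilities of `completion` given `prompt`; `frame_preference` and `toy_logprob_fn` are illustrative names, and the toy function is a word-overlap stand-in for a real LLM.

```python
import math
from typing import Callable, List

# Hypothetical interface (an assumption, not from the paper): per-token
# log probabilities of `completion` when it follows `prompt`.
LogProbFn = Callable[[str, str], List[float]]


def frame_preference(text: str, frame_a: str, frame_b: str,
                     logprob_fn: LogProbFn) -> float:
    """Return log P(text | frame_a) - log P(text | frame_b).

    A positive value means the text is a more likely continuation of the
    frame A example than of the frame B example.
    """
    logp_a = sum(logprob_fn(frame_a, text))
    logp_b = sum(logprob_fn(frame_b, text))
    return logp_a - logp_b


def toy_logprob_fn(prompt: str, completion: str) -> List[float]:
    """Stand-in for an LLM: words already seen in the prompt score higher."""
    seen = set(prompt.lower().split())
    return [math.log(0.5) if word in seen else math.log(0.05)
            for word in completion.lower().split()]


score = frame_preference(
    text="carbon taxes hurt jobs",
    frame_a="climate policy will hurt jobs and growth",
    frame_b="climate policy protects our future",
    logprob_fn=toy_logprob_fn,
)
print(score > 0)  # True: the text sits closer to the frame A example
```

In practice `toy_logprob_fn` would be replaced by a call to a model that exposes token log probabilities; the comparison itself needs no labeled training data, only one priming example per frame.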

GN Jan 19, 2017
The Internet as Quantitative Social Science Platform: Insights from a Trillion Observations

Klaus Ackermann, Simon D Angus, Paul A Raschky

With the large-scale penetration of the internet, for the first time, humanity has become linked by a single, open communications platform. Harnessing this fact, we report insights arising from a unified internet activity and location dataset of unparalleled scope and accuracy, drawn from over a trillion ($1.5 \times 10^{12}$) observations of end-user internet connections, with a temporal resolution of just 15 min over 2006-2012. We first apply this dataset to the expansion of the internet itself over 1,647 urban agglomerations globally. We find that unique IP per capita counts reach saturation at approximately one IP per three people, and take, on average, 16.1 years to do so, eclipsing the estimated 100- and 60-year saturation times for steam power and electrification respectively. Next, we use intra-diurnal internet activity features to upscale traditional overnight sleep observations, producing the first global estimate of overnight sleep duration in 645 cities over 7 years. We find statistically significant variation between continental, national and regional sleep durations, including some evidence of global sleep duration convergence. Finally, we estimate the relationship between internet concentration and economic outcomes in 411 OECD regions and find that the internet's expansion is associated with negative or positive productivity gains, depending strongly on sectoral considerations. To our knowledge, our study is the first of its kind to use online/offline activity of the entire internet to infer social science insights, demonstrating the unparalleled potential of the internet as a social data-science platform.