Jacob Ritchie

CL
3papers
71citations
Novelty52%
AI Score24

3 Papers

CYAug 13, 2020
Analyzing Who and What Appears in a Decade of US Cable TV News

James Hong, Will Crichton, Haotian Zhang et al.

Cable TV news reaches millions of U.S. households each day, meaning that decisions about who appears on the news and what stories get covered can profoundly influence public opinion and discourse. We analyze a data set of nearly 24/7 video, audio, and text captions from three U.S. cable TV networks (CNN, FOX, and MSNBC) from January 2010 to July 2019. Using machine learning tools, we detect faces in 244,038 hours of video, label each face's presented gender, identify prominent public figures, and align text captions to audio. We use these labels to perform screen time and word frequency analyses. For example, we find that overall, much more screen time is given to male-presenting individuals than to female-presenting individuals (2.4x in 2010 and 1.9x in 2019). We present an interactive web-based tool, accessible at https://tvnews.stanford.edu, that allows the general public to perform their own analyses on the full cable TV news data set.

HCAug 1, 2019
Common Fate for Animated Transitions in Visualization

Amira Chalbi, Jacob Ritchie, Deokgun Park et al.

The Law of Common Fate from Gestalt psychology states that visual objects moving with the same velocity along parallel trajectories will be perceived by a human observer as grouped. However, the concept of common fate is much broader than mere velocity; in this paper we explore how common fate results from coordinated changes in luminance and size. We present results from a crowdsourced graphical perception study where we asked workers to make perceptual judgments on a series of trials involving four graphical objects under the influence of conflicting static and dynamic visual factors (position, size and luminance) used in conjunction. Our results yield the following rankings for visual grouping: motion > (dynamic luminance, size, luminance); dynamic size > (dynamic luminance, position); and dynamic luminance > size. We also conducted a follow-up experiment to evaluate the three dynamic visual factors in a more ecologically valid setting, using both a Gapminder-like animated scatterplot and a thematic map of election data. The results indicate that in practice the relative grouping strengths of these factors may depend on various parameters including the visualization characteristics and the underlying data. We discuss design implications for animated transitions in data visualization.

CLJul 4, 2019
Transfer Learning for Risk Classification of Social Media Posts: Model Evaluation Study

Derek Howard, Marta Maslej, Justin Lee et al.

Mental illness affects a significant portion of the worldwide population. Online mental health forums can provide a supportive environment for those afflicted and also generate a large amount of data which can be mined to predict mental health states using machine learning methods. We benchmark multiple methods of text feature representation for social media posts and compare their downstream use with automated machine learning (AutoML) tools to triage content for moderator attention. We used 1588 labeled posts from the CLPsych 2017 shared task collected from the Reachout.com forum (Milne et al., 2019). Posts were represented using lexicon based tools including VADER, Empath, LIWC and also used pre-trained artificial neural network models including DeepMoji, Universal Sentence Encoder, and GPT-1. We used TPOT and auto-sklearn as AutoML tools to generate classifiers to triage the posts. The top-performing system used features derived from the GPT-1 model, which was finetuned on over 150,000 unlabeled posts from Reachout.com. Our top system had a macro averaged F1 score of 0.572, providing a new state-of-the-art result on the CLPsych 2017 task. This was achieved without additional information from meta-data or preceding posts. Error analyses revealed that this top system often misses expressions of hopelessness. We additionally present visualizations that aid understanding of the learned classifiers. We show that transfer learning is an effective strategy for predicting risk with relatively little labeled data. We note that finetuning of pretrained language models provides further gains when large amounts of unlabeled text is available.