Do You See What I See? Capabilities and Limits of Automated Multimedia Content Analysis
This work addresses the need to understand automated content analysis tools in order to improve content moderation while protecting free expression and privacy, a need that became more pressing during the COVID-19 pandemic.
The paper examines the capabilities and limitations of automated tools, such as matching and predictive models, for analyzing online multimedia content, and highlights the risks of deploying them at scale without accounting for these limitations.
The ever-increasing amount of user-generated content online has led, in recent years, to an expansion in research and investment in automated content analysis tools. Scrutiny of automated content analysis has accelerated during the COVID-19 pandemic, as social networking services have placed a greater reliance on these tools due to concerns about health risks to their moderation staff from in-person work. At the same time, there are important policy debates around the world about how to improve content moderation while protecting free expression and privacy. To advance these debates, we need to understand the potential role of automated content analysis tools. This paper explains the capabilities and limitations of tools for analyzing online multimedia content and highlights the potential risks of using these tools at scale without accounting for their limitations. It focuses on two main categories of tools: matching models and predictive models. Matching models include cryptographic and perceptual hashing, which compare user-generated content with existing and known content. Predictive models (including computer vision and computer audition) are machine learning techniques that aim to identify characteristics of new or previously unknown content.
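
The difference between the two kinds of matching can be illustrated with a minimal sketch. This is not the method of any tool discussed in the paper: the 4x4 grayscale grid, the function names, and the simple "average hash" are all assumptions chosen for illustration, and production perceptual hashes are considerably more elaborate. The sketch only shows that a cryptographic hash treats a slightly altered copy as unrelated content, while a perceptual hash can still place it close to the original.

```python
import hashlib


def cryptographic_hash(pixels):
    """SHA-256 over the raw pixel bytes; any change gives an unrelated digest."""
    data = bytes(value for row in pixels for value in row)
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels):
    """Toy perceptual hash (illustrative only): one bit per pixel,
    set when the pixel is brighter than the image's mean brightness."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Hypothetical 4x4 grayscale "image": dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]

# A slightly brightened copy, standing in for a re-encoded or filtered upload.
altered = [[min(value + 3, 255) for value in row] for row in original]

# Exact matching: the digests differ, so the altered copy is not recognized.
crypto_match = cryptographic_hash(original) == cryptographic_hash(altered)

# Perceptual matching: a small Hamming distance suggests the same visual content.
perceptual_distance = hamming_distance(average_hash(original), average_hash(altered))

print("cryptographic match:", crypto_match)
print("perceptual distance:", perceptual_distance)
```

In this toy setup a deployment would flag a match when the perceptual distance falls below some chosen threshold. Where that threshold sits trades missed matches against false matches, which is one of the limitations that matter when such tools are used at scale.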