CV · LG · Apr 1, 2020

Objects of violence: synthetic data for practical ML in human rights investigations

arXiv:2004.01030v1 (has code)
AI Analysis

This work aims to speed up open-source intelligence (OSINT) research for human rights investigators. Its contribution is incremental: it builds on existing synthetic data and classification methods.

The paper tackles the problem of identifying munitions and military equipment in videos and images for human rights investigations when little training data is available. It introduces a workflow that uses synthetic data to train classifiers, and demonstrates it on two real-world cases: the Triple-Chaser tear gas grenade and military presence in Ukraine.

We introduce a machine learning workflow to search for, identify, and meaningfully triage videos and images of munitions, weapons, and military equipment, even when limited training data exists for the object of interest. This workflow is designed to expedite the work of OSINT ("open source intelligence") researchers in human rights investigations. It consists of three components: automatic rendering and annotating of synthetic datasets that make up for a lack of training data; training image classifiers from combined sets of photographic and synthetic data; and mtriage, an open source software that orchestrates these classifiers' deployment to triage public domain media, and visualise predictions in a web interface. We show that synthetic data helps to train classifiers more effectively, and that certain approaches yield better results for different architectures. We then demonstrate our workflow in two real-world human rights investigations: the use of the Triple-Chaser tear gas grenade against civilians, and the verification of allegations of military presence in Ukraine in 2014.
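The second component of the workflow, training on combined sets of photographic and synthetic data, can be illustrated with a minimal sketch. It is not the authors' code and does not use mtriage. The `Sample` record, the `combine_datasets` helper, and the `synthetic_fraction` parameter are hypothetical names introduced here, and the mixing rule (keep every photograph, subsample renders to a target share) is an assumption, not the paper's stated method.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    path: str        # hypothetical image path
    label: str       # class label, e.g. "triple_chaser" or "background"
    synthetic: bool  # True if the image was rendered rather than photographed


def combine_datasets(photos, renders, synthetic_fraction, seed=0):
    """Mix photographic and synthetic samples into one shuffled training list.

    Every photograph is kept. Renders are subsampled so that they make up
    roughly `synthetic_fraction` of the combined set, capped by how many
    renders are available.
    """
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_syn / (n_syn + n_photo) = synthetic_fraction for n_syn.
    wanted = round(len(photos) * synthetic_fraction / (1.0 - synthetic_fraction))
    n_syn = min(wanted, len(renders))
    combined = list(photos) + rng.sample(list(renders), n_syn)
    rng.shuffle(combined)
    return combined


# Illustrative data only: 20 photographs and 200 rendered images.
photos = [Sample(f"photo_{i}.jpg", "triple_chaser", False) for i in range(20)]
renders = [Sample(f"render_{i}.png", "triple_chaser", True) for i in range(200)]
train_set = combine_datasets(photos, renders, synthetic_fraction=0.75)
```

Varying `synthetic_fraction` is one simple way to compare how different mixes of photographic and synthetic data affect a given classifier architecture.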

