CL · LG · Oct 1, 2018

Utilizing a Transparency-driven Environment toward Trusted Automatic Genre Classification: A Case Study in Journalism History

arXiv:1810.00968v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for trusted, responsible use of machine learning in real-world research tasks such as journalism history. Its contribution is incremental, since it builds on existing transparency methods.

The paper tackles the problem that black-box machine learning models can achieve high accuracy for the wrong reasons. The authors developed a transparency-driven environment to help non-computer scientists, such as journalism historians, understand and trust automatic genre classification of newspaper articles. A real-world case study shows the users' comprehension increasing gradually.

With the growing abundance of unlabeled data in real-world tasks, researchers have to rely on the predictions given by black-boxed computational models. However, it is an often neglected fact that these models may be scoring high on accuracy for the wrong reasons. In this paper, we present a practical impact analysis of enabling model transparency by various presentation forms. For this purpose, we developed an environment that empowers non-computer scientists to become practicing data scientists in their own research field. We demonstrate the gradually increasing understanding of journalism historians through a real-world use case study on automatic genre classification of newspaper articles. This study is a first step towards trusted usage of machine learning pipelines in a responsible way.

Code Implementations: 1 repo
