CYLG, Jun 8, 2020

Ethical Considerations and Statistical Analysis of Industry Involvement in Machine Learning Research

arXiv:2006.04541v2, 30 citations
AI Analysis

This work examines ethical and statistical concerns about industry influence in ML research that are relevant to the broader community. Its contribution is incremental: it extends an existing discourse with new data.

The study quantified industry involvement in machine learning research by analyzing nearly 11,000 papers from NeurIPS, CVPR, and ICML over five years. It found that academic-corporate collaborations are growing, that industry publishes on trending topics about two years earlier than academia on average, and that industry papers lag behind academic papers in gender diversity.

Industry involvement in the machine learning (ML) community seems to be increasing. However, the quantitative scale and ethical implications of this influence are largely unknown. To address this, we have not only carried out an informed ethical analysis of the field, but have also inspected all papers from the main ML conferences NeurIPS, CVPR, and ICML of the last 5 years, almost 11,000 papers in total. Our statistical approach focuses on conflicts of interest, innovation, and gender equality. We have obtained four main findings: (1) Academic-corporate collaborations are growing in number. At the same time, we found that conflicts of interest are rarely disclosed. (2) Industry publishes papers about trending ML topics on average two years earlier than academia does. (3) Industry papers do not lag behind academic papers with regard to social impact considerations. (4) Finally, we demonstrate that industry papers fall short of their academic counterparts with respect to gender diversity. We believe that this work is a starting point for an informed debate within and outside of the ML community.

