Data-Driven Investigative Journalism For Connectas Dataset
This work addresses corruption detection for investigative journalism in Colombia, but it appears incremental as it applies standard ML methods to a new dataset.
The authors tackle the problem of detecting government corruption and malpractice in Colombia by applying machine learning algorithms to a dataset of government contracts from 2007 to 2012 and building anomaly detection models on it.
The paper explores the possibility of using Machine Learning algorithms to detect cases of corruption and malpractice by governments. The dataset used by the authors contains information about government contracts in Colombia from 2007 to 2012. The authors begin by exploring and cleaning the data, then perform feature engineering, and finally implement Machine Learning models to detect anomalies in the dataset.
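For readers unfamiliar with this kind of pipeline, the three stages summarized above (cleaning, feature engineering, anomaly detection) can be sketched roughly as follows. This is only an illustration and not the authors' implementation: the contract records, the field names (`value`, `duration_days`), the engineered feature, and the use of a median/MAD modified z-score as the detector are all assumptions made here for the sake of a small, self-contained example.

```python
import math
import statistics

# Hypothetical contract records; ids, fields, and amounts are invented for illustration.
contracts = [
    {"id": "C1", "value": 120_000, "duration_days": 90},
    {"id": "C2", "value": 95_000, "duration_days": 60},
    {"id": "C3", "value": None, "duration_days": 30},       # missing value, dropped in cleaning
    {"id": "C4", "value": 110_000, "duration_days": 75},
    {"id": "C5", "value": 9_800_000, "duration_days": 10},  # very large amount, very short term
    {"id": "C6", "value": 130_000, "duration_days": 100},
    {"id": "C7", "value": 105_000, "duration_days": 80},
]


def clean(records):
    """Cleaning: keep only records with a positive value and a positive duration."""
    return [
        r for r in records
        if r["value"] is not None and r["duration_days"] is not None
        and r["value"] > 0 and r["duration_days"] > 0
    ]


def add_features(records):
    """Feature engineering: add the log of the contract value per day."""
    return [
        {**r, "log_value_per_day": math.log(r["value"] / r["duration_days"])}
        for r in records
    ]


def flag_anomalies(records, feature, threshold=3.5):
    """Anomaly detection (stand-in): flag ids whose modified z-score exceeds the threshold."""
    xs = [r[feature] for r in records]
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    if mad == 0:
        return []  # no spread in the data, nothing can be flagged
    return [
        r["id"] for r in records
        if 0.6745 * abs(r[feature] - med) / mad > threshold
    ]


flagged = flag_anomalies(add_features(clean(contracts)), "log_value_per_day")
print(flagged)  # prints ['C5']
```

A flagged contract in such a setup is only a lead for a journalist to examine, not evidence of corruption in itself; the threshold of 3.5 is a common rule of thumb for the modified z-score and would need tuning on real data.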