Stop the Open Data Bus, We Want to Get Off
This work reveals privacy risks in open transport data, particularly for vulnerable populations. Its contribution is incremental: it applies known re-identification methods to a new dataset.
The study addresses individual re-identification in the publicly released Myki public transport dataset. It demonstrates that the researchers could easily re-identify themselves, their co-travellers, and strangers, and it highlights the resulting risks to vulnerable groups.
The subject of this report is the re-identification of individuals in the Myki public transport dataset released as part of the Melbourne Datathon 2018. We demonstrate the ease with which we were able to re-identify ourselves, our co-travellers, and complete strangers; our analysis raises concerns about the nature and granularity of the data released, in particular the ability to identify vulnerable or sensitive groups.