CL AI Aug 12, 2022

Is Your Model Sensitive? SPeDaC: A New Benchmark for Detecting and Classifying Sensitive Personal Data

Gaia Gambarelli, Aldo Gangemi, Rocco Tripodi

arXiv:2208.06216v30.316 citations h-index: 57

Originality: Incremental advance

AI Analysis

The paper addresses the problem of personal data protection in applications such as dialogue systems by providing a benchmark for researchers, though the contribution is incremental because it builds on existing Sensitive Information Detection (SID) approaches.

The paper tackles the lack of a shared benchmark for detecting sensitive personal data by introducing SPeDaC, a new annotated resource for English, with results showing transformer models achieving up to 98.20% accuracy on binary classification and 77.63% on fine-grained tasks.

In recent years, there has been an exponential growth of applications, including dialogue systems, that handle sensitive personal information. This has brought to light the extremely important issue of personal data protection in virtual environments. Sensitive Information Detection (SID) has been approached across different domains and languages in the literature. However, in the personal data domain, the absence of a shared benchmark or an available labeled resource makes comparison with the state of the art difficult. We introduce and release SPeDaC, a new annotated resource for the identification of sensitive personal data categories in the English language. SPeDaC enables the evaluation of computational models on three SID subtasks of increasing complexity. SPeDaC 1 is a binary classification task in which a model has to detect whether a sentence contains sensitive information; SPeDaC 2 contains sentences labeled with 5 categories that relate to macro-domains of personal information; and SPeDaC 3 uses fine-grained labeling (61 personal data categories). We conduct an extensive evaluation of the resource using different state-of-the-art classifiers. The results show that SPeDaC is challenging, particularly with regard to fine-grained classification. The transformer models achieve the best results (accuracy: RoBERTa on SPeDaC 1 = 98.20%; DeBERTa on SPeDaC 2 = 95.81% and on SPeDaC 3 = 77.63%).
