CL · AI · Apr 5, 2023

The Saudi Privacy Policy Dataset

arXiv:2304.02757v1 · 5 citations · h-index: 13
Originality: Synthesis-oriented
AI Analysis

The dataset gives researchers, policymakers, and professionals a resource for analyzing privacy policies and promoting compliance in Saudi Arabia. The contribution is incremental, since it applies existing annotation methods to new data.

The paper introduces the Saudi Privacy Policy Dataset, a collection of 1,000 Arabic privacy policies from Saudi Arabia annotated according to the Personal Data Protection Law, with 4,638 lines of text and 775,370 tokens, to support compliance assessment and automated monitoring tools.

This paper introduces the Saudi Privacy Policy Dataset, a diverse compilation of Arabic privacy policies from various sectors in Saudi Arabia, annotated according to the 10 principles of the Personal Data Protection Law (PDPL). The PDPL was established to be compatible with the General Data Protection Regulation (GDPR), one of the most comprehensive data regulations worldwide. Data were collected from multiple sources, including the Saudi Central Bank, the Saudi Arabia National United Platform, the Council of Health Insurance, and general websites found through Google and Wikipedia. The final dataset includes 1,000 websites belonging to 7 sectors, 4,638 lines of text, 775,370 tokens, and a corpus size of 8,353 KB. The annotated dataset offers significant reuse potential for assessing privacy policy compliance, benchmarking privacy practices across industries, and developing automated tools for monitoring adherence to data protection regulations. By providing a comprehensive, annotated dataset of privacy policies, this paper aims to facilitate further research and development in privacy policy analysis, natural language processing, and machine learning applications related to privacy and data protection. It also serves as a resource for researchers, policymakers, and industry professionals interested in understanding and promoting compliance with privacy regulations in Saudi Arabia.
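The size figures quoted in the abstract (websites, lines, tokens, kilobytes) are the kind of summary that can be computed for any collection of policy texts. The sketch below is illustrative only: the abstract does not say how lines or tokens were counted, so non-empty-line counting, whitespace tokenization, and UTF-8 byte size are all assumptions, and `CorpusStats` and `corpus_stats` are hypothetical names that do not come from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass
class CorpusStats:
    documents: int
    lines: int
    tokens: int
    size_kb: float


def corpus_stats(policies: list[str]) -> CorpusStats:
    """Summarise a list of policy texts (hypothetical helper, not the authors' code)."""
    lines = 0
    tokens = 0
    size_bytes = 0
    for text in policies:
        # Assumption: only non-empty lines count as lines of text.
        lines += sum(1 for line in text.splitlines() if line.strip())
        # Assumption: tokens are whitespace-separated strings.
        tokens += len(text.split())
        # Assumption: corpus size is measured as UTF-8 encoded bytes.
        size_bytes += len(text.encode("utf-8"))
    return CorpusStats(len(policies), lines, tokens, size_bytes / 1024)


# Two tiny made-up policy snippets, used only to exercise the function.
sample = ["نجمع بياناتك الشخصية\nلا نشارك بياناتك", "سياسة الخصوصية"]
stats = corpus_stats(sample)
print(stats)
```

A different tokenizer, such as one that splits Arabic clitics, would give different token counts, so figures produced this way are not expected to match the reported 775,370 tokens exactly.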

Code Implementations: 1 repo