Dataset Safety in Autonomous Driving: Requirements, Risks, and Assurance
This work addresses safety assurance for autonomous driving datasets, but its contribution is incremental: it builds on existing guidelines and frameworks rather than introducing new methods or paradigms.
The paper addresses dataset integrity for AI systems in autonomous driving by proposing a structured framework, aligned with ISO/PAS 8800 guidelines, that includes safety analyses and verification strategies to mitigate risks from dataset insufficiencies.
Dataset integrity is fundamental to the safety and reliability of AI systems, especially in autonomous driving. This paper presents a structured framework for developing safe datasets aligned with ISO/PAS 8800 guidelines. Using AI-based perception systems as the primary use case, it introduces the AI Data Flywheel and the dataset lifecycle, covering data collection, annotation, curation, and maintenance. The framework incorporates rigorous safety analyses to identify hazards and mitigate risks caused by dataset insufficiencies. It also defines processes for establishing dataset safety requirements and proposes verification and validation strategies to ensure compliance with safety standards. In addition to outlining best practices, the paper reviews recent research and emerging trends in dataset safety and autonomous vehicle development, providing insights into current challenges and future directions. By integrating these perspectives, the paper aims to advance robust, safety-assured AI systems for autonomous driving applications.