Props for Machine-Learning Security
Props address the systemic bottleneck of data scarcity in ML development, potentially benefiting researchers and practitioners by allowing safe use of sensitive or deep-web data.
The paper tackles the problem of limited high-quality training data in machine learning by proposing 'props' (protected pipelines) for authenticated, privacy-preserving access to deep-web data, enabling secure use of vast data sources and privacy-preserving inference.
We propose protected pipelines, or props for short, a new approach for authenticated, privacy-preserving access to deep-web data for machine learning (ML). By permitting secure use of vast sources of deep-web data, props address the systemic bottleneck of limited high-quality training data in ML development. Props also enable privacy-preserving and trustworthy forms of inference, allowing for safe use of sensitive data in ML applications. Props are practically realizable today by leveraging privacy-preserving oracle systems initially developed for blockchain applications.