CR CY DS LG ML

Sep 6, 2022

Classification Protocols with Minimal Disclosure

Jinshuo Dong, Jason Hartline, Aravindan Vijayaraghavan

arXiv:2209.02690v15.24 citations

h-index: 45

Originality: Incremental advance

AI Analysis

The paper addresses privacy and efficiency in legal document discovery, but the advance is incremental: it builds on existing classification frameworks under specific assumptions.

The paper tackles the problem of multi-party classification protocols for applications like e-discovery, ensuring the requesting party receives all responsive documents while the sending party discloses minimal non-responsive documents. It shows that this protocol can be embedded in a machine learning framework, making it equivalent to a standard one-party classification problem under certain conditions, with formal guarantees for linear classifiers.

We consider multi-party protocols for classification that are motivated by applications such as e-discovery in court proceedings. We identify a protocol that guarantees that the requesting party receives all responsive documents and the sending party discloses the minimal amount of non-responsive documents necessary to prove that all responsive documents have been received. This protocol can be embedded in a machine learning framework that enables automated labeling of points and the resulting multi-party protocol is equivalent to the standard one-party classification problem (if the one-party classification problem satisfies a natural independence-of-irrelevant-alternatives property). Our formal guarantees focus on the case where there is a linear classifier that correctly partitions the documents.
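To make the last sentence of the abstract concrete, the following is a minimal sketch of the setting in which a linear classifier correctly partitions the documents. It is not the paper's protocol: it uses the classic perceptron rule as a stand-in, and the two-dimensional document features, the labels (+1 for responsive, -1 for non-responsive), and the names `perceptron` and `classify` are all illustrative assumptions.

```python
# Illustrative only: the perceptron rule is a stand-in, not the paper's protocol.
# Labels: +1 = responsive, -1 = non-responsive (an assumed encoding).

def perceptron(points, labels, max_epochs=1000):
    """Return (w, b) that separates the points, or None if none is found
    within max_epochs passes over the data."""
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified or on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: data is partitioned
            return w, b
    return None


def classify(w, b, x):
    """Label a point with the linear rule sign(w . x + b)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical document features; any real feature map is outside this sketch.
docs = [(2.0, 1.0), (3.0, 2.5), (-1.0, -0.5), (-2.0, -2.0)]
labels = [1, 1, -1, -1]
model = perceptron(docs, labels)
```

When `model` is not `None`, every document in `docs` is labeled correctly by `classify`, which is the situation the formal guarantees assume. Which non-responsive documents must be disclosed is specified by the paper's protocol and is not modeled here.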
