A Deployment Audit of Release-Side Risk in Conformal Triage under Prevalence Shift
For safety-critical applications like medical triage, this audit addresses the overlooked risk of releasing event-positive patients without review, which standard marginal coverage summaries miss.
The paper introduces a leakage-aware deployment audit for conformal triage under prevalence shift, showing that a lower review rate can be misleading when it is achieved by releasing more patients, some of whom are event-positive. Applied to a retrospective NSCLC pilot, the audit shows that the pilot has too few event labels to certify safe low-review release.
Conformal triage converts predictive scores into deployment actions that either release a case, flag it for urgent attention, or defer it to human review. Under prevalence shift, however, the usual summaries of marginal coverage and human-review rate can miss the safety-critical question of whether patients who truly experience the target event are released without review. To address this gap, we introduce a leakage-aware deployment audit for release-side conformal triage. It first assigns target subjects to three non-overlapping roles: prevalence correction, conformal calibration, and held-out release-safety evaluation. This separation then lets the audit evaluate release directly: how many event-positive patients are cleared without review, whether the pilot has enough event labels for calibration, and how the safety-review trade-off shifts. Applying this audit to a retrospective NSCLC pilot shows why lower review can be misleading: after prevalence correction, the pooled conformal branch lowers review by releasing more patients, some of whom are event-positive. Within the audit, the classwise branch acts as a scarcity diagnostic: the pilot has too few event labels to certify safe low-review release.
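The bookkeeping behind the audit can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `split_roles` and `release_audit`, the action labels, and the toy data are hypothetical, and the sketch assumes that the review rate counts only cases deferred to human review (not cases flagged as urgent). It shows two things described above: assigning target subjects to three non-overlapping roles, and counting event-positive patients released without review on the held-out evaluation role.

```python
import random

# Hypothetical action labels for the three triage outcomes.
RELEASE, REVIEW, URGENT = "release", "review", "urgent"


def split_roles(subject_ids, n_prevalence, n_calibration, seed=0):
    """Assign each subject to exactly one of three disjoint roles.

    The remaining subjects after the first two roles form the
    held-out release-safety evaluation set.
    """
    ids = list(subject_ids)
    if n_prevalence + n_calibration > len(ids):
        raise ValueError("role sizes exceed the number of subjects")
    random.Random(seed).shuffle(ids)
    cut = n_prevalence + n_calibration
    return {
        "prevalence": ids[:n_prevalence],
        "calibration": ids[n_prevalence:cut],
        "evaluation": ids[cut:],
    }


def release_audit(actions, events):
    """Summarize release-side risk on the evaluation role.

    actions: one triage action per subject.
    events:  1 if the subject experienced the target event, else 0.
    """
    if len(actions) != len(events):
        raise ValueError("actions and events must have the same length")
    n = len(actions)
    n_events = sum(events)
    released_events = sum(
        1 for a, y in zip(actions, events) if a == RELEASE and y == 1
    )
    n_review = sum(1 for a in actions if a == REVIEW)
    return {
        "review_rate": n_review / n if n else None,
        "n_events": n_events,
        "released_events": released_events,
        # Undefined when the evaluation role contains no events.
        "event_release_rate": released_events / n_events if n_events else None,
    }


# Toy data, invented purely for illustration.
roles = split_roles(range(12), n_prevalence=4, n_calibration=4)
toy_actions = [RELEASE, REVIEW, RELEASE, URGENT, REVIEW, RELEASE]
toy_events = [0, 1, 1, 0, 0, 0]
summary = release_audit(toy_actions, toy_events)
```

In the toy data, one of the two event-positive subjects is released, so `summary["event_release_rate"]` is 0.5 even though only a third of cases go to review. Reporting `n_events` alongside the rate makes label scarcity visible: with very few events in the evaluation role, the release rate among event-positive patients cannot be estimated with useful precision.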