CR GN Aug 14, 2021

Privacy-Preserving Identification of Target Patients from Outsourced Patient Data

arXiv:2108.06505v2
AI Analysis

This work addresses data privacy concerns for hospitals and data owners that outsource patient data to cloud providers, offering a novel solution for secure analytics in healthcare, though it is incremental in that it combines encryption and search techniques.

The paper tackles the problem of identifying target patients from encrypted multi-tenant phenotype data outsourced to cloud service providers, enabling efficient and privacy-preserving selection for tasks like genome-wide association studies, with experimental validation on a real-life genomic dataset showing effective group identification.

With the increasing affordability and availability of patient data, hospitals tend to outsource their data to cloud service providers (CSPs) for storage and analytics. However, data privacy concerns significantly limit the data owners' choices. In this work, we propose the first solution, to the best of our knowledge, that allows a CSP to efficiently identify target patients (e.g., as pre-processing for a genome-wide association study, GWAS) over multi-tenant encrypted phenotype data (owned by multiple hospitals or data owners). We first propose an encryption mechanism for phenotype data in which each data owner encrypts its data with a unique secret key. Moreover, the ciphertext supports privacy-preserving search and, consequently, enables the selection of the target group of patients (e.g., case and control groups). In addition, we provide a per-query authorization mechanism for a client to access and operate on the data stored at the CSP. Based on the identified patients, the proposed scheme can either (i) directly conduct GWAS (i.e., computation of statistics about genomic variants) at the CSP or (ii) provide the identified groups to the client, which then directly queries the corresponding data owners and conducts GWAS using existing distributed solutions. We implement the proposed scheme and run experiments over a real-life genomic dataset to show its effectiveness. The results show that the proposed solution can efficiently identify the case/control groups in a privacy-preserving way.
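To make the workflow in the abstract concrete, here is a minimal sketch of multi-tenant search with per-query authorization. It is not the paper's construction: the abstract does not specify the encryption scheme, so this example substitutes deterministic HMAC-SHA256 tags, which leak equality patterns and would not meet the paper's security goals. All names (`DataOwner`, `CSP`, `token`, `authorize`, `identify`) and the phenotype attributes are hypothetical, chosen only for illustration.

```python
import hashlib
import hmac
import secrets


def token(key: bytes, attribute: str, value: str) -> bytes:
    """Keyed tag for one phenotype attribute/value pair (assumed stand-in
    for the paper's searchable ciphertext)."""
    message = f"{attribute}={value}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


class DataOwner:
    """A hospital that protects its records under its own secret key."""

    def __init__(self, name: str):
        self.name = name
        self.key = secrets.token_bytes(32)  # unique per data owner

    def encrypt_record(self, patient_id: str, phenotypes: dict) -> dict:
        # Only keyed tags are uploaded; patient_id is assumed to be a pseudonym.
        tags = {token(self.key, a, v) for a, v in phenotypes.items()}
        return {"owner": self.name, "patient": patient_id, "tags": tags}

    def authorize(self, query: dict) -> list:
        # Per-query authorization: trapdoors cover only this query's predicates.
        return [token(self.key, a, v) for a, v in query.items()]


class CSP:
    """Cloud provider that stores tagged records and matches trapdoors."""

    def __init__(self):
        self.records = []

    def upload(self, record: dict) -> None:
        self.records.append(record)

    def identify(self, trapdoors_by_owner: dict) -> list:
        matches = []
        for record in self.records:
            trapdoors = trapdoors_by_owner.get(record["owner"])
            # A record matches only if its owner authorized the query
            # and every trapdoor appears among the record's tags.
            if trapdoors and all(t in record["tags"] for t in trapdoors):
                matches.append((record["owner"], record["patient"]))
        return matches


hospital_a, hospital_b = DataOwner("A"), DataOwner("B")
csp = CSP()
csp.upload(hospital_a.encrypt_record("a1", {"diabetes": "yes", "sex": "F"}))
csp.upload(hospital_a.encrypt_record("a2", {"diabetes": "no", "sex": "F"}))
csp.upload(hospital_b.encrypt_record("b1", {"diabetes": "yes", "sex": "M"}))

case_query = {"diabetes": "yes"}
trapdoors = {o.name: o.authorize(case_query) for o in (hospital_a, hospital_b)}
case_group = csp.identify(trapdoors)  # pseudonyms of patients in the case group
```

In this sketch the CSP never sees a secret key or a plaintext phenotype value, and a trapdoor issued by one owner does not match another owner's records because each owner's tags are computed under a different key. A control group would be selected the same way with a second authorized query, such as `{"diabetes": "no"}`.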
