CR Sep 13, 2021
Scrybe: A Secure Audit Trail for Clinical Trial Data Fusion
Jon Oakley, Carl Worley, Lu Yu et al.
Clinical trials are a multibillion-dollar industry. One of the biggest challenges facing the clinical trial research community is satisfying Part 11 of Title 21 of the Code of Federal Regulations and ISO 27789. These controls provide audit requirements that guarantee the reliability of the data contained in the electronic records. Context-aware smart devices and wearable IoT devices have become increasingly common in clinical trials. Electronic Data Capture (EDC) and Clinical Data Management Systems (CDMS) do not currently address the new challenges introduced by these devices. The healthcare digital threat landscape is continually evolving, and the prevalence of sensor fusion and wearable devices compounds the growing attack surface. We propose Scrybe, a permissioned blockchain, to store proof of clinical trial data provenance. We illustrate how Scrybe addresses each control and the limitations of Ethereum-based blockchains. Finally, we provide a proof-of-concept integration with REDCap to show tamper resistance.
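The tamper resistance mentioned in the abstract can be illustrated with a generic hash-chained audit log. This is only a minimal sketch of the general idea, not Scrybe's actual design or its REDCap integration; the function names, the record fields, and the all-zero genesis hash are assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical starting value for the chain (an assumption, not from the paper).
GENESIS_HASH = "0" * 64


def entry_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log, record):
    """Append a record, linking it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({
        "record": record,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, record),
    })


def verify(log):
    """Recompute every link; any edited record breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"subject": "S-001", "field": "heart_rate", "value": 72})
append(log, {"subject": "S-001", "field": "heart_rate", "value": 75})
print(verify(log))  # True

log[0]["record"]["value"] = 60  # simulate tampering with a stored record
print(verify(log))  # False
```

A permissioned blockchain adds distributed agreement on who may append entries; the sketch covers only the tamper-evidence property of the chained hashes.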
CR Oct 15, 2019
Privacy Preserving Count Statistics
Lu Yu, Oluwakemi Hambolu, Yu Fu et al.
The ability to preserve user privacy and anonymity is important. One of the safest ways to maintain privacy is to avoid storing personally identifiable information (PII), which poses a challenge for maintaining useful user statistics. Probabilistic counting has been used to find the cardinality of a multiset when precise counting is too resource intensive. In this paper, probabilistic counting is used as an anonymization technique that provides a reliable estimate of the number of unique users. We extend previous work in probabilistic counting by considering its use for preserving user anonymity, developing application guidelines, and including hash collisions in the estimate. Our work complements previous methods by exploring the causes of the deviation of the uncorrected estimate from the true value. The experimental results show that, if the proper register size is used, collision compensation provides estimates that are as good as, if not better than, those of the original probabilistic counting. We develop a new anonymity metric to precisely quantify the degree of anonymity the algorithm provides.
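To make the idea of counting unique users without storing identifiers concrete, here is a minimal sketch using linear counting, one well-known probabilistic counting scheme. It is not necessarily the algorithm or the collision compensation used in the paper; the function name `linear_count` and the `num_bits` parameter are assumptions made for illustration.

```python
import hashlib
import math


def linear_count(items, num_bits=1024):
    """Estimate the number of distinct items using only a bitmap.

    Only bit positions are kept, so the identifiers themselves are
    never stored. The estimate -m * ln(zero_fraction) accounts for
    distinct items that collide on the same bit.
    """
    bitmap = [False] * num_bits
    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % num_bits
        bitmap[index] = True

    zero_bits = bitmap.count(False)
    if zero_bits == 0:
        # Every bit is set, so the estimate is undefined at this size.
        raise ValueError("bitmap saturated; use a larger num_bits")
    return -num_bits * math.log(zero_bits / num_bits)


# 500 distinct (hypothetical) user IDs, each seen three times.
visits = [f"user-{i}" for i in range(500)] * 3
print(round(linear_count(visits, num_bits=4096)))
```

Repeated visits by the same user set the same bit, so duplicates do not inflate the estimate. A bitmap that is too small saturates, which is one reason the choice of register size matters.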
CR Sep 22, 2017
Using Markov Models and Statistics to Learn, Extract, Fuse, and Detect Patterns in Raw Data
Richard R. Brooks, Lu Yu, Yu Fu et al.
Many systems are partially stochastic in nature. We have derived data-driven approaches for extracting stochastic state machines (Markov models) directly from observed data. This chapter provides an overview of our approach with numerous practical applications. We have used this approach for inferring shipping patterns, exploiting computer system side-channel information, and detecting botnet activities. For contrast, we include a related data-driven statistical inferencing approach that detects and localizes radiation sources.
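As a rough illustration of extracting a Markov model from observed data, the sketch below estimates first-order transition probabilities by counting adjacent symbol pairs. It is a much simpler stand-in than the approach the chapter describes: it treats each observed symbol as a state and does not infer state structure. The function name `estimate_transitions` is an assumption made for illustration.

```python
from collections import Counter, defaultdict


def estimate_transitions(sequence):
    """Estimate P(next | current) from one observed symbol sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1

    # Normalize each row of counts into probabilities.
    model = {}
    for state, successors in counts.items():
        total = sum(successors.values())
        model[state] = {nxt: c / total for nxt, c in successors.items()}
    return model


model = estimate_transitions("AABABBA")
for state, row in sorted(model.items()):
    print(state, {nxt: round(p, 3) for nxt, p in sorted(row.items())})
# A {'A': 0.333, 'B': 0.667}
# B {'A': 0.667, 'B': 0.333}
```

Once such a model is estimated, a new sequence can be scored against it, which is the general basis for pattern detection with Markov models.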
CR Mar 10, 2017
Provenance Threat Modeling
Oluwakemi Hambolu, Lu Yu, Jon Oakley et al.
Provenance systems are used to capture history metadata; applications include ownership attribution and determining the quality of a particular data set. Provenance systems are also used for debugging, process improvement, understanding data proof of ownership, certification of validity, etc. The provenance of data includes information about the processes and source data that lead to the current representation. In this paper, we study the security risks to which provenance systems might be exposed and recommend security solutions to better protect the provenance information.