George Kollios

CR
3papers
45citations
Novelty55%
AI Score24

3 Papers

CRJun 5, 2017
$\mathcal{E}\text{psolute}$: Efficiently Querying Databases While Providing Differential Privacy

Dmytro Bogatov, Georgios Kellaris, George Kollios et al.

As organizations struggle with processing vast amounts of information, outsourcing sensitive data to third parties becomes a necessity. To protect the data, various cryptographic techniques are used in outsourced database systems to ensure data privacy, while allowing efficient querying. A rich collection of attacks on such systems has emerged. Even with strong cryptography, just communication volume or access pattern is enough for an adversary to succeed. In this work we present a model for differentially private outsourced database system and a concrete construction, $\mathcal{E}\text{psolute}$, that provably conceals the aforementioned leakages, while remaining efficient and scalable. In our solution, differential privacy is preserved at the record level even against an untrusted server that controls data and queries. $\mathcal{E}\text{psolute}$ combines Oblivious RAM and differentially private sanitizers to create a generic and efficient construction. We go further and present a set of improvements to bring the solution to efficiency and practicality necessary for real-world adoption. We describe the way to parallelize the operations, minimize the amount of noise, and reduce the number of network requests, while preserving the privacy guarantees. We have run an extensive set of experiments, dozens of servers processing up to 10 million records, and compiled a detailed result analysis proving the efficiency and scalability of our solution. While providing strong security and privacy guarantees we are less than an order of magnitude slower than range query execution of a non-secure plain-text optimized RDBMS like MySQL and PostgreSQL.

DBFeb 7, 2016
NED: An Inter-Graph Node Metric Based On Edit Distance

Haohan Zhu, Xianrui Meng, George Kollios

Node similarity is a fundamental problem in graph analytics. However, node similarity between nodes in different graphs (inter-graph nodes) has not received a lot of attention yet. The inter-graph node similarity is important in learning a new graph based on the knowledge of an existing graph (transfer learning on graphs) and has applications in biological, communication, and social networks. In this paper, we propose a novel distance function for measuring inter-graph node similarity with edit distance, called NED. In NED, two nodes are compared according to their local neighborhood structures which are represented as unordered k-adjacent trees, without relying on labels or other assumptions. Since the computation problem of tree edit distance on unordered trees is NP-Complete, we propose a modified tree edit distance, called TED*, for comparing neighborhood trees. TED* is a metric distance, as the original tree edit distance, but more importantly, TED* is polynomially computable. As a metric distance, NED admits efficient indexing, provides interpretable results, and shows to perform better than existing approaches on a number of data analysis tasks, including graph de-anonymization. Finally, the efficiency and effectiveness of NED are empirically demonstrated using real-world graphs.

CROct 17, 2015
Top-k Query Processing on Encrypted Databases with Strong Security Guarantees

Xianrui Meng, Haohan Zhu, George Kollios

Privacy concerns in outsourced cloud databases have become more and more important recently and many efficient and scalable query processing methods over encrypted data have been proposed. However, there is very limited work on how to securely process top-k ranking queries over encrypted databases in the cloud. In this paper, we focus exactly on this problem: secure and efficient processing of top-k queries over outsourced databases. In particular, we propose the first efficient and provable secure top-k query processing construction that achieves adaptively CQA security. We develop an encrypted data structure called EHL and describe several secure sub-protocols under our security model to answer top-k queries. Furthermore, we optimize our query algorithms for both space and time efficiency. Finally, in the experiments, we empirically analyze our protocol using real world datasets and demonstrate that our construction is efficient and practical.