George Danezis

27papers

3,892citations

Novelty55%

AI Score30

Ranked #145,426 of 205,806 authors (top 71%)#4,042 in CR (top 55%)

27 Papers

CRJan 14, 2022Code

Zef: Low-latency, Scalable, Private Payments

Mathieu Baudet, Alberto Sonnino, Mahimna Kelkar et al.

We introduce Zef, the first Byzantine-Fault Tolerant (BFT) protocol to support payments in anonymous digital coins at arbitrary scale. Zef follows the communication and security model of FastPay: both protocols are asynchronous, low-latency, linearly-scalable, and powered by partially-trusted sharded authorities. Zef further introduces opaque coins represented as off-chain certificates that are bound to user accounts. In order to hide the face values of coins when a payment operation consumes or creates them, Zef uses random commitments and NIZK proofs. Created coins are made unlinkable using the blind and randomizable threshold anonymous credentials of Coconut. To control storage costs associated with coin replay prevention, Zef accounts are designed so that data can be safely removed once an account is deactivated. Besides the specifications and a detailed analysis of the protocol, we are making available an open-source implementation of Zef in Rust. Our extensive benchmarks on AWS confirm textbook linear scalability and demonstrate a confirmation time under one second at nominal capacity. Compared to existing anonymous payment systems based on a blockchain, this represents a latency speedup of three orders of magnitude, with no theoretical limit on throughput.

CRMay 25, 2021

Narwhal and Tusk: A DAG-based Mempool and Efficient BFT Consensus

George Danezis, Eleftherios Kokoris Kogias, Alberto Sonnino et al.

We propose separating the task of reliable transaction dissemination from transaction ordering, to enable high-performance Byzantine fault-tolerant quorum-based consensus. We design and evaluate a mempool protocol, Narwhal, specializing in high-throughput reliable dissemination and storage of causal histories of transactions. Narwhal tolerates an asynchronous network and maintains high performance despite failures. Narwhal is designed to easily scale-out using multiple workers at each validator, and we demonstrate that there is no foreseeable limit to the throughput we can achieve. Composing Narwhal with a partially synchronous consensus protocol (Narwhal-HotStuff) yields significantly better throughput even in the presence of faults or intermittent loss of liveness due to asynchrony. However, loss of liveness can result in higher latency. To achieve overall good performance when faults occur we design Tusk, a zero-message overhead asynchronous consensus protocol, to work with Narwhal. We demonstrate its high performance under a variety of configurations and faults. As a summary of results, on a WAN, Narwhal-Hotstuff achieves over 130,000 tx/sec at less than 2-sec latency compared with 1,800 tx/sec at 1-sec latency for Hotstuff. Additional workers increase throughput linearly to 600,000 tx/sec without any latency increase. Tusk achieves 160,000 tx/sec with about 3 seconds latency. Under faults, both protocols maintain high throughput, but Narwhal-HotStuff suffers from increased latency.

CRMar 25, 2020

FastPay: High-Performance Byzantine Fault Tolerant Settlement

Mathieu Baudet, George Danezis, Alberto Sonnino

FastPay allows a set of distributed authorities, some of which are Byzantine, to maintain a high-integrity and availability settlement system for pre-funded payments. It can be used to settle payments in a native unit of value (crypto-currency), or as a financial side-infrastructure to support retail payments in fiat currencies. FastPay is based on Byzantine Consistent Broadcast as its core primitive, foregoing the expenses of full atomic commit channels (consensus). The resulting system has low-latency for both confirmation and payment finality. Remarkably, each authority can be sharded across many machines to allow unbounded horizontal scalability. Our experiments demonstrate intra-continental confirmation latency of less than 100ms, making FastPay applicable to point of sale payments. In laboratory environments, we achieve over 80,000 transactions per second with 20 authorities---surpassing the requirements of current retail card payment networks, while significantly increasing their robustness.

CRJun 28, 2019

SybilQuorum: Open Distributed Ledgers Through Trust Networks

Alberto Sonnino, George Danezis

The Sybil attack plagues all peer-to-peer systems, and modern open distributed ledgers employ a number of tactics to prevent it from proof of work, or other resources such as space, stake or memory, to traditional admission control in permissioned settings. With SybilQuorum we propose an alternative approach to securing an open distributed ledger against Sybil attacks, and ensuring consensus amongst honest participants, leveraging social network based Sybil defences. We show how nodes expressing their trust relationships through the ledger can bootstrap and operate a value system, and general transaction system, and how Sybil attacks are thwarted. We empirically evaluate our system as a secure Federated Byzantine Agreement System, and extend the theory of those systems to do so.

CRJan 31, 2019

Replay Attacks and Defenses Against Cross-shard Consensus in Sharded Distributed Ledgers

Alberto Sonnino, Shehar Bano, Mustafa Al-Bassam et al.

We present a family of replay attacks against sharded distributed ledgers, that target cross-shard consensus protocols, such as the recently proposed Chainspace and Omniledger. They allow an attacker, with network access only, to double-spend or lock resources with minimal efforts. The attacker can act independently without colluding with any nodes, and succeed even if all nodes are honest; most of the attacks can also exhibit themselves as faults under periods of asynchrony. These attacks are effective against both shard-led and client-led cross-shard consensus approaches. Finally, we present Byzcuit - a new cross-shard consensus protocol that is immune to those attacks. We implement a prototype of Byzcuit and evaluate it on a real cloud-based testbed, showing that our defenses impact performance minimally, and overall performance surpasses previous works.

CRSep 5, 2018

Blockmania: from Block DAGs to Consensus

George Danezis, David Hrycyszyn

Blockmania is a byzantine consensus protocol. Nodes emit blocks forming a directed acyclic graph (block DAG) that is subsequently interpreted by each node separately to ensure consensus with safety, liveness and finality. The resulting system has communication complexity $O(N^2)$ even in the worse case, and very low constant factors --- as compared to $O(N^4)$ for PBFT; it is leaderless; and network operations do not depend on the composition of the quorum or node stake. This makes Blockmania very efficient (leading to over 400K transactions per second on a wide area network), and ideal for dynamic membership and flexible and non-interrupted proof-of-stake protocols. A X-Blockmania variant, has $O(N)$ communication cost but also higher latency $O(\log N)$.

CRFeb 23, 2018

TARANET: Traffic-Analysis Resistant Anonymity at the NETwork layer

Chen Chen, Daniele E. Asoni, Adrian Perrig et al.

Modern low-latency anonymity systems, no matter whether constructed as an overlay or implemented at the network layer, offer limited security guarantees against traffic analysis. On the other hand, high-latency anonymity systems offer strong security guarantees at the cost of computational overhead and long delays, which are excessive for interactive applications. We propose TARANET, an anonymity system that implements protection against traffic analysis at the network layer, and limits the incurred latency and overhead. In TARANET's setup phase, traffic analysis is thwarted by mixing. In the data transmission phase, end hosts and ASes coordinate to shape traffic into constant-rate transmission using packet splitting. Our prototype implementation shows that TARANET can forward anonymous traffic at over 50~Gbps using commodity hardware.

CRFeb 20, 2018

Coconut: Threshold Issuance Selective Disclosure Credentials with Applications to Distributed Ledgers

Alberto Sonnino, Mustafa Al-Bassam, Shehar Bano et al.

Coconut is a novel selective disclosure credential scheme supporting distributed threshold issuance, public and private attributes, re-randomization, and multiple unlinkable selective attribute revelations. Coconut integrates with blockchains to ensure confidentiality, authenticity and availability even when a subset of credential issuing authorities are malicious or offline. We implement and evaluate a generic Coconut smart contract library for Chainspace and Ethereum; and present three applications related to anonymous payments, electronic petitions, and distribution of proxies for censorship resistance. Coconut uses short and computationally efficient credentials, and our evaluation shows that most Coconut cryptographic primitives take just a few milliseconds on average, with verification taking the longest time (10 milliseconds).

CRNov 10, 2017

Consensus in the Age of Blockchains

Shehar Bano, Alberto Sonnino, Mustafa Al-Bassam et al.

The blockchain initially gained traction in 2008 as the technology underlying bitcoin, but now has been employed in a diverse range of applications and created a global market worth over $150B as of 2017. What distinguishes blockchains from traditional distributed databases is the ability to operate in a decentralized setting without relying on a trusted third party. As such their core technical component is consensus: how to reach agreement among a group of nodes. This has been extensively studied already in the distributed systems community for closed systems, but its application to open blockchains has revitalized the field and led to a plethora of new designs. The inherent complexity of consensus protocols and their rapid and dramatic evolution makes it hard to contextualize the design landscape. We address this challenge by conducting a systematic and comprehensive study of blockchain consensus protocols. After first discussing key themes in classical consensus protocols, we describe: first protocols based on proof-of-work (PoW), second proof-of-X (PoX) protocols that replace PoW with more energy-efficient alternatives, and third hybrid protocols that are compositions or variations of classical consensus protocols. We develop a framework to evaluate their performance, security and design properties, and use it to systematize key themes in the protocol categories described above. This evaluation leads us to identify research gaps and challenges for the community to consider in future research endeavours.

CRSep 12, 2017

A Touch of Evil: High-Assurance Cryptographic Hardware from Untrusted Components

Vasilios Mavroudis, Andrea Cerulli, Petr Svenda et al.

The semiconductor industry is fully globalized and integrated circuits (ICs) are commonly defined, designed and fabricated in different premises across the world. This reduces production costs, but also exposes ICs to supply chain attacks, where insiders introduce malicious circuitry into the final products. Additionally, despite extensive post-fabrication testing, it is not uncommon for ICs with subtle fabrication errors to make it into production systems. While many systems may be able to tolerate a few byzantine components, this is not the case for cryptographic hardware, storing and computing on confidential data. For this reason, many error and backdoor detection techniques have been proposed over the years. So far all attempts have been either quickly circumvented, or come with unrealistically high manufacturing costs and complexity. This paper proposes Myst, a practical high-assurance architecture, that uses commercial off-the-shelf (COTS) hardware, and provides strong security guarantees, even in the presence of multiple malicious or faulty components. The key idea is to combine protective-redundancy with modern threshold cryptographic techniques to build a system tolerant to hardware trojans and errors. To evaluate our design, we build a Hardware Security Module that provides the highest level of assurance possible with COTS components. Specifically, we employ more than a hundred COTS secure crypto-coprocessors, verified to FIPS140-2 Level 4 tamper-resistance standards, and use them to realize high-confidentiality random number generation, key derivation, public key decryption and signing. Our experiments show a reasonable computational overhead (less than 1% for both Decryption and Signing) and an exponential increase in backdoor-tolerance as more ICs are added.

CRSep 4, 2017

Mix-ORAM: Using delegate shuffles

Raphael R. Toledo, George Danezis, Isao Echizen

Oblivious RAM (ORAM) is a key technology for providing private storage and querying on untrusted machines but is commonly seen as impractical due to the high overhead of the re-randomization, called the eviction, the client incurs. We propose in this work to securely delegate the eviction to semi-trusted third parties to enable any client to accede the ORAM technology and present four different designs inspired by mix-net technologies with reasonable periodic costs.

CRAug 17, 2017

Learning Universal Adversarial Perturbations with Generative Models

Jamie Hayes, George Danezis

Neural networks are known to be vulnerable to adversarial examples, inputs that have been intentionally perturbed to remain visually similar to the source input, but cause a misclassification. It was recently shown that given a dataset and classifier, there exists so called universal adversarial perturbations, a single perturbation that causes a misclassification when applied to any input. In this work, we introduce universal adversarial networks, a generative network that is capable of fooling a target classifier when it's generated output is added to a clean sample from a dataset. We show that this technique improves on known universal adversarial attacks.

CRAug 12, 2017

Chainspace: A Sharded Smart Contracts Platform

Mustafa Al-Bassam, Alberto Sonnino, Shehar Bano et al.

Chainspace is a decentralized infrastructure, known as a distributed ledger, that supports user defined smart contracts and executes user-supplied transactions on their objects. The correct execution of smart contract transactions is verifiable by all. The system is scalable, by sharding state and the execution of transactions, and using S-BAC, a distributed commit protocol, to guarantee consistency. Chainspace is secure against subsets of nodes trying to compromise its integrity or availability properties through Byzantine Fault Tolerance (BFT), and extremely high-auditability, non-repudiation and `blockchain' techniques. Even when BFT fails, auditing mechanisms are in place to trace malicious participants. We present the design, rationale, and details of Chainspace; we argue through evaluating an implementation of the system about its scaling and other features; we illustrate a number of privacy-friendly smart contracts for smart metering, polling and banking and measure their performance.

CRJul 19, 2017

ClaimChain: Improving the Security and Privacy of In-band Key Distribution for Messaging

Bogdan Kulynych, Wouter Lueks, Marios Isaakidis et al.

The social demand for email end-to-end encryption is barely supported by mainstream service providers. Autocrypt is a new community-driven open specification for e-mail encryption that attempts to respond to this demand. In Autocrypt the encryption keys are attached directly to messages, and thus the encryption can be implemented by email clients without any collaboration of the providers. The decentralized nature of this in-band key distribution, however, makes it prone to man-in-the-middle attacks and can leak the social graph of users. To address this problem we introduce ClaimChain, a cryptographic construction for privacy-preserving authentication of public keys. Users store claims about their identities and keys, as well as their beliefs about others, in ClaimChains. These chains form authenticated decentralized repositories that enable users to prove the authenticity of both their keys and the keys of their contacts. ClaimChains are encrypted, and therefore protect the stored information, such as keys and contact identities, from prying eyes. At the same time, ClaimChain implements mechanisms to provide strong non-equivocation properties, discouraging malicious actors from distributing conflicting or inauthentic claims. We implemented ClaimChain and we show that it offers reasonable performance, low overhead, and authenticity guarantees.

CRMay 22, 2017

LOGAN: Membership Inference Attacks Against Generative Models

Jamie Hayes, Luca Melis, George Danezis et al.

Generative models estimate the underlying distribution of a dataset to generate realistic samples according to that distribution. In this paper, we present the first membership inference attacks against generative models: given a data point, the adversary determines whether or not it was used to train the model. Our attacks leverage Generative Adversarial Networks (GANs), which combine a discriminative and a generative model, to detect overfitting and recognize inputs that were part of training datasets, using the discriminator's capacity to learn statistical differences in distributions. We present attacks based on both white-box and black-box access to the target model, against several state-of-the-art generative models, over datasets of complex representations of faces (LFW), objects (CIFAR-10), and medical images (Diabetic Retinopathy). We also discuss the sensitivity of the attacks to different training parameters, and their robustness against mitigation strategies, finding that defenses are either ineffective or lead to significantly worse performances of the generative models in terms of training stability and/or sample quality.

CRApr 26, 2017

Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments

Carmela Troncoso, Marios Isaakidis, George Danezis et al.

Decentralized systems are a subset of distributed systems where multiple authorities control different components and no authority is fully trusted by all. This implies that any component in a decentralized system is potentially adversarial. We revise fifteen years of research on decentralization and privacy, and provide an overview of key systems, as well as key insights for designers of future systems. We show that decentralized designs can enhance privacy, integrity, and availability but also require careful trade-offs in terms of system complexity, properties provided, and degree of decentralization. These trade-offs need to be understood and navigated by designers. We argue that a combination of insights from cryptography, distributed systems, and mechanism design, aligned with the development of adequate incentives, are necessary to build scalable and successful privacy-preserving decentralized systems.

CRMar 1, 2017

The Loopix Anonymity System

Ania Piotrowska, Jamie Hayes, Tariq Elahi et al.

We present Loopix, a low-latency anonymous communication system that provides bi-directional 'third-party' sender and receiver anonymity and unobservability. Loopix leverages cover traffic and brief message delays to provide anonymity and achieve traffic analysis resistance, including against a global network adversary. Mixes and clients self-monitor the network via loops of traffic to provide protection against active attacks, and inject cover traffic to provide stronger anonymity and a measure of sender and receiver unobservability. Service providers mediate access in and out of a stratified network of Poisson mix nodes to facilitate accounting and off-line message reception, as well as to keep the number of links in the system low, and to concentrate cover traffic. We provide a theoretical analysis of the Poisson mixing strategy as well as an empirical evaluation of the anonymity provided by the protocol and a functional implementation that we analyze in terms of scalability by running it on AWS EC2. We show that a Loopix relay can handle upwards of 300 messages per second, at a small delay overhead of less than 1.5 ms on top of the delays introduced into messages to provide security. Overall message latency is in the order of seconds - which is low for a mix-system. Furthermore, many mix nodes can be securely added to a stratified topology to scale throughput without sacrificing anonymity.

MLMar 1, 2017

Generating Steganographic Images via Adversarial Training

Jamie Hayes, George Danezis

Adversarial training was recently shown to be competitive against supervised learning methods on computer vision tasks, however, studies have mainly been confined to generative tasks such as image synthesis. In this paper, we apply adversarial training techniques to the discriminative task of learning a steganographic algorithm. Steganography is a collection of techniques for concealing information by embedding it within a non-secret medium, such as cover texts or images. We show that adversarial training can produce robust steganographic techniques: our unsupervised training scheme produces a steganographic algorithm that competes with state-of-the-art steganographic techniques, and produces a robust steganalyzer, which performs the discriminative task of deciding if an image contains secret information. We define a game between three parties, Alice, Bob and Eve, in order to simultaneously train both a steganographic algorithm and a steganalyzer. Alice and Bob attempt to communicate a secret message contained within an image, while Eve eavesdrops on their conversation and attempts to determine if secret information is embedded within the image. We represent Alice, Bob and Eve by neural networks, and validate our scheme on two independent image datasets, showing our novel method of studying steganographic problems is surprisingly competitive against established steganographic techniques.

CRNov 9, 2016

A Public Comment on NCCoE's White Paper on Privacy-Enhancing Identity Brokers

Luís T. A. N. Brandão, Nicolas Christin, George Danezis

The National Cybersecurity Center of Excellence (NCCoE) (in the United States) has published on October 19, 2015, a white paper on "privacy-enhanced identity brokers." We present here a reply to their request for public comments. We enumerate concerns whose consideration we find paramount for the design of a privacy-enhancing identity brokering solution, for identification and authentication of citizens into myriad online services, and we recommend how to incorporate them into a revised white paper. Our observations, focused on privacy, security, auditability and forensics, are mostly based on a recently published research paper (PETS 2015) about two nation-scale brokered identification systems.

IRApr 1, 2016

Lower-Cost epsilon-Private Information Retrieval

Raphael R. Toledo, George Danezis, Ian Goldberg

Private Information Retrieval (PIR), despite being well studied, is computationally costly and arduous to scale. We explore lower-cost relaxations of information-theoretic PIR, based on dummy queries, sparse vectors, and compositions with an anonymity system. We prove the security of each scheme using a flexible differentially private definition for private queries that can capture notions of imperfect privacy. We show that basic schemes are weak, but some of them can be made arbitrarily safe by composing them with large anonymity systems.

CRSep 2, 2015

k-fingerprinting: a Robust Scalable Website Fingerprinting Technique

Jamie Hayes, George Danezis

Website fingerprinting enables an attacker to infer which web page a client is browsing through encrypted or anonymized network connections. We present a new website fingerprinting technique based on random decision forests and evaluate performance over standard web pages as well as Tor hidden services, on a larger scale than previous works. Our technique, k-fingerprinting, performs better than current state-of-the-art attacks even against website fingerprinting defenses, and we show that it is possible to launch a website fingerprinting attack in the face of a large amount of noisy data. We can correctly determine which of 30 monitored hidden services a client is visiting with 85% true positive rate (TPR), a false positive rate (FPR) as low as 0.02%, from a world size of 100,000 unmonitored web pages. We further show that error rates vary widely between web resources, and thus some patterns of use will be predictably more vulnerable to attack than others.

CRAug 25, 2015

Efficient Private Statistics with Succinct Sketches

Luca Melis, George Danezis, Emiliano De Cristofaro

Large-scale collection of contextual information is often essential in order to gather statistics, train machine learning models, and extract knowledge from data. The ability to do so in a {\em privacy-preserving} way -- i.e., without collecting fine-grained user data -- enables a number of additional computational scenarios that would be hard, or outright impossible, to realize without strong privacy guarantees. In this paper, we present the design and implementation of practical techniques for privately gathering statistics from large data streams. We build on efficient cryptographic protocols for private aggregation and on data structures for succinct data representation, namely, Count-Min Sketch and Count Sketch. These allow us to reduce the communication and computation complexity incurred by each data source (e.g., end-users) from linear to logarithmic in the size of their input, while introducing a parametrized upper-bounded error that does not compromise the quality of the statistics. We then show how to use our techniques, efficiently, to instantiate real-world privacy-friendly systems, supporting recommendations for media streaming services, prediction of user locations, and computation of median statistics for Tor hidden services.

CRJul 21, 2015

HORNET: High-speed Onion Routing at the Network Layer

Chen Chen, Daniele Enrico Asoni, David Barrera et al.

We present HORNET, a system that enables high-speed end-to-end anonymous channels by leveraging next generation network architectures. HORNET is designed as a low-latency onion routing system that operates at the network layer thus enabling a wide range of applications. Our system uses only symmetric cryptography for data forwarding yet requires no per-flow state on intermediate nodes. This design enables HORNET nodes to process anonymous traffic at over 93 Gb/s. HORNET can also scale as required, adding minimal processing overhead per additional anonymous channel. We discuss design and implementation details, as well as a performance and security evaluation.

CRMay 26, 2015

Centrally Banked Cryptocurrencies

George Danezis, Sarah Meiklejohn

Current cryptocurrencies, starting with Bitcoin, build a decentralized blockchain-based transaction ledger, maintained through proofs-of-work that also generate a monetary supply. Such decentralization has benefits, such as independence from national political control, but also significant limitations in terms of scalability and computational cost. We introduce RSCoin, a cryptocurrency framework in which central banks maintain complete control over the monetary supply, but rely on a distributed set of authorities, or mintettes, to prevent double-spending. While monetary policy is centralized, RSCoin still provides strong transparency and auditability guarantees. We demonstrate, both theoretically and experimentally, the benefits of a modest degree of centralization, such as the elimination of wasteful hashing and a scalable system for avoiding double-spending attacks.

CRFeb 26, 2015

Detecting Malware with Information Complexity

Nadia Alshahwan, Earl T. Barr, David Clark et al.

This work focuses on a specific front of the malware detection arms-race, namely the detection of persistent, disk-resident malware. We exploit normalised compression distance (NCD), an information theoretic measure, applied directly to binaries. Given a zoo of labelled malware and benign-ware, we ask whether a suspect program is more similar to our malware or to our benign-ware. Our approach classifies malware with 97.1% accuracy and a false positive rate of 3%. We achieve our results with off-the-shelf compressors and a standard machine learning classifier and without any specialised knowledge. An end-user need only collect a zoo of malware and benign-ware and then can immediately apply our techniques. We apply statistical rigour to our experiments and our selection of data. We demonstrate that accuracy can be optimised by combining NCD with the compressibility rates of the executables. We demonstrate that malware reported within a more narrow time frame of a few days is more homogenous than malware reported over a longer one of two years but that our method still classifies the latter with 95.2% accuracy and a 5% false positive rate. Due to the use of compression, the time and computation cost of our method is non-trivial. We show that simple approximation techniques can improve the time complexity of our approach by up to 63%. We compare our results to the results of applying the 59 anti-malware programs used on the VirusTotal web site to our malware. Our approach does better than any single one of them as well as the 59 used collectively.

CRJan 12, 2015

Privacy and Data Protection by Design - from policy to engineering

George Danezis, Josep Domingo-Ferrer, Marit Hansen et al.

Privacy and data protection constitute core values of individuals and of democratic societies. There have been decades of debate on how those values -and legal obligations- can be embedded into systems, preferably from the very beginning of the design process. One important element in this endeavour are technical mechanisms, known as privacy-enhancing technologies (PETs). Their effectiveness has been demonstrated by researchers and in pilot implementations. However, apart from a few exceptions, e.g., encryption became widely used, PETs have not become a standard and widely used component in system design. Furthermore, for unfolding their full benefit for privacy and data protection, PETs need to be rooted in a data governance strategy to be applied in practice. This report contributes to bridging the gap between the legal framework and the available technological implementation measures by providing an inventory of existing approaches, privacy design strategies, and technical building blocks of various degrees of maturity from research and development. Starting from the privacy principles of the legislation, important elements are presented as a first step towards a design process for privacy-friendly systems and services. The report sketches a method to map legal obligations to design strategies, which allow the system designer to select appropriate techniques for implementing the identified privacy requirements. Furthermore, the report reflects limitations of the approach. It concludes with recommendations on how to overcome and mitigate these limits.

CRAug 6, 2014

An Automated Social Graph De-anonymization Technique

Kumar Sharad, George Danezis

We present a generic and automated approach to re-identifying nodes in anonymized social networks which enables novel anonymization techniques to be quickly evaluated. It uses machine learning (decision forests) to matching pairs of nodes in disparate anonymized sub-graphs. The technique uncovers artefacts and invariants of any black-box anonymization scheme from a small set of examples. Despite a high degree of automation, classification succeeds with significant true positive rates even when small false positive rates are sought. Our evaluation uses publicly available real world datasets to study the performance of our approach against real-world anonymization strategies, namely the schemes used to protect datasets of The Data for Development (D4D) Challenge. We show that the technique is effective even when only small numbers of samples are used for training. Further, since it detects weaknesses in the black-box anonymization scheme it can re-identify nodes in one social network when trained on another.