CRMay 13
SaTor: Exploring Satellite Routing in Tor to Reduce LatencyHaozhi Li, Tariq Elahi
High latency is a critical limitation within the Tor network that has a negative impact on web application responsiveness. A key factor exacerbating Tor latency is the creation of lengthy circuits that span across geographically distant regions, causing significant transmission delays. A common solution involves modifying Tor's circuit-building process to reduce the likelihood of selecting lengthy circuits. However, this strategy compromises Tor's routing randomness, increasing the risk of deanonymization. Reducing Tor's latency while minimizing security degradation presents a challenge. This paper proposes and investigates SaTor, a satellite-assisted routing scheme to reduce Tor latency. By equipping Tor relays with satellite network access, SaTor could accelerate slow circuits via satellite transmission, without biasing the existing path selection process. Our performance evaluation, using a simulator we developed along with real-world measurements, shows that over the long term, SaTor provides an expected speed-up of 21.8 ms for over 40% of circuits, with only 100 top relays equipped with satellite service. Our research uncovers a viable way to overcome Tor's latency bottleneck, serving as a practical reference for its future enhancement.
LGNov 9, 2023
Accelerated Shapley Value Approximation for Data EvaluationLauren Watson, Zeno Kujawa, Rayna Andreeva et al.
Data valuation has found various applications in machine learning, such as data filtering, efficient learning and incentives for data sharing. The most popular current approach to data valuation is the Shapley value. While popular for its various applications, Shapley value is computationally expensive even to approximate, as it requires repeated iterations of training models on different subsets of data. In this paper we show that the Shapley value of data points can be approximated more efficiently by leveraging the structural properties of machine learning problems. We derive convergence guarantees on the accuracy of the approximate Shapley value for different learning settings including Stochastic Gradient Descent with convex and non-convex loss functions. Our analysis suggests that in fact models trained on small subsets are more important in the context of data valuation. Based on this idea, we describe $δ$-Shapley -- a strategy of only using small subsets for the approximation. Experiments show that this approach preserves approximate value and rank of data, while achieving speedup of up to 9.9x. In pre-trained networks the approach is found to bring more efficiency in terms of accurate evaluation using small subsets.
CRMar 1, 2017
The Loopix Anonymity SystemAnia Piotrowska, Jamie Hayes, Tariq Elahi et al.
We present Loopix, a low-latency anonymous communication system that provides bi-directional 'third-party' sender and receiver anonymity and unobservability. Loopix leverages cover traffic and brief message delays to provide anonymity and achieve traffic analysis resistance, including against a global network adversary. Mixes and clients self-monitor the network via loops of traffic to provide protection against active attacks, and inject cover traffic to provide stronger anonymity and a measure of sender and receiver unobservability. Service providers mediate access in and out of a stratified network of Poisson mix nodes to facilitate accounting and off-line message reception, as well as to keep the number of links in the system low, and to concentrate cover traffic. We provide a theoretical analysis of the Poisson mixing strategy as well as an empirical evaluation of the anonymity provided by the protocol and a functional implementation that we analyze in terms of scalability by running it on AWS EC2. We show that a Loopix relay can handle upwards of 300 messages per second, at a small delay overhead of less than 1.5 ms on top of the delays introduced into messages to provide security. Overall message latency is in the order of seconds - which is low for a mix-system. Furthermore, many mix nodes can be securely added to a stratified topology to scale throughput without sacrificing anonymity.
CRDec 4, 2014
Censorship Resistance: Let a Thousand Flowers Bloom?Tariq Elahi, Steven J. Murdoch, Ian Goldberg
This paper argues that one of the most important decisions in designing and deploying censorship resistance systems is whether one set of system options should be selected (the best), or whether there should be several sets of good ones. We model the problem of choosing these options as a cat-and-mouse game and show that the best strategy depends on the value the censor associates with total system censorship versus partial, and the tolerance of false positives. If the censor has a low tolerance to false positives then choosing one censorship resistance system is best. Otherwise choosing several systems is the better choice, but the way traffic should be distributed over the systems depends on the tolerance of the censor to false negatives. We demonstrate that establishing the censor's utility function is critical to discovering the best strategy for censorship resistance.