CR Jul 30, 2019
Clash of the Trackers: Measuring the Evolution of the Online Tracking Ecosystem
Konstantinos Solomos, Panagiotis Ilia, Sotiris Ioannidis et al.
Websites are constantly adapting the methods they use to track online visitors and the intensity with which they do so. However, the wide-ranging enforcement of GDPR, which began one year ago (May 2018), forced websites serving EU-based online visitors to eliminate or at least reduce such tracking activity unless they receive proper user consent. Therefore, it is important to record and analyze the evolution of this tracking activity, to assess the overall "privacy health" of the Web ecosystem, and to determine whether it has improved since GDPR enforcement. This work makes a significant step in this direction. In this paper, we analyze the online ecosystem of 3rd parties embedded in top websites, which amass the majority of online tracking, through six snapshots taken a few months apart over the last two years. We perform this analysis in three ways: 1) by looking into the network activity that 3rd parties impose on each publisher hosting them, 2) by constructing a bipartite "publisher-to-tracker" graph connecting 3rd parties with their publishers, and 3) by constructing a "tracker-to-tracker" graph connecting 3rd parties that are commonly found on the same publishers. We record significant changes over time in the number of trackers, the traffic induced in publishers (incoming vs. outgoing), the embeddedness of trackers in publishers, and the popularity and mixture of trackers across publishers. We also report how such measures compare with the ranking of publishers based on Alexa. At the last level of our analysis, we dig deeper and look into the connectivity of trackers with each other and how this relates to potential cookie synchronization activity.
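The two graph constructions described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_graphs`, the input format (a list of publisher and 3rd-party pairs), the example domains, and the choice to weight each tracker-to-tracker edge by the number of shared publishers are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_graphs(observations):
    """Build both graphs from (publisher, third_party) pairs.

    Hypothetical helper: the input format and the edge weighting
    are assumptions, not details taken from the paper.
    """
    # Bipartite "publisher-to-tracker" graph: publisher -> set of 3rd parties.
    pub_to_trackers = defaultdict(set)
    for publisher, tracker in observations:
        pub_to_trackers[publisher].add(tracker)

    # "Tracker-to-tracker" graph: an edge joins two 3rd parties seen on the
    # same publisher; its weight counts the publishers they share.
    co_occurrence = defaultdict(int)
    for trackers in pub_to_trackers.values():
        for a, b in combinations(sorted(trackers), 2):
            co_occurrence[(a, b)] += 1

    return dict(pub_to_trackers), dict(co_occurrence)


# Made-up observations for illustration only.
observations = [
    ("news.example", "tracker-a.example"),
    ("news.example", "tracker-b.example"),
    ("shop.example", "tracker-a.example"),
    ("shop.example", "tracker-b.example"),
    ("shop.example", "tracker-c.example"),
]
bipartite, tracker_graph = build_graphs(observations)
```

Sorting each publisher's tracker set before pairing keeps every undirected edge under a single key, so a pair is never counted once as (a, b) and again as (b, a).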
CY Jan 3, 2019
Please Forget Where I Was Last Summer: The Privacy Risks of Public Location (Meta)Data
Kostas Drakonakis, Panagiotis Ilia, Sotiris Ioannidis et al.
The exposure of location data constitutes a significant privacy risk to users as it can lead to de-anonymization, the inference of sensitive information, and even physical threats. In this paper we present LPAuditor, a tool that conducts a comprehensive evaluation of the privacy loss caused by publicly available location metadata. First, we demonstrate how our system can pinpoint users' key locations at an unprecedented granularity by identifying their actual postal addresses. Our experimental evaluation on Twitter data highlights the effectiveness of our techniques, which outperform prior approaches by 18.9%-91.6% for homes and 8.7%-21.8% for workplaces. Next, we present a novel exploration of automated private information inference that uncovers "sensitive" locations that users have visited (pertaining to health, religion, and sex/nightlife). We find that location metadata can provide additional context to tweets and thus lead to the exposure of private information that might not match the users' intentions. We further explore the mismatch between user actions and information exposure and find that older versions of the official Twitter apps follow a privacy-invasive policy of including precise GPS coordinates in the metadata of tweets that users have geotagged at a coarse-grained level (e.g., city). The implications of this exposure are further exacerbated by our finding that users are considerably privacy-cautious with regard to exposing precise location data. When users can explicitly select what location data is published, there is a 94.6% reduction in tweets with GPS coordinates. As part of current efforts to give users more control over their data, LPAuditor can be adopted by major services and offered as an auditing tool that informs users about sensitive information they (indirectly) expose through location metadata.
CR Dec 29, 2018
Talon: An Automated Framework for Cross-Device Tracking Detection
Konstantinos Solomos, Panagiotis Ilia, Sotiris Ioannidis et al.
Although digital advertising fuels much of today's free Web, it typically does so at the cost of online users' privacy, due to the continuous tracking and leakage of users' personal data. In search of new ways to optimize the effectiveness of ads, advertisers have introduced new advanced paradigms such as cross-device tracking (CDT), to monitor users' browsing on multiple devices and screens, and deliver (re)targeted ads on the most appropriate screen. Unfortunately, this practice leads to greater privacy concerns for the end-user. Going beyond the state-of-the-art, we propose a novel methodology for detecting CDT and measuring the factors affecting its performance, in a repeatable and systematic way. This new methodology is based on emulating realistic browsing activity of end-users, from different devices, and thus triggering and detecting cross-device targeted ads. We design and build Talon, a CDT measurement framework that implements our methodology and allows experimentation with multiple parallel devices, experimental setups and settings. By employing Talon, we perform several critical experiments, and we are able to not only detect and measure CDT with an average AUC score of 0.78-0.96, but also to provide significant insights about the behavior of CDT entities and the impact on users' privacy. In the hands of privacy researchers, policy makers and end-users, Talon can be an invaluable tool for raising awareness and increasing transparency on tracking practices used by the ad-ecosystem.
CR Sep 30, 2018
Master of Web Puppets: Abusing Web Browsers for Persistent and Stealthy Computation
Panagiotis Papadopoulos, Panagiotis Ilia, Michalis Polychronakis et al.
The proliferation of web applications has essentially transformed modern browsers into small but powerful operating systems. Upon visiting a website, user devices run implicitly trusted script code, the execution of which is confined within the browser to prevent any interference with the user's system. Recent JavaScript APIs, however, provide advanced capabilities that not only enable feature-rich web applications, but also allow attackers to perform malicious operations despite the confined nature of JavaScript code execution. In this paper, we demonstrate the powerful capabilities that modern browser APIs provide to attackers by presenting MarioNet: a framework that allows a remote malicious entity to control a visitor's browser and abuse its resources for unwanted computation or harmful operations, such as cryptocurrency mining, password-cracking, and DDoS. MarioNet relies solely on already available HTML5 APIs, without requiring the installation of any additional software. In contrast to previous browser-based botnets, the persistence and stealthiness characteristics of MarioNet allow the malicious computations to continue in the background of the browser even after the user closes the window or tab of the initial malicious website. We present the design, implementation, and evaluation of a prototype system, MarioNet, that is compatible with all major browsers, and discuss potential defense strategies to counter the threat of such persistent in-browser attacks. Our main goal is to raise awareness regarding this new class of attacks, and inform the design of future browser APIs so that they provide a more secure client-side environment for web applications.
CR Jun 6, 2018
Truth in Web Mining: Measuring the Profitability and Cost of Cryptominers as a Web Monetization Model
Panagiotis Papadopoulos, Panagiotis Ilia, Evangelos P. Markatos
The recent advances in web-based cryptomining libraries, along with the whopping market value of cryptocoins, have convinced an increasing number of publishers to switch to web mining as a source of monetization for their websites. The conditions could not be better nowadays: the inevitable arms race between adblockers and advertisers is at its peak, with publishers caught in the crossfire. But can cryptomining be the next primary monetization model in the post-advertising era of the free Internet? In this paper, we respond to this exact question. In particular, we compare the profitability of cryptomining and advertising to assess the most advantageous option for a content provider. In addition, we measure the costs imposed on the user in each case with regard to power consumption, resource utilization, network traffic, device temperature and user experience. Our results show that cryptomining can surpass the profitability of advertising under specific circumstances; however, users incur a significant cost on their devices.