Evangelos P. Markatos

CR
8 papers
282 citations
Novelty 44%
AI Score 25

8 Papers

CY Dec 13, 2022
FNDaaS: Content-agnostic Detection of Fake News sites

Panagiotis Papadopoulos, Dimitris Spithouris, Evangelos P. Markatos et al.

Automatic fake news detection is a challenging problem in the fight against the spread of misinformation, and it has tremendous real-world political and social impact. Past studies have proposed machine learning-based methods for detecting such fake news that focus on different properties of the published news articles, such as linguistic characteristics of the actual content; these methods, however, are limited by language barriers. Departing from such efforts, we propose Fake News Detection-as-a-Service (FNDaaS), the first automatic, content-agnostic fake news detection method, which considers new and unstudied features such as network and structural characteristics per news website. This method can be deployed as a service, either at the ISP side for easier scalability and maintenance, or at the user side for better end-user privacy. We demonstrate the efficacy of our method using more than 340K datapoints crawled from existing lists of 637 fake and 1183 real news websites, and by building and testing a proof-of-concept system that materializes our proposal. Our analysis of data collected from these websites shows that the vast majority of fake news domains are very young and appear to stay associated with a given IP address for shorter periods than real news domains. By conducting various experiments with machine learning classifiers, we demonstrate that FNDaaS can achieve an AUC score of up to 0.967 on past sites, and up to 77-92% accuracy on newly flagged ones.

SI Mar 16, 2021
The Rise and Fall of Fake News sites: A Traffic Analysis

Manolis Chalkiadakis, Alexandros Kornilakis, Panagiotis Papadopoulos et al.

Over the past decade, we have witnessed the rise of misinformation on the Internet, with online users constantly falling victim to fake news. A multitude of past studies have analyzed fake news diffusion mechanics and detection and mitigation techniques. However, there are still open questions about the operational behavior of fake news websites, such as: How old are fake news websites? Do they typically stay online for long periods of time? Do such websites synchronize their uptime and downtime with each other? Do they share similar content through time? Which third parties support their operations? How much user traffic do they attract, in comparison to mainstream or real news websites? In this paper, we perform a first-of-its-kind investigation to answer such questions regarding the online presence of fake news websites and characterize their behavior in comparison to real news websites. Based on our findings, we build a content-agnostic ML classifier for automatic detection of fake news websites (i.e. accuracy) that are not yet included in manually curated blacklists.

CY Feb 17, 2021
User Tracking in the Post-cookie Era: How Websites Bypass GDPR Consent to Track Users

Emmanouil Papadogiannakis, Panagiotis Papadopoulos, Nicolas Kourtellis et al.

During the past few years, mostly as a result of the GDPR and the CCPA, websites have started to present users with cookie consent banners. These banners are web forms where users can state their preference and declare which cookies they would like to accept, if such an option exists. Although requesting consent before storing any identifiable information is a good start towards respecting user privacy, previous research has shown that websites do not always respect user choices. Furthermore, considering the ever-decreasing reliance of trackers on cookies and the actions browser vendors take to block or restrict third-party cookies, we anticipate a world where stateless tracking emerges, either because trackers or websites do not use cookies, or because users simply refuse to accept any. In this paper, we explore whether websites use more persistent and sophisticated forms of tracking in order to track users who said they do not want cookies. Such forms of tracking include first-party ID leaking, ID synchronization, and browser fingerprinting. Our results suggest that websites do use such modern forms of tracking even before users have had the opportunity to register their choice with respect to cookies. To add insult to injury, when users choose to raise their voice and reject all cookies, user tracking only intensifies. As a result, users' choices play very little role with respect to tracking: we measured that more than 75% of tracking activities happened before users had the opportunity to make a selection in the cookie consent banner, or when users chose to reject all cookies.

CR Nov 6, 2019
The coin that never sleeps. The privacy preserving usage of Bitcoin in a longitudinal analysis as a speculative asset

Emmanouil Karampinakis, Michalis Pachilakis, Panagiotis Papadopoulos et al.

Bitcoin is the first and undoubtedly most successful cryptocurrency to date, with a market capitalization of more than 100 billion dollars. Today, Bitcoin has more than 100,000 supporting merchants and more than 3 million active users. Despite the trust it enjoys among people, Bitcoin lacks a basic feature a substitute currency must have: stability of value. Hence, although the use of Bitcoin as a means of payment is relatively low, the wild ups and downs of its value lure investors to use it as an asset to yield a trading profit. In this study, we explore this exact nature of Bitcoin, aiming to shed light on the newly emerged and rapidly growing marketplace of cryptocurrencies, and compare its investment landscape and patterns with those of the most popular traditional stock market, the Dow Jones. Our results show that most Bitcoin addresses are used in the correct fashion to preserve the security and privacy of the transactions, and that the 24/7 open market of Bitcoin is not affected by any political incidents of the offline world, in contrast to the traditional stock markets. Also, it seems that there are specific longitudes that lead the cryptocurrency in terms of bulk of transactions, but the same correlation does not hold for the volume of the coins being transferred.

CR Jul 24, 2019
YourAdvalue: Measuring Advertising Price Dynamics without Bankrupting User Privacy

Michalis Pachilakis, Panagiotis Papadopoulos, Nikolaos Laoutaris et al.

The Real Time Bidding (RTB) protocol is by now more than a decade old. During this time, a handful of measurement papers have looked at bidding strategies, personal information flow, and cost of display advertising through RTB. In this paper, we present YourAdvalue, a privacy-preserving tool for displaying to end-users in a simple and intuitive manner their advertising value as seen through RTB. Using YourAdvalue, we measure desktop RTB prices in the wild, and compare them with desktop and mobile RTB prices reported by past work. We present how it estimates ad prices that are encrypted, and how it preserves user privacy while reporting results back to a data-server for analysis. We deployed our system, disseminated its browser extension, and collected data from 200 users, including 12,000 ad impressions over 11 months. By analyzing this dataset, we show that desktop RTB prices have grown 4.6X over desktop RTB prices measured in 2013, and 3.8X over mobile RTB prices measured in 2015. We also study how user demographics associate with the intensity of RTB ecosystem tracking, leading to higher ad prices. We find that exchanging data between advertisers and/or data brokers through cookie-synchronization increases the median value of displayed ads by 19%. We also find that female and younger users are more targeted, suffering more tracking (via cookie synchronization) than male or older users. As a result of this targeting in our dataset, the advertising value (i) of women is 2.4X higher than that of men, (ii) of 25-34 year-olds is 2.5X higher than that of 35-44 year-olds, and (iii) is highest on weekends and early mornings.

CR Sep 30, 2018
Master of Web Puppets: Abusing Web Browsers for Persistent and Stealthy Computation

Panagiotis Papadopoulos, Panagiotis Ilia, Michalis Polychronakis et al.

The proliferation of web applications has essentially transformed modern browsers into small but powerful operating systems. Upon visiting a website, user devices run implicitly trusted script code, the execution of which is confined within the browser to prevent any interference with the user's system. Recent JavaScript APIs, however, provide advanced capabilities that not only enable feature-rich web applications, but also allow attackers to perform malicious operations despite the confined nature of JavaScript code execution. In this paper, we demonstrate the powerful capabilities that modern browser APIs provide to attackers by presenting MarioNet: a framework that allows a remote malicious entity to control a visitor's browser and abuse its resources for unwanted computation or harmful operations, such as cryptocurrency mining, password-cracking, and DDoS. MarioNet relies solely on already available HTML5 APIs, without requiring the installation of any additional software. In contrast to previous browser-based botnets, the persistence and stealthiness characteristics of MarioNet allow the malicious computations to continue in the background of the browser even after the user closes the window or tab of the initial malicious website. We present the design, implementation, and evaluation of a prototype system, MarioNet, that is compatible with all major browsers, and discuss potential defense strategies to counter the threat of such persistent in-browser attacks. Our main goal is to raise awareness regarding this new class of attacks, and inform the design of future browser APIs so that they provide a more secure client-side environment for web applications.

CR Jun 6, 2018
Truth in Web Mining: Measuring the Profitability and Cost of Cryptominers as a Web Monetization Model

Panagiotis Papadopoulos, Panagiotis Ilia, Evangelos P. Markatos

The recent advances of web-based cryptomining libraries, along with the whopping market value of cryptocoins, have convinced an increasing number of publishers to switch to web mining as a source of monetization for their websites. The conditions could not be better nowadays: the inevitable arms race between adblockers and advertisers is at its peak, with publishers caught in the crossfire. But can cryptomining be the next primary monetization model in the post-advertising era of the free Internet? In this paper, we respond to this exact question. In particular, we compare the profitability of cryptomining and advertising to assess the most advantageous option for a content provider. In addition, we measure the costs imposed on the user in each case with regard to power consumption, resource utilization, network traffic, device temperature and user experience. Our results show that cryptomining can surpass the profitability of advertising under specific circumstances; however, users need to sustain a significant cost on their devices.

IR May 26, 2018
Cookie Synchronization: Everything You Always Wanted to Know But Were Afraid to Ask

Panagiotis Papadopoulos, Nicolas Kourtellis, Evangelos P. Markatos

User data is the primary input of digital advertising, fueling the free Internet as we know it. As a result, web companies invest a lot in elaborate tracking mechanisms to acquire user data that they can sell to data markets and advertisers. However, with the same-origin policy, and cookies as a primary identification mechanism on the web, each tracker knows the same user by a different ID. To mitigate this, Cookie Synchronization (CSync) came to the rescue, facilitating an information sharing channel between third parties that may or may not have direct access to the website the user visits. In the background, with CSync, they merge user data they own, but also reconstruct a user's browsing history, bypassing the same-origin policy. In this paper, we perform what is, to our knowledge, the first in-depth study of CSync in the wild, using a year-long weblog from 850 real mobile users. Through our study, we aim to understand the characteristics of the CSync protocol and the impact it has on web users' privacy. For this, we design and implement CONRAD, a holistic mechanism to detect CSync events in real time, and the privacy loss on the user side, even when the synced IDs are obfuscated. Using CONRAD, we find that 97% of the regular web users are exposed to CSync: most of them within the first week of their browsing, and the median userID gets leaked, on average, to 3.5 different domains. Finally, we see that CSync increases the number of domains that track the user by a factor of 6.75.