CYJul 25, 2020Code
Corona-Warn-App: Tracing the Start of the Official COVID-19 Exposure Notification App for GermanyJens Helge Reelfs, Oliver Hohlfeld, Ingmar Poese
On June 16, 2020, Germany launched an open-source smartphone contact tracing app ("Corona-Warn-App") to help tracing SARS-CoV-2 (coronavirus) infection chains. It uses a decentralized, privacy-preserving design based on the Exposure Notification APIs in which a centralized server is only used to distribute a list of keys of SARS-CoV-2 infected users that is fetched by the app once per day. Its success, however, depends on its adoption. In this poster, we characterize the early adoption of the app using Netflow traces captured directly at its hosting infrastructure. We show that the app generated traffic from allover Germany---already on the first day. We further observe that local COVID-19 outbreaks do not result in noticeable traffic increases.
HCApr 29, 2019Code
TheFragebogen: A Web Browser-based Questionnaire Framework for Scientific ResearchDennis Guse, Henrique R. Orefice, Gabriel Reimers et al.
Quality of Experience (QoE) typically involves conducting experiments in which stimuli are presented to participants and their judgments as well as behavioral data are collected. Nowadays, many experiments require software for the presentation of stimuli and the data collection from participants. While different software solutions exist, these are not tailored to conduct experiments on QoE. Moreover, replicating experiments or repeating the same experiment in different settings (e. g., laboratory vs. crowdsourcing) can further increase the software complexity. TheFragebogen is an open-source, versatile, extendable software framework for the implementation of questionnaires - especially for research on QoE. Implemented questionnaires can be presented with a state-of-the-art web browser to support a broad range of devices while the use of a web server being optional. Out-of-the-box, TheFragebogen provides graphical exact scales as well as free-hand input, the ability to collect behavioral data, and playback multimedia content.
CRSep 14, 2021
CyberBunker 2.0 -- A Domain and Traffic Perspective on a Bulletproof HosterDaniel Kopp, Eric Strehle, Oliver Hohlfeld
In September 2019, 600 armed German cops seized the physical premise of a Bulletproof Hoster (BPH) referred to as CyberBunker 2.0. The hoster resided in a decommissioned NATO bunker and advertised to host everything but child porn and anything related to terrorism while keeping servers online no matter what. While the anatomy, economics and interconnection-level characteristics of BPHs are studied, their traffic characteristics are unknown. In this poster, we present the first analysis of domains, web pages, and traffic captured at a major tier-1 ISP and a large IXP at the time when the CyberBunker was in operation. Our study sheds light on traffic characteristics of a BPH in operation. We show that a traditional BGP-based BPH identification approach cannot detect the CyberBunker, but find characteristics from a domain and traffic perspective that can add to future identification approaches.
CRMar 7, 2021
DDoS Never Dies? An IXP Perspective on DDoS Amplification AttacksDaniel Kopp, Christoph Dietzel, Oliver Hohlfeld
DDoS attacks remain a major security threat to the continuous operation of Internet edge infrastructures, web services, and cloud platforms. While a large body of research focuses on DDoS detection and protection, to date we ultimately failed to eradicate DDoS altogether. Yet, the landscape of DDoS attack mechanisms is even evolving, demanding an updated perspective on DDoS attacks in the wild. In this paper, we identify up to 2608 DDoS amplification attacks at a single day by analyzing multiple Tbps of traffic flows at a major IXP with a rich ecosystem of different networks. We observe the prevalence of well-known amplification attack protocols (e.g., NTP, CLDAP), which should no longer exist given the established mitigation strategies. Nevertheless, they pose the largest fraction on DDoS amplification attacks within our observation and we witness the emergence of DDoS attacks using recently discovered amplification protocols (e.g., OpenVPN, ARMS, Ubiquity Discovery Protocol). By analyzing the impact of DDoS on core Internet infrastructure, we show that DDoS can overload backbone-capacity and that filtering approaches in prior work omit 97% of the attack traffic.
SIMar 1, 2021
Understanding & Predicting User Lifetime with Machine Learning in an Anonymous Location-Based Social NetworkJens Helge Reelfs, Max Bergmann, Oliver Hohlfeld et al.
In this work, we predict the user lifetime within the anonymous and location-based social network Jodel in the Kingdom of Saudi Arabia. Jodel's location-based nature yields to the establishment of disjoint communities country-wide and enables for the first time the study of user lifetime in the case of a large set of disjoint communities. A user's lifetime is an important measurement for evaluating and steering customer bases as it can be leveraged to predict churn and possibly apply suitable methods to circumvent potential user losses. We train and test off the shelf machine learning techniques with 5-fold crossvalidation to predict user lifetime as a regression and classification problem; identifying the Random Forest to provide very strong results. Discussing model complexity and quality trade-offs, we also dive deep into a time-dependent feature subset analysis, which does not work very well; Easing up the classification problem into a binary decision (lifetime longer than timespan $x$) enables a practical lifetime predictor with very good performance. We identify implicit similarities across community models according to strong correlations in feature importance. A single countrywide model generalizes the problem and works equally well for any tested community; the overall model internally works similar to others also indicated by its feature importances.
CRSep 18, 2020
The Boon and Bane of Cross-Signing: Shedding Light on a Common Practice in Public Key InfrastructuresJens Hiller, Johanna Amann, Oliver Hohlfeld
Public Key Infrastructures (PKIs) with their trusted Certificate Authorities (CAs) provide the trust backbone for the Internet: CAs sign certificates which prove the identity of servers, applications, or users. To be trusted by operating systems and browsers, a CA has to undergo lengthy and costly validation processes. Alternatively, trusted CAs can cross-sign other CAs to extend their trust to them. In this paper, we systematically analyze the present and past state of cross-signing in the Web PKI. Our dataset (derived from passive TLS monitors and public CT logs) encompasses more than 7 years and 225 million certificates with 9.3 billion trust paths. We show benefits and risks of cross-signing. We discuss the difficulty of revoking trusted CA certificates where, worrisome, cross-signing can result in valid trust paths to remain after revocation; a problem for non-browser software that often blindly trusts all CA certificates and ignores revocations. However, cross-signing also enables fast bootstrapping of new CAs, e.g., Let's Encrypt, and achieves a non-disruptive user experience by providing backward compatibility. In this paper, we propose new rules and guidance for cross-signing to preserve its positive potential while mitigating its risks.
CLMay 19, 2020
Word-Emoji Embeddings from large scale Messaging Data reflect real-world Semantic Associations of Expressive IconsJens Helge Reelfs, Oliver Hohlfeld, Markus Strohmaier et al.
We train word-emoji embeddings on large scale messaging data obtained from the Jodel online social network. Our data set contains more than 40 million sentences, of which 11 million sentences are annotated with a subset of the Unicode 13.0 standard Emoji list. We explore semantic emoji associations contained in this embedding by analyzing associations between emojis, between emojis and text, and between text and emojis. Our investigations demonstrate anecdotally that word-emoji embeddings trained on large scale messaging data can reflect real-world semantic associations. To enable further research we release the Jodel Emoji Embedding Dataset (JEED1488) containing 1488 emojis and their embeddings along 300 dimensions.
HCMay 1, 2020
Multi-episodic Perceived Quality of an Audio-on-Demand ServiceDennis Guse, Oliver Hohlfeld, Anna Wunderlich et al.
QoE is traditionally evaluated by using short stimuli usually representing parts or single usage episodes. This opens the question on how the overall service perception involving multiple} usage episodes can be evaluated---a question of high practical relevance to service operators. Despite initial research on this challenging aspect of multi-episodic perceived quality, the question of the underlying quality formation processes and its factors are still to be discovered. We present a multi-episodic experiment of an Audio on Demand service over a usage period of 6~days with 93 participants. Our work directly extends prior work investigating the impact of time between usage episodes. The results show similar effects---also the recency effect is not statistically significant. In addition, we extend prediction of multi-episodic judgments by accounting for the observed saturation.
NIJul 4, 2019
DeePCCI: Deep Learning-based Passive Congestion Control IdentificationConstantin Sander, Jan Rüth, Oliver Hohlfeld et al.
Transport protocols use congestion control to avoid overloading a network. Nowadays, different congestion control variants exist that influence performance. Studying their use is thus relevant, but it is hard to identify which variant is used. While passive identification approaches exist, these require detailed domain knowledge and often also rely on outdated assumptions about how congestion control operates and what data is accessible. We present DeePCCI, a passive, deep learning-based congestion control identification approach which does not need any domain knowledge other than training traffic of a congestion control variant. By only using packet arrival data, it is also directly applicable to encrypted (transport header) traffic. DeePCCI is therefore more easily extendable and can also be used with QUIC.
CRAug 2, 2018
Digging into Browser-based Crypto MiningJan Rüth, Torsten Zimmermann, Konrad Wolsing et al.
Mining is the foundation of blockchain-based cryptocurrencies such as Bitcoin rewarding the miner for finding blocks for new transactions. The Monero currency enables mining with standard hardware in contrast to special hardware (ASICs) as often used in Bitcoin, paving the way for in-browser mining as a new revenue model for website operators. In this work, we study the prevalence of this new phenomenon. We identify and classify mining websites in 138M domains and present a new fingerprinting method which finds up to a factor of 5.7 more miners than publicly available block lists. Our work identifies and dissects Coinhive as the major browser-mining stakeholder. Further, we present a new method to associate mined blocks in the Monero blockchain to mining pools and uncover that Coinhive currently contributes 1.18% of mined blocks having turned over 1293 Moneros in June 2018.