CL Nov 11, 2023
THOS: A Benchmark Dataset for Targeted Hate and Offensive Speech
Saad Almohaimeed, Saleh Almohaimeed, Ashfaq Ali Shafin et al.
Detecting harmful content on social media, such as Twitter, is made difficult by the fact that the seemingly simple yes/no classification conceals a significant amount of complexity. Unfortunately, while several datasets have been collected for training classifiers in hate and offensive speech, there is a scarcity of datasets labeled with a finer granularity of target classes and specific targets. In this paper, we introduce THOS, a dataset of 8.3k tweets manually labeled with fine-grained annotations about the target of the message. We demonstrate that this dataset makes it feasible to train classifiers, based on Large Language Models, to perform classification at this level of granularity.
CR Nov 24, 2021
SoK: Plausibly Deniable Storage
Chen Chen, Xiao Liang, Bogdan Carbunar et al.
Data privacy is critical in instilling trust and empowering the societal pacts of modern technology-driven democracies. Unfortunately, it is under continuous attack by overreaching or outright oppressive governments, including some of the world's oldest democracies. Increasingly-intrusive anti-encryption laws severely limit the ability of standard encryption to protect privacy. New defense mechanisms are needed. Plausible deniability (PD) is a powerful property, enabling users to hide the existence of sensitive information in a system under direct inspection by adversaries. Popular encrypted storage systems such as TrueCrypt and other research efforts have attempted to also provide plausible deniability. Unfortunately, these efforts have often operated under less well-defined assumptions and adversarial models. Careful analyses often uncover not only high overheads but also outright security compromise. Further, our understanding of adversaries, the underlying storage technologies, as well as the available plausible deniable solutions have evolved dramatically in the past two decades. The main goal of this work is to systematize this knowledge. It aims to:
- identify key PD properties, requirements, and approaches;
- present a direly-needed unified framework for evaluating security and performance;
- explore the challenges arising from the critical interplay between PD and modern system layered stacks;
- propose a new "trace-oriented" PD paradigm, able to decouple security guarantees from the underlying systems and thus ensure a higher level of flexibility and security independent of the technology stack.
This work is meant also as a trusted guide for system and security practitioners around the major challenges in understanding, designing, and implementing plausible deniability into new or existing systems.
CR Nov 19, 2021
RacketStore: Measurements of ASO Deception in Google Play via Mobile and App Usage
Nestor Hernandez, Ruben Recabarren, Bogdan Carbunar et al.
Online app search optimization (ASO) platforms, which provide bulk installs and fake reviews for paying app developers in order to fraudulently boost their search rank in app stores, were shown to employ diverse and complex strategies that successfully evade state-of-the-art detection methods. In this paper we introduce RacketStore, a platform to collect data from Android devices of participating ASO providers and regular users, on their interactions with apps which they install from the Google Play Store. We present measurements from a study of 943 installs of RacketStore on 803 unique devices controlled by ASO providers and regular users, which consists of 58,362,249 data snapshots collected from these devices, the 12,341 apps installed on them and their 110,511,637 Google Play reviews. We reveal significant differences between ASO providers and regular users in terms of the number and types of user accounts registered on their devices, the number of apps they review, and the intervals between the installation times of apps and their review times. We leverage these insights to introduce features that model the usage of apps and devices, and show that they can train supervised learning algorithms to detect paid app installs and fake reviews with an F1-measure of 99.72% (AUC above 0.99), and detect devices controlled by ASO providers with an F1-measure of 95.29% (AUC = 0.95). We discuss the costs associated with evading detection by our classifiers and also the potential for app stores to use our approach to detect ASO work while preserving privacy.
CR Oct 16, 2021
Toward Uncensorable, Anonymous and Private Access Over Satoshi Blockchains
Ruben Recabarren, Bogdan Carbunar
Providing unrestricted access to sensitive content such as news and software is difficult in the presence of adaptive and resourceful surveillance and censoring adversaries. In this paper we leverage the distributed and resilient nature of commercial Satoshi blockchains to develop the first provably secure, censorship resistant, cost-efficient storage system with anonymous and private access, built on top of commercial cryptocurrency transactions. We introduce max-rate transactions, a practical construct to persist data of arbitrary size entirely in a Satoshi blockchain. We leverage max-rate transactions to develop UWeb, a blockchain-based storage system that charges publishers to self-sustain its decentralized infrastructure. UWeb organizes blockchain-stored content for easy retrieval, and enables clients to store and access content with provable anonymity, privacy and censorship resistance properties. We present results from UWeb experiments with writing 268.21 MB of data into the live Litecoin blockchain, including 4.5 months of live-feed BBC articles, and 41 censorship resistant tools. The max-rate writing throughput (183 KB/s) and blockchain utilization (88%) exceed those of state-of-the-art solutions by 2-3 orders of magnitude and broke Litecoin's record of the daily average block size. Our simulations with up to 3,000 concurrent UWeb writers confirm that UWeb does not impact the confirmation delays of financial transactions.
CR Sep 14, 2019
Private and Atomic Exchange of Assets over Zero Knowledge Based Payment Ledger
Zhimin Gao, Lei Xu, Keshav Kasichainula et al.
Bitcoin brings a new type of digital currency that does not rely on a central system to maintain transactions. By benefiting from the concept of a decentralized ledger, users who do not know or trust each other can still conduct transactions in a peer-to-peer manner. Inspired by Bitcoin, other cryptocurrencies were invented in recent years, such as Ethereum, Dash, Zcash, Monero, Grin, etc. Some of these focus on enhancing privacy, for instance crypto note, or systems that apply a similar concept of encrypted notes used for transactions (e.g., Zcash, Monero). However, there are few mechanisms that support the exchange of privacy-enhanced notes or assets on the chain while at the same time preserving the privacy of the exchange operations. Existing approaches for fair exchanges of assets with privacy mostly rely on off-chain/side-chain, escrow or centralized services. Thus, we propose a solution that supports oblivious and privacy-protected fair exchange of crypto notes or privacy enhanced crypto assets. The technology is demonstrated by extending zero-knowledge based crypto notes. To address "privacy" and "multi-currency", we build a new zero-knowledge proving system and extend the note format with a new property to represent various types of tokenized assets or cryptocurrencies. By extending the payment protocol, exchange operations are realized through privacy enhanced transactions (e.g., shielded transactions). Based on the possible scenarios during the exchange operation, we add new constraints and conditions to the zero-knowledge proving system used for validating transactions publicly.
CR Sep 29, 2018
Tithonus: A Bitcoin Based Censorship Resilient System
Ruben Recabarren, Bogdan Carbunar
Providing reliable and surreptitious communications is difficult in the presence of adaptive and resourceful state level censors. In this paper we introduce Tithonus, a framework that builds on the Bitcoin blockchain and network to provide censorship-resistant communication mechanisms. In contrast to previous approaches, we do not rely solely on the slow and expensive blockchain consensus mechanism but instead fully exploit Bitcoin's peer-to-peer gossip protocol. We develop adaptive, fast and cost effective data communication solutions that camouflage client requests into inconspicuous Bitcoin transactions. We propose solutions to securely request and transfer content, with unobservability and censorship resistance, and free, pay-per-access and subscription based payment options. When compared to state-of-the-art Bitcoin writing solutions, Tithonus reduces the cost of transferring data to censored clients by 2 orders of magnitude and increases the goodput by 3 to 5 orders of magnitude. We show that Tithonus client initiated transactions are hard to detect, while server initiated transactions cannot be censored without creating split world problems to the Bitcoin blockchain.
SI Jun 23, 2018
Search Rank Fraud De-Anonymization in Online Systems
Mizanur Rahman, Nestor Hernandez, Bogdan Carbunar et al.
We introduce the fraud de-anonymization problem, that goes beyond fraud detection, to unmask the human masterminds responsible for posting search rank fraud in online systems. We collect and study search rank fraud data from Upwork, and survey the capabilities and behaviors of 58 search rank fraudsters recruited from 6 crowdsourcing sites. We propose Dolos, a fraud de-anonymization system that leverages traits and behaviors extracted from these studies, to attribute detected fraud to crowdsourcing site fraudsters, thus to real identities and bank accounts. We introduce MCDense, a min-cut dense component detection algorithm to uncover groups of user accounts controlled by different fraudsters, and leverage stylometry and deep learning to attribute them to crowdsourcing site profiles. Dolos correctly identified the owners of 95% of fraudster-controlled communities, and uncovered fraudsters who promoted as many as 97.5% of fraud apps we collected from Google Play. When evaluated on 13,087 apps (820,760 reviews), which we monitored over more than 6 months, Dolos identified 1,056 apps with suspicious reviewer groups. We report orthogonal evidence of their fraud, including fraud duplicates and fraud re-posts.
CR Dec 7, 2017
A Secure Mobile Authentication Alternative to Biometrics
Mozhgan Azimpourkivi, Umut Topkara, Bogdan Carbunar
Biometrics are widely used for authentication in consumer devices and business settings as they provide sufficiently strong security, instant verification and convenience for users. However, biometrics are hard to keep secret, stolen biometrics pose lifelong security risks to users as they cannot be reset and re-issued, and transactions authenticated by biometrics across different systems are linkable and traceable back to the individual identity. In addition, their cost-benefit analysis does not include personal implications to users, who are least prepared for the imminent negative outcomes, and are not often given equally convenient alternative authentication options. We introduce ai.lock, a secret image based authentication method for mobile devices which uses an imaging sensor to reliably extract authentication credentials similar to biometrics. Despite lacking the regularities of biometric image features, we show that ai.lock consistently extracts features across authentication attempts from general user captured images, to reconstruct credentials that can match and exceed the security of biometrics (EER = 0.71%). ai.lock only stores a hash of the object's image. We measure the security of ai.lock against brute force attacks on more than 3.5 billion authentication instances built from more than 250,000 images of real objects, and 100,000 synthetically generated images using a generative adversarial network trained on object images. We show that the ai.lock Shannon entropy is superior to a fingerprint based authentication built into popular mobile devices.
CR Oct 20, 2017
Camera Based Two Factor Authentication Through Mobile and Wearable Devices
Mozhgan Azimpourkivi, Umut Topkara, Bogdan Carbunar
We introduce Pixie, a novel, camera based two factor authentication solution for mobile and wearable devices. A quick and familiar user action of snapping a photo is sufficient for Pixie to simultaneously perform a graphical password authentication and a physical token based authentication, yet it does not require any expensive, uncommon hardware. Pixie establishes trust based on both the knowledge and possession of an arbitrary physical object readily accessible to the user, called a trinket. Users choose their trinkets similar to setting a password, and authenticate by presenting the same trinket to the camera. The fact that the object is the trinket is a secret kept by the user. Pixie extracts robust, novel features from trinket images, and leverages a supervised learning classifier to effectively address inconsistencies between images of the same trinket captured in different circumstances. Pixie achieved a false accept rate below 0.09% in a brute force attack with 14.3 million authentication attempts, generated with 40,000 trinket images that we captured and collected from public datasets. We identify master images, which match multiple trinkets, and study techniques to reduce their impact. In a user study with 42 participants over 8 days in 3 sessions we found that Pixie outperforms text based passwords on memorability, speed, and user preference. Furthermore, Pixie was easily discoverable by new users and accurate under field use. Users were able to remember their trinkets 2 and 7 days after registering them, without any practice between the 3 test dates.
SI Jun 5, 2017
Stateless Puzzles for Real Time Online Fraud Preemption
Mizanur Rahman, Ruben Recabarren, Bogdan Carbunar et al.
The profitability of fraud in online systems such as app markets and social networks marks the failure of existing defense mechanisms. In this paper, we propose FraudSys, a real-time fraud preemption approach that imposes Bitcoin-inspired computational puzzles on the devices that post online system activities, such as reviews and likes. We introduce and leverage several novel concepts that include (i) stateless, verifiable computational puzzles, that impose minimal performance overhead, but enable the efficient verification of their authenticity, (ii) a real-time, graph-based solution to assign fraud scores to user activities, and (iii) mechanisms to dynamically adjust puzzle difficulty levels based on fraud scores and the computational capabilities of devices. FraudSys does not alter the experience of users in online systems, but delays fraudulent actions and consumes significant computational resources of the fraudsters. Using real datasets from Google Play and Facebook, we demonstrate the feasibility of FraudSys by showing that the devices of honest users are minimally impacted, while fraudster controlled devices receive daily computational penalties of up to 3,079 hours. In addition, we show that with FraudSys, fraud does not pay off, as a user equipped with mining hardware (e.g., AntMiner S7) will earn through fraud less than half of what honest Bitcoin mining would yield.
CR Apr 6, 2017
Video Liveness for Citizen Journalism: Attacks and Defenses
Mahmudur Rahman, Mozhgan Azimpourkivi, Umut Topkara et al.
The impact of citizen journalism raises important video integrity and credibility issues. In this article, we introduce Vamos, the first user transparent video "liveness" verification solution based on video motion, that accommodates the full range of camera movements, and supports videos of arbitrary length. Vamos uses the agreement between video motion and camera movement to corroborate the video authenticity. Vamos can be integrated into any mobile video capture application without requiring special user training. We develop novel attacks that target liveness verification solutions. The attacks leverage both fully automated algorithms and trained human experts. We introduce the concept of video motion categories to annotate the camera and user motion characteristics of arbitrary videos. We show that the performance of Vamos depends on the video motion category. Even though Vamos uses motion as a basis for verification, we observe a surprising and seemingly counter-intuitive resilience against attacks performed on relatively "stationary" video chunks, which turn out to contain hard-to-imitate involuntary movements. We show that overall the accuracy of Vamos on the task of verifying whole length videos exceeds 93% against the new attacks.
CR Mar 24, 2017
Secure Management of Low Power Fitness Trackers
Mahmudur Rahman, Bogdan Carbunar, Umut Topkara
The increasing popular interest in personal telemetry, also called the Quantified Self or "lifelogging", has induced a popularity surge for wearable personal fitness trackers. Fitness trackers automatically collect sensor data about the user throughout the day, and integrate it into social network accounts. Solution providers have to strike a balance between many constraints, leading to a design process that often puts security in the back seat. Case in point, we reverse engineered and identified security vulnerabilities in Fitbit Ultra and Garmin Forerunner 610, two popular and representative fitness tracker products. We introduce FitBite and GarMax, tools to launch efficient attacks against Fitbit and Garmin. We devise SensCrypt, a protocol for secure data storage and communication, for use by makers of affordable and lightweight personal trackers. SensCrypt thwarts not only the attacks we introduced, but also defends against powerful JTAG Read attacks. We have built Sens.io, an Arduino Uno based tracker platform, of similar capabilities but at a fraction of the cost of current solutions. On Sens.io, SensCrypt imposes a negligible write overhead and significantly reduces the end-to-end sync overhead of Fitbit and Garmin.
CR Mar 20, 2017
Hardening Stratum, the Bitcoin Pool Mining Protocol
Ruben Recabarren, Bogdan Carbunar
Stratum, the de-facto mining communication protocol used by blockchain based cryptocurrency systems, enables miners to reliably and efficiently fetch jobs from mining pool servers. In this paper we exploit Stratum's lack of encryption to develop passive and active attacks on Bitcoin's mining protocol, with important implications on the privacy, security and even safety of mining equipment owners. We introduce StraTap and ISP Log attacks, that infer miner earnings if given access to miner communications, or even their logs. We develop BiteCoin, an active attack that hijacks shares submitted by miners, and their associated payouts. We build BiteCoin on WireGhost, a tool we developed to hijack and surreptitiously maintain Stratum connections. Our attacks reveal that securing Stratum through pervasive encryption is not only undesirable (due to large overheads), but also ineffective: an adversary can predict miner earnings even when given access to only packet timestamps. Instead, we devise Bedrock, a minimalistic Stratum extension that protects the privacy and security of mining participants. We introduce and leverage the mining cookie concept, a secret that each miner shares with the pool and includes in its puzzle computations, and that prevents attackers from reconstructing or hijacking the puzzles. We have implemented our attacks and collected 138MB of Stratum protocol traffic from mining equipment in the US and Venezuela. We show that Bedrock is resilient to active attacks even when an adversary breaks the crypto constructs it uses. Bedrock imposes a daily overhead of 12.03s on a single pool server that handles mining traffic from 16,000 miners.
SI Mar 6, 2017
FairPlay: Fraud and Malware Detection in Google Play
Mahmudur Rahman, Mizanur Rahman, Bogdan Carbunar et al.
Fraudulent behaviors in Google's Android app market fuel search rank abuse and malware proliferation. We present FairPlay, a novel system that uncovers both malware and search rank fraud apps, by picking out trails that fraudsters leave behind. To identify suspicious apps, FairPlay's PCF algorithm correlates review activities and uniquely combines detected review relations with linguistic and behavioral signals gleaned from longitudinal Google Play app data. We contribute a new longitudinal app dataset to the community, which consists of over 87K apps, 2.9M reviews, and 2.4M reviewers, collected over half a year. FairPlay achieves over 95% accuracy in classifying gold standard datasets of malware, fraudulent and legitimate apps. We show that 75% of the identified malware apps engage in search rank fraud. FairPlay discovers hundreds of fraudulent apps that currently evade the Google Bouncer detection technology, and reveals a new type of attack campaign, where users are harassed into writing positive reviews, and into installing and reviewing other apps.
CR Apr 20, 2013
Fit and Vulnerable: Attacks and Defenses for a Health Monitoring Device
Mahmudur Rahman, Bogdan Carbunar, Madhusudan Banik
The fusion of social networks and wearable sensors is becoming increasingly popular, with systems like Fitbit automating the process of reporting and sharing user fitness data. In this paper we show that while compelling, the integration of health data into social networks is fraught with privacy and security vulnerabilities. Case in point, by reverse engineering the communication protocol, storage details and operation codes, we identified several vulnerabilities in Fitbit. We have built FitBite, a suite of tools that exploit these vulnerabilities to launch a wide range of attacks against Fitbit. Besides eavesdropping, injection and denial of service, several attacks can lead to rewards and financial gains. We have built FitLock, a lightweight defense system that protects Fitbit while imposing only a small overhead. Our experiments on BeagleBoard and Xperia devices show that FitLock's end-to-end overhead over Fitbit is only 2.4%.
CR Apr 12, 2013
Eat the Cake and Have It Too: Privacy Preserving Location Aggregates in Geosocial Networks
Bogdan Carbunar, Mahmudur Rahman, Jaime Ballesteros et al.
Geosocial networks are online social networks centered on the locations of subscribers and businesses. Providing input to targeted advertising, profiling social network users becomes an important source of revenue. Its natural reliance on personal information introduces a trade-off between user privacy and incentives of participation for businesses and geosocial network providers. In this paper we introduce location centric profiles (LCPs), aggregates built over the profiles of users present at a given location. We introduce PROFILR, a suite of mechanisms that construct LCPs in a private and correct manner. We introduce iSafe, a novel, context aware public safety application built on PROFILR. Our Android and browser plugin implementations show that PROFILR is efficient: the end-to-end overhead is small even under strong correctness assurances.
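As a rough illustration of what a location centric profile could look like as a data structure, the sketch below builds per-attribute histograms over the profiles of users present at a location. It is not the PROFILR mechanism: it runs in the clear, with no cryptographic protection, and the release threshold `min_users` is an assumption added for illustration, as are the function name and the attribute names.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def location_centric_profile(
    profiles: Iterable[Dict[str, str]], min_users: int = 3
) -> Optional[Dict[str, Counter]]:
    """Aggregate user profiles into per-attribute value counts.

    Returns None when fewer than `min_users` profiles are present, an assumed
    safeguard so that an aggregate never describes only a handful of users.
    """
    profiles = list(profiles)
    if len(profiles) < min_users:
        return None
    lcp: Dict[str, Counter] = {}
    for profile in profiles:
        for attribute, value in profile.items():
            lcp.setdefault(attribute, Counter())[value] += 1
    return lcp


# Hypothetical profiles of three users currently at the same venue.
visitors = [
    {"age": "18-25", "gender": "f"},
    {"age": "18-25", "gender": "m"},
    {"age": "26-35", "gender": "f"},
]
venue_lcp = location_centric_profile(visitors)
```

The aggregate exposes only counts per attribute value rather than individual profiles. The abstract's requirement that LCPs be built "in a private and correct manner" needs mechanisms well beyond this plain aggregation.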