Rishab Nithyanand

CR
12 papers
239 citations
Novelty 47%
AI Score 52

12 Papers

CR Apr 2
Towards Multi-Stakeholder Vulnerability Notifications in the Ad-Tech Supply Chain

Yash Vekaria, Rishab Nithyanand, Zubair Shafiq

Online advertising relies on a complex and opaque supply chain that involves multiple stakeholders, including advertisers, publishers, and ad-networks, each with distinct and sometimes conflicting incentives. Recent research has demonstrated the existence of ad-tech supply chain vulnerabilities such as dark pooling, where low-quality publishers bundle their ad inventory with higher-quality ones to mislead advertisers. We investigate the effectiveness of vulnerability notification campaigns aimed at mitigating dark pooling. Prior research on vulnerability notifications has primarily explored single-stakeholder contexts, leaving multi-stakeholder scenarios understudied. There is limited attention to complex multi-stakeholder supply chain ecosystems such as the ad-tech supply chain, where resolving vulnerabilities often requires coordinated action across entities with misaligned incentives and interdependent roles. We address this gap by implementing the first online advertising supply chain vulnerability notification pipeline to systematically evaluate the responsiveness of various stakeholders in the ad-tech supply chain, including publishers, ad-networks, and advertisers, to vulnerability notifications by academics and activists. Our nine-month-long automated multi-stakeholder notification study shows that notifications are an effective method for reducing dark pooling vulnerabilities in the online advertising ecosystem, especially when targeted towards ad-networks. Further, responses to notifications from activists and academics do not differ in a statistically significant way, suggesting that sender reputation has little impact. Overall, our research fosters industry-scale solutions to combat ad inventory fraud and encourages future research on the feasibility of multi-stakeholder vulnerability notifications in other supply chain ecosystems.

CL Mar 27
I Want to Believe (but the Vocabulary Changed): Measuring the Semantic Structure and Evolution of Conspiracy Theories

Manisha Keim, Sarmad Chandio, Osama Khalid et al.

Research on conspiracy theories has largely focused on belief formation, exposure, and diffusion, while paying less attention to how their meanings change over time. This gap persists partly because conspiracy-related terms are often treated as stable lexical markers, making it difficult to separate genuine semantic changes from surface-level vocabulary changes. In this paper, we measure the semantic structure and evolution of conspiracy theories in online political discourse. Using 169.9M comments from Reddit's r/politics subreddit spanning 2012--2022, we first demonstrate that conspiracy-related language forms coherent and semantically distinguishable regions of language space, allowing conspiracy theories to be treated as semantic objects. We then track how these objects evolve over time using aligned word embeddings, enabling comparisons of semantic neighborhoods across periods. Our analysis reveals that conspiracy theories evolve non-uniformly, exhibiting patterns of semantic stability, expansion, contraction, and replacement that are not captured by keyword-based approaches alone.
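One way to make "comparing semantic neighborhoods across periods" concrete is a nearest-neighbor overlap score. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the function names, the toy 2-D vectors, and the choice of Jaccard overlap are all hypothetical. Note that neighbor overlap is computed within each period separately, so this particular measure does not itself require aligned embeddings; alignment matters when vectors are compared directly across periods.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def neighbors(word, emb, k):
    """Return the k words closest to `word` in one period's embedding."""
    scored = [(cosine(emb[word], vec), other)
              for other, vec in emb.items() if other != word]
    scored.sort(reverse=True)
    return {other for _, other in scored[:k]}

def neighborhood_overlap(word, emb_a, emb_b, k=2):
    """Jaccard overlap of a word's k nearest neighbors in two periods.

    1.0 means the neighborhood is unchanged; 0.0 means it was replaced.
    """
    near_a = neighbors(word, emb_a, k)
    near_b = neighbors(word, emb_b, k)
    union = near_a | near_b
    return len(near_a & near_b) / len(union) if union else 1.0

# Hypothetical toy embeddings for two periods (not real data).
period_early = {"hoax": (1.0, 0.0), "fake": (0.9, 0.1), "staged": (0.8, 0.3),
                "vaccine": (0.0, 1.0), "election": (0.1, 0.9)}
period_late = {"hoax": (0.0, 1.0), "fake": (0.9, 0.1), "staged": (0.8, 0.3),
               "vaccine": (0.1, 0.9), "election": (0.2, 0.95)}

hoax_overlap = neighborhood_overlap("hoax", period_early, period_late)
fake_overlap = neighborhood_overlap("fake", period_early, period_late)
```

In this toy data the neighborhood of "hoax" is fully replaced between periods while that of "fake" changes only partly. A keyword count would show neither shift, since both words stay in the vocabulary.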

CR Mar 4
On the Suitability of LLM-Driven Agents for Dark Pattern Audits

Chen Sun, Yash Vekaria, Rishab Nithyanand

As LLM-driven agents begin to autonomously navigate the web, their ability to interpret and respond to manipulative interface design becomes critical. A fundamental question that emerges is: can such agents reliably recognize patterns of friction, misdirection, and coercion in interface design (i.e., dark patterns)? We study this question in a setting where the workflows are consequential: website portals associated with the submission of CCPA-related data rights requests. These portals operationalize statutory rights, but they are implemented as interactive interfaces whose design can be structured to facilitate, burden, or subtly discourage the exercise of those rights. We design and deploy an LLM-driven auditing agent capable of end-to-end traversal of rights-request workflows, structured evidence gathering, and classification of potential dark patterns. Across a set of 456 data broker websites, we evaluate: (1) the ability of the agent to consistently locate and complete request flows, (2) the reliability and reproducibility of its dark pattern classifications, and (3) the conditions under which it fails or produces poor judgments. Our findings characterize both the feasibility and the limitations of using LLM-driven agents for scalable dark pattern auditing.

CY Mar 4
Turning Trust to Transactions: Tracking Affiliate Marketing and FTC Compliance in YouTube's Influencer Economy

Chen Sun, Yash Vekaria, Zubair Shafiq et al.

YouTube has evolved into a powerful platform where creators monetize their influence through affiliate marketing, raising concerns about transparency and ethics, especially when creators fail to disclose their affiliate relationships. Although regulatory agencies like the US Federal Trade Commission (FTC) have issued guidelines to address these issues, non-compliance and consumer harm persist, and the extent of these problems remains unclear. In this paper, we introduce tools, developed with insights from recent advances in Web measurement and NLP research, to examine the state of the affiliate marketing ecosystem on YouTube. We apply these tools to a 10-year dataset of 2 million videos from nearly 540,000 creators, analyzing the prevalence of affiliate marketing on YouTube and the rates of non-compliant behavior. Our findings reveal that affiliate links are widespread, yet disclosure compliance remains low, with most videos failing to meet FTC standards. Furthermore, we analyze the effects of different stakeholders in improving disclosure behavior. Our study suggests that the platform's standardized disclosure features are strongly associated with improved compliance. We recommend that regulators and affiliate partners collaborate with platforms to enhance transparency, accountability, and trust in the influencer economy.

CL Mar 9
Examining the Role of YouTube Production and Consumption Dynamics on the Formation of Extreme Ideologies

Sarmad Chandio, Rishab Nithyanand

The relationship between content production and consumption on algorithm-driven platforms like YouTube plays a critical role in shaping ideological behaviors. While prior work has largely focused on user behavior and algorithmic recommendations, the interplay between what is produced and what gets consumed, and its role in ideological shifts, remains understudied. In this paper, we present a longitudinal, mixed-methods analysis combining one year of YouTube watch history with two waves of ideological surveys from 1,100 U.S. participants. We identify users who exhibited significant shifts toward more extreme ideologies and compare their content consumption, and the production patterns of the YouTube channels they engaged with, to those of ideologically stable users. Our findings show that users who became more extreme have different consumption habits from those who did not. This is amplified by the fact that channels favored by users with extreme ideologies also have a higher affinity for producing content with higher anger, grievance, and other such markers. Lastly, using time series analysis, we examine whether content producers are the primary drivers of consumption behavior or merely responding to user demand.

CR Jun 23, 2017
A Churn for the Better: Localizing Censorship using Network-level Path Churn and Network Tomography

Shinyoung Cho, Rishab Nithyanand, Abbas Razaghpanah et al.

Recent years have seen the Internet become a key vehicle for citizens around the globe to express political opinions and organize protests. This fact has not gone unnoticed, with countries around the world repurposing network management tools (e.g., URL filtering products) and protocols (e.g., BGP, DNS) for censorship. However, repurposing these products can have unintended international impact, which we refer to as "censorship leakage". While there have been anecdotal reports of censorship leakage, there has yet to be a systematic study of censorship leakage at a global scale. In this paper, we combine a global censorship measurement platform (ICLab) with a general-purpose technique -- boolean network tomography -- to identify which AS on a network path is performing censorship. At a high level, our approach exploits BGP churn to narrow down the set of potential censoring ASes by over 95%. We exactly identify 65 censoring ASes and find that the anomalies introduced by 24 of the 65 censoring ASes have an impact on users located in regions outside the jurisdiction of the censoring AS, resulting in the leaking of regional censorship policies.
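The elimination idea behind boolean network tomography can be sketched in a few lines. This is a simplified illustration, not the paper's implementation. It assumes noise-free measurements and a censor that interferes with every path crossing it; the function name and the AS numbers are hypothetical. Path churn helps because each new route to the same destination adds an observation that can clear more ASes.

```python
def localize_censors(observations):
    """Narrow down censoring ASes from per-path boolean observations.

    observations: iterable of (path, censored) pairs, where `path` is a
    list of AS numbers and `censored` says whether an anomaly was seen.
    Returns (identified, suspects): ASes that are the sole remaining
    candidate on some censored path, and all remaining candidates.
    """
    observations = list(observations)

    # Under the stated assumptions, any AS on a clean path is not a censor.
    cleared = set()
    for path, censored in observations:
        if not censored:
            cleared.update(path)

    # Each censored path must contain at least one non-cleared AS.
    candidates_per_path = [set(path) - cleared
                           for path, censored in observations if censored]

    suspects = set()
    for candidates in candidates_per_path:
        suspects |= candidates

    identified = {next(iter(c)) for c in candidates_per_path if len(c) == 1}
    return identified, suspects

# Hypothetical measurements; route churn yields three distinct AS paths.
measurements = [
    ([1, 2, 3, 4], True),   # anomaly observed
    ([1, 5, 3, 4], False),  # route changed, no anomaly: clears 1, 5, 3, 4
    ([1, 2, 6, 4], True),   # anomaly observed on another route
]
identified, suspects = localize_censors(measurements)
```

Here AS 2 is pinned down exactly because it is the only uncleared AS on the first path, while AS 6 stays a suspect until some clean path is observed to traverse it.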

CL Jun 6, 2017
Measuring Offensive Speech in Online Political Discourse

Rishab Nithyanand, Brian Schaffner, Phillipa Gill

The Internet and online forums such as Reddit have become an increasingly popular medium for citizens to engage in political conversations. However, the online disinhibition effect resulting from the ability to use pseudonymous identities may manifest in the form of offensive speech, consequently making political discussions more aggressive and polarizing than they already are. Such environments may result in harassment of, and self-censorship by, their targets. In this paper, we present preliminary results from a large-scale temporal measurement aimed at quantifying offensiveness in online political discussions. To enable our measurements, we develop and evaluate an offensive speech classifier. We then use this classifier to quantify and compare offensiveness in the political and general contexts. We perform our study using a database of over 168M Reddit comments made by over 7M pseudonyms between January 2015 and January 2017 -- a period covering several divisive political events including the 2016 US presidential elections.

CR May 17, 2016
Ad-Blocking and Counter Blocking: A Slice of the Arms Race

Rishab Nithyanand, Sheharbano Khattak, Mobin Javed et al.

Adblocking tools like Adblock Plus continue to rise in popularity, potentially threatening the dynamics of advertising revenue streams. In response, a number of publishers have ramped up efforts to develop and deploy mechanisms for detecting and/or counter-blocking adblockers (which we refer to as anti-adblockers), effectively escalating the online advertising arms race. In this paper, we develop a scalable approach for identifying third-party services shared across multiple websites and use it to provide a first characterization of anti-adblocking across the Alexa Top-5K websites. We map websites that perform anti-adblocking as well as the entities that provide anti-adblocking scripts. We study the modus operandi of these scripts and their impact on popular adblockers. We find that at least 6.7% of websites in the Alexa Top-5K use anti-adblocking scripts, acquired from 12 distinct entities -- some of which have a direct interest in nourishing the online advertising industry.

CR May 11, 2016
Holding all the ASes: Identifying and Circumventing the Pitfalls of AS-aware Tor Client Design

Rishab Nithyanand, Rachee Singh, Shinyoung Cho et al.

Traffic correlation attacks to de-anonymize Tor users are possible when an adversary is in a position to observe traffic entering and exiting the Tor network. Recent work has brought attention to the threat of these attacks by network-level adversaries (e.g., Autonomous Systems). We perform a historical analysis to understand how the threat from AS-level traffic correlation attacks has evolved over the past five years. We find that despite a large number of new relays added to the Tor network, the threat has grown. This points to the importance of increasing AS-level diversity in addition to the capacity of the Tor network. We identify and elaborate on common pitfalls of AS-aware Tor client design and construction. We find that succumbing to these pitfalls can negatively impact three major aspects of an AS-aware Tor client -- (1) security against AS-level adversaries, (2) security against relay-level adversaries, and (3) performance. Finally, we propose and evaluate a Tor client -- Cipollino -- which avoids these pitfalls using the state of the art in network measurement. Our evaluation shows that Cipollino is able to achieve better security against network-level adversaries while maintaining security against relay-level adversaries and

CR May 19, 2015
Measuring and mitigating AS-level adversaries against Tor

Rishab Nithyanand, Oleksii Starov, Adva Zair et al.

The popularity of Tor as an anonymity system has made it a popular target for a variety of attacks. We focus on traffic correlation attacks, which are no longer solely in the realm of academic research with recent revelations about the NSA and GCHQ actively working to implement them in practice. Our first contribution is an empirical study that allows us to gain a high fidelity snapshot of the threat of traffic correlation attacks in the wild. We find that up to 40% of all circuits created by Tor are vulnerable to attacks by traffic correlation from Autonomous System (AS)-level adversaries, 42% from colluding AS-level adversaries, and 85% from state-level adversaries. In addition, we find that in some regions (notably, China and Iran) there exist many cases where over 95% of all possible circuits are vulnerable to correlation attacks, emphasizing the need for AS-aware relay-selection. To mitigate the threat of such attacks, we build Astoria--an AS-aware Tor client. Astoria leverages recent developments in network measurement to perform path-prediction and intelligent relay selection. Astoria reduces the number of vulnerable circuits to 2% against AS-level adversaries, under 5% against colluding AS-level adversaries, and 25% against state-level adversaries. In addition, Astoria load balances across the Tor network so as to not overload any set of relays.
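The notion of a circuit being vulnerable to an AS-level adversary can be illustrated with a small check: a circuit is at risk when a single AS sits on both the entry side and the exit side. The sketch below rests on assumptions and is not Astoria's algorithm. It takes predicted AS paths as given, ignores the path prediction and load balancing the abstract mentions, and uses hypothetical function names, relay labels, and AS numbers.

```python
def shared_ases(entry_path, exit_path):
    """ASes that appear on both the entry-side and exit-side paths."""
    return set(entry_path) & set(exit_path)

def vulnerable_fraction(circuits):
    """Fraction of circuits where one AS could observe both ends.

    circuits: iterable of (entry_path, exit_path) pairs of AS numbers.
    """
    circuits = list(circuits)
    if not circuits:
        return 0.0
    at_risk = sum(1 for entry, exit_ in circuits if shared_ases(entry, exit_))
    return at_risk / len(circuits)

def safe_choices(options):
    """Relay pairs whose two path segments share no AS.

    options: dict mapping (guard, exit) -> (entry_path, exit_path).
    """
    return [pair for pair, (entry, exit_) in options.items()
            if not shared_ases(entry, exit_)]

# Hypothetical predicted AS paths for three guard/exit combinations.
options = {
    ("guard1", "exit1"): ([100, 200, 300], [400, 200, 500]),  # AS 200 on both
    ("guard1", "exit2"): ([100, 200, 300], [400, 600, 500]),
    ("guard2", "exit1"): ([100, 700], [400, 200, 500]),
}
fraction = vulnerable_fraction(options.values())
safe = safe_choices(options)
```

An AS-aware client in this simplified model would build circuits only from the pairs in `safe`. Extending `shared_ases` to test membership in a set of colluding ASes would be one way to model the colluding-adversary case.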

CR Mar 19, 2015
Games Without Frontiers: Investigating Video Games as a Covert Channel

Bridger Hahn, Rishab Nithyanand, Phillipa Gill et al.

The Internet has become a critical communication infrastructure for citizens to organize protests and express dissatisfaction with their governments. This fact has not gone unnoticed, with governments clamping down on this medium via censorship, and circumvention researchers working to stay one step ahead. In this paper, we explore a promising new avenue for covert channels: real-time strategy video games. Video games have two key features that make them attractive cover protocols for censorship circumvention. First, due to the popularity of gaming platforms such as Steam, there are a lot of different video games, each with their own protocols and server infrastructure. Users of video-game-based censorship-circumvention tools can therefore diversify across many games, making it difficult for the censor to respond by simply blocking a single cover protocol. Second, games in the same genre have many common features and concepts. As a result, the same covert channel framework can be easily adapted to work with many different games. This means that circumvention tool developers can stay ahead of the censor by creating a diverse set of tools and by quickly adapting to blockades created by the censor. We demonstrate the feasibility of this approach by implementing our coding scheme over two real-time strategy games (including a very popular closed-source game). We evaluate the security of our system prototype -- Castle -- by quantifying its resilience to a censor-adversary, its similarity to real game traffic, and its ability to avoid common pitfalls in covert channel design. We use our prototype to demonstrate that our approach can provide throughput which is amenable to transfer of textual data, such as e-mail, SMS messages, and tweets, which are commonly used to organize political actions.

CR Jan 23, 2014
New Approaches to Website Fingerprinting Defenses

Xiang Cai, Rishab Nithyanand, Rob Johnson

Website fingerprinting attacks enable an adversary to infer which website a victim is visiting, even if the victim uses an encrypting proxy, such as Tor. Previous work has shown that all proposed defenses against website fingerprinting attacks are ineffective. This paper advances the study of website fingerprinting attacks and defenses in two ways. First, we develop bounds on the trade-off between security and bandwidth overhead that any fingerprinting defense scheme can achieve. This enables us to compare schemes with different security/overhead trade-offs by comparing how close they are to the lower bound. We then refine, implement, and evaluate the Congestion Sensitive BuFLO scheme outlined by Cai, et al. CS-BuFLO, which is based on the provably-secure BuFLO defense proposed by Dyer, et al., was not fully specified by Cai, et al., but has nonetheless attracted the attention of the Tor developers. Our experiments find that CS-BuFLO has high overhead (around 2.3-2.8x) but can get 6x closer to the bandwidth/security trade-off lower bound than Tor or plain SSH.