Yuvraj Agarwal

HC
h-index: 44
19 papers
501 citations
Novelty: 40%
AI Score: 46

19 Papers

NI · Nov 27, 2018
Pible: Battery-Free Mote for Perpetual Indoor BLE Applications

Francesco Fraternali, Bharathan Balaji, Yuvraj Agarwal et al.

Smart building applications require a large-scale deployment of sensors distributed across the environment. Recent innovations in smart environments are driven by wireless networked sensors as they are easy to deploy. However, replacing the batteries of these sensors at scale is a non-trivial, labor-intensive task. Energy harvesting has emerged as a potential solution to avoid battery replacement but requires compromises such as application-specific design, simplified communication protocols, or reduced quality of service. We explore the design space of battery-free sensor nodes using commercial off-the-shelf components, and present Pible: a Perpetual Indoor BLE sensor node that leverages ambient light and can support numerous smart building applications. We analyze node-lifetime, quality-of-service, and light-availability trade-offs and present a predictive algorithm that adapts to changing lighting conditions to maximize node lifetime and application quality of service. Using a 20-node, 15-day deployment in a real building under varying lighting conditions, we show feasible applications that can be implemented using Pible and the boundary conditions under which they can fail.

SY · Jan 27, 2016
Quiver: Using Control Perturbations to Increase the Observability of Sensor Data in Smart Buildings

Jason Koh, Bharathan Balaji, Vahideh Akhlaghi et al.

Modern buildings contain hundreds of sensors and actuators for monitoring and operating systems such as HVAC, lighting, and security. To enable portable applications in next-generation smart buildings, we need models and standardized ontologies that represent these sensors across diverse types of buildings. Recent research has shown that extracting information such as sensor type from available metadata and time-series data analysis is difficult due to the heterogeneity of systems and a lack of support for interoperability. We propose perturbations in the control system as a mechanism to increase the observability of building systems, extract contextual information, and develop standardized models. We design Quiver, an experimental framework for actuating a building's HVAC system that enables us to perturb the control system safely. Using Quiver, we demonstrate three applications through empirical experiments on a real commercial building: colocation of data points, identification of point type, and mapping of dependencies between actuators. Our results show that we can colocate data points in HVAC terminal units with 98.4% accuracy and 63% coverage. We can identify point types of the terminal units with 85.3% accuracy. Finally, we map the dependency links between actuators with an accuracy of 73.5%, with 8.1% false positives and 18.4% false negatives.

SY · May 5, 2017
A Systematic Approach for Exploring Tradeoffs in Predictive HVAC Control Systems for Buildings

Joshua Gluck, Christian Koehler, Jennifer Mankoff et al.

Heating, Ventilation, and Air Conditioning (HVAC) systems are often the most significant contributor to the energy usage, and the operational cost, of large office buildings. Therefore, to understand the various factors affecting energy usage, and to optimize the operational efficiency of building HVAC systems, energy analysts and architects often create simulations of buildings (e.g., with EnergyPlus or DOE-2) prior to construction or renovation to determine energy savings and quantify the Return-on-Investment (ROI). While useful, these simulations usually use static HVAC control strategies such as lowering room temperature at night, or reactive control based on simulated room occupancy. Recently, advances have been made in HVAC control algorithms that predict room occupancy. However, these algorithms depend on costly sensor installations, and the tradeoffs between predictive accuracy, energy savings, comfort, and expenses are not well understood. Current simulation frameworks do not support easy analysis of these tradeoffs. Our contribution is a simulation framework that can be used to explore this design space by generating objective estimates of the energy savings and occupant comfort for different levels of HVAC prediction and control performance. We validate our framework on a real-world occupancy dataset spanning 6 months for 235 rooms in a large university office building. Using the gold standard of energy use modeling and simulation (Revit and EnergyPlus), we compare the energy consumption and occupant comfort in 29 independent simulations that explore our parameter space. Our results highlight a number of potentially useful tradeoffs with respect to energy savings, comfort, and algorithmic performance among predictive, reactive, and static schedules, for a stakeholder of our building.

CL · Nov 11, 2025
SpiderGen: Towards Procedure Generation For Carbon Life Cycle Assessments with Generative AI

Anupama Sitaraman, Bharathan Balaji, Yuvraj Agarwal

Investigating the effects of climate change and global warming caused by GHG emissions has been a key concern worldwide. The production, use, and disposal of consumer products contribute a large share of these emissions. Thus, it is important to build tools to estimate the environmental impact of consumer goods, an essential part of which is conducting Life Cycle Assessments (LCAs). LCAs specify and account for the appropriate processes involved in the production, use, and disposal of the products. We present SpiderGen, an LLM-based workflow which integrates the taxonomy and methodology of traditional LCA with the reasoning capabilities and world knowledge of LLMs to generate graphical representations of the key procedural information used for LCA, known as Product Category Rules Process Flow Graphs (PCR PFGs). We additionally evaluate the output of SpiderGen by comparing it with 65 real-world LCA documents. We find that SpiderGen provides accurate LCA process information that is either fully correct or has minor errors, achieving an F1-Score of 65% across 10 sample data points, as compared to 53% using a one-shot prompting method. We observe that the remaining errors occur primarily due to differences in detail between LCA documents, as well as differences in the "scope" of which auxiliary processes must also be included. We also demonstrate that SpiderGen performs better than several baseline techniques, such as chain-of-thought prompting and one-shot prompting. Finally, we highlight SpiderGen's potential to reduce the human effort and costs for estimating carbon impact, as it is able to produce LCA process information for less than $1 USD in under 10 minutes, as compared to the status quo LCA, which can cost over $25000 USD and take up to 21 person-days.

HC · May 18
OrganicHAR: Towards Activity Discovery in Organic Settings for Privacy Preserving Sensors Using Efficient Video Analysis

Prasoon Patidar, Riku Arakawa, Ricardo Graça et al.

Deploying human activity recognition (HAR) at home is still rare because sensor signals vary wildly across houses, people, and time, essentially requiring in-situ data collection and training. Prior approaches use cameras to generate training labels for privacy-preserving sensors (LiDAR, RADAR, Thermal), but this forces sensors to detect predefined activities that cameras can see yet the sensors themselves cannot reliably distinguish. In this work, we introduce OrganicHAR, an activity discovery framework that inverts this relationship by placing sensor capabilities at the center of activity discovery. Our approach identifies naturally occurring signal patterns using privacy-preserving sensors, leverages Vision Language Models (VLMs) only during these key moments for scene understanding, and discovers discrete activity labels at granularities that these sensors can reliably detect. Our evaluation with 12 participants demonstrates OrganicHAR's effectiveness: it achieves 79% accuracy for coarse (4-5) activities using only basic ambient sensors (radar, lidar, thermal arrays), and 73% accuracy for fine-grained (8-9) activities when a wearable IMU, depth, and pose sensor are added. OrganicHAR maintains 77% accuracy on average across configurations while discovering 4-8 categories per user (15 across all users) tailored to each environment and its sensor capabilities. By triggering video processing only at key moments identified by local sensors, we reduce VLM queries by 90%, enabling practical and privacy-preserving activity recognition in natural settings.

HC · Aug 3, 2025
IMUCoCo: Enabling Flexible On-Body IMU Placement for Human Pose Estimation and Activity Recognition

Haozhe Zhou, Riku Arakawa, Yuvraj Agarwal et al.

IMUs are regularly used to sense human motion, recognize activities, and estimate full-body pose. Users are typically required to place sensors in predefined locations that are often dictated by common wearable form factors and the machine learning model's training process. Consequently, despite the increasing number of everyday devices equipped with IMUs, this limited adaptability has seriously constrained the user experience to a few well-explored device placements (e.g., wrist and ears). In this paper, we rethink IMU-based motion sensing by acknowledging that signals can be captured from any point on the human body. We introduce IMU over Continuous Coordinates (IMUCoCo), a novel framework that maps signals from a variable number of IMUs placed on the body surface into a unified feature space based on their spatial coordinates. These features can be plugged into downstream models for pose estimation and activity recognition. Our evaluations demonstrate that IMUCoCo supports accurate pose estimation in a wide range of typical and atypical sensor placements. Overall, IMUCoCo supports significantly more flexible use of IMUs for motion sensing than the state of the art, allowing users to place their sensor-laden devices according to their needs and preferences. The framework also supports changing device locations depending on the context and suggests placements depending on the use case.

CR · Jun 2, 2024
VeriSplit: Secure and Practical Offloading of Machine Learning Inferences across IoT Devices

Han Zhang, Zifan Wang, Mihir Dhamankar et al.

Many Internet-of-Things (IoT) devices rely on cloud computation resources to perform machine learning inferences. This is expensive and may raise privacy concerns for users. Consumers of these devices often have hardware such as gaming consoles and PCs with graphics accelerators that are capable of performing these computations, which may be left idle for significant periods of time. While this presents a compelling potential alternative to cloud offloading, concerns about the integrity of inferences, the confidentiality of model parameters, and the privacy of users' data mean that device vendors may be hesitant to offload their inferences to a platform managed by another manufacturer. We propose VeriSplit, a framework for offloading machine learning inferences to locally available devices that addresses these concerns. We introduce masking techniques to protect data privacy and model confidentiality, and a commitment-based verification protocol to address integrity. Unlike much prior work aimed at addressing these issues, our approach does not rely on computation over finite field elements, which may interfere with floating-point computation support on hardware accelerators and require modification to existing models. We implemented a prototype of VeriSplit, and our evaluation results show that, compared to performing computation locally, our secure and private offloading solution can reduce inference latency by 28% to 83%.

CR · Dec 28, 2021
Analysis of Longitudinal Changes in Privacy Behavior of Android Applications

Alexander Yu, Yuvraj Agarwal, Jason I. Hong

Privacy concerns have long been expressed around smart devices, and the concerns around Android apps have been studied by many past works. Over the past 10 years, we have crawled and scraped data for almost 1.9 million apps, and also stored the APKs for 135,536 of them. In this paper, we examine the trends in how Android apps have changed over time with respect to privacy and look at them from two perspectives: (1) how privacy behavior in apps has changed as they are updated over time, and (2) how these changes can be accounted for when comparing third-party libraries and the app's own internals. To study this, we examine the adoption of HTTPS, whether apps scan the device for other installed apps, the use of permissions for privacy-sensitive data, and the use of unique identifiers. We find that privacy-related behavior has improved with time as apps continue to receive updates, and that the third-party libraries used by apps are responsible for more issues with privacy. However, we observe that in the current state of Android apps, there has not been enough of an improvement in terms of privacy, and many issues still need to be addressed.

CR · Apr 24, 2021
The Design of the User Interfaces for Privacy Enhancements for Android

Jason I. Hong, Yuvraj Agarwal, Matt Fredrikson et al.

We present the design and design rationale for the user interfaces for Privacy Enhancements for Android (PE for Android). These UIs are built around two core ideas, namely that developers should explicitly declare the purpose for which sensitive data is being used, and that these permission-purpose pairs should be split by first-party and third-party uses. We also present a taxonomy of purposes and ways in which these ideas can be deployed in the existing Android ecosystem.

HC · Dec 22, 2020
What Makes People Install a COVID-19 Contact-Tracing App? Understanding the Influence of App Design and Individual Difference on Contact-Tracing App Adoption Intention

Tianshi Li, Camille Cobb, Jackie Yang et al.

Smartphone-based contact-tracing apps are a promising solution to help scale up the conventional contact-tracing process. However, low adoption rates have become a major issue that prevents these apps from achieving their full potential. In this paper, we present a national-scale survey experiment (N = 1963) in the U.S. to investigate the effects of app design choices and individual differences on COVID-19 contact-tracing app adoption intentions. We found that individual differences such as prosocialness, COVID-19 risk perceptions, general privacy concerns, technology readiness, and demographic factors played a more important role than app design choices such as decentralized design vs. centralized design, location use, app providers, and the presentation of security risks. Certain app designs could exacerbate the different preferences in different sub-populations which may lead to an inequality of acceptance to certain app design choices (e.g., developed by state health authorities vs. a large tech company) among different groups of people (e.g., people living in rural areas vs. people living in urban areas). Our mediation analysis showed that one's perception of the public health benefits offered by the app and the adoption willingness of other people had a larger effect in explaining the observed effects of app design choices and individual differences than one's perception of the app's security and privacy risks. With these findings, we discuss practical implications on the design, marketing, and deployment of COVID-19 contact-tracing apps in the U.S.

HC · May 25, 2020
Decentralized is not risk-free: Understanding public perceptions of privacy-utility trade-offs in COVID-19 contact-tracing apps

Tianshi Li, Jackie Yang et al.

Contact-tracing apps have potential benefits in helping health authorities to act swiftly to halt the spread of COVID-19. However, their effectiveness is heavily dependent on their installation rate, which may be influenced by people's perceptions of the utility of these apps and any potential privacy risks due to the collection and release of sensitive user data (e.g., user identity and location). In this paper, we present a survey study that examined people's willingness to install six different contact-tracing apps after informing them of the risks and benefits of each design option (with a U.S.-only sample on Amazon Mechanical Turk, N = 208). The six app designs covered two major design dimensions (centralized vs. decentralized, basic contact tracing vs. also providing hotspot information), grounded in our analysis of existing contact-tracing app proposals. Contrary to assumptions of some prior work, we found that the majority of people in our sample preferred to install apps that use a centralized server for contact tracing, as they are more willing to allow a centralized authority to access the identity of app users than to allow tech-savvy users to infer the identity of diagnosed users. We also found that the majority of our sample preferred to install apps that share diagnosed users' recent locations in public places to show hotspots of infection. Our results suggest that apps using a centralized architecture with strong security protection to do basic contact tracing and providing users with other useful information such as hotspots of infection in public places may achieve a high adoption rate in the U.S.

CY · Feb 11, 2020
Ask the Experts: What Should Be on an IoT Privacy and Security Label?

Pardis Emami-Naeini, Yuvraj Agarwal, Lorrie Faith Cranor et al.

Information about the privacy and security of Internet of Things (IoT) devices is not readily available to consumers who want to consider it before making purchase decisions. While legislators have proposed adding succinct, consumer-accessible labels, they do not provide guidance on the content of these labels. In this paper, we report on the results of a series of interviews and surveys with privacy and security experts, as well as consumers, where we explore and test the design space of the content to include on an IoT privacy and security label. We conduct an expert elicitation study by following a three-round Delphi process with 22 privacy and security experts to identify the factors that experts believed are important for consumers when comparing the privacy and security of IoT devices to inform their purchase decisions. Based on how critical experts believed each factor is in conveying risk to consumers, we distributed these factors across two layers: a primary layer to display on the product package itself or prominently on a website, and a secondary layer available online through a web link or a QR code. We report on the experts' rationale and arguments used to support their choice of factors. Moreover, to study how consumers would perceive the privacy and security information specified by experts, we conducted a series of semi-structured interviews with 15 participants who had purchased at least one IoT device (smart home device or wearable). Based on the results of our expert elicitation and consumer studies, we propose a prototype privacy and security label to help consumers make more informed IoT-related purchase decisions.

HC · Dec 28, 2019
Real World Longitudinal iOS App Usage Study at Scale

Dohyun Kim, Joshua Gluck, Malcolm Hall et al.

Given the importance of understanding the interaction between mobile devices and their users, app usage patterns have been studied in various contexts. However, prior work has not fully investigated longitudinal changes to app usage behavior. In this paper, we present a longitudinal, large-scale study of mobile app usage based on a dataset collected from 162,006 iPhones and iPads over 4 years. We explore multiple dimensions of app usage patterns, providing useful insights on how app usage changes over time. Our key findings include: (i) app usage patterns change over time both at the individual app level and the app category level (i.e., the proportion of time a user spends using an app), (ii) users frequently launch only a small set of apps (90% of iPhone users launch roughly 14-18 apps weekly), and (iii) a small number of apps remain popular, while some specific kinds of apps (e.g., Games) have a shorter life cycle than apps of other categories. Finally, we discuss our findings and their implications; for example, a short-term study that attempts to understand the general needs of mobile device users may not achieve results that remain useful in the long term.

SY · Sep 4, 2019
ACES -- Automatic Configuration of Energy Harvesting Sensors with Reinforcement Learning

Francesco Fraternali, Bharathan Balaji, Yuvraj Agarwal et al.

The Internet of Things forms the backbone of modern building applications. Wireless sensors are being increasingly adopted for their flexibility and reduced cost of deployment. However, most wireless sensors are powered by batteries today, and large deployments are inhibited by manual battery replacement. Energy harvesting sensors provide an attractive alternative, but they need to provide adequate quality of service to applications given uncertain energy availability. We propose using reinforcement learning to optimize the operation of energy harvesting sensors to maximize sensing quality with available energy. We present our system ACES, which uses reinforcement learning for periodic and event-driven sensing indoors with ambient light energy harvesting. Our custom-built board uses a supercapacitor to store energy temporarily, senses light and motion events, and relays them using Bluetooth Low Energy. Using simulations and real deployments, we show that our sensor nodes adapt to their lighting conditions and continuously send measurements and events across nights and weekends. We use deployment data to continually adapt sensing to changing environmental patterns, and transfer learning to reduce the training time in real deployments. In our 60-node deployment lasting two weeks, we observe a dead time of 0.1%. The periodic sensors that measure luminosity have a mean sampling period of 90 seconds, and the event sensors that detect motion with PIR captured 86% of the events on average compared to a battery-powered node.

IR · Aug 6, 2018
Automated Extraction of Personal Knowledge from Smartphone Push Notifications

Yuanchun Li, Ziyue Yang, Yao Guo et al.

Personalized services need a rich and powerful personal knowledge base, i.e., a knowledge base containing information about the user. This paper proposes an approach to extracting personal knowledge from smartphone push notifications, which are used by mobile systems and apps to inform users of a rich range of information. Our solution is based on the insight that most notifications are formatted using templates, while knowledge entities can usually be found within the parameters to the templates. As defining all the notification templates and their semantic rules is impractical due to the huge number of notification templates used by potentially millions of apps, we propose an automated approach for personal knowledge extraction from push notifications. We first discover notification templates through pattern mining, then use machine learning to understand the template semantics. Based on the templates and their semantics, we are able to translate notification text into knowledge facts automatically. Users' privacy is preserved as we only need to upload the templates, which do not contain any personal information, to the server for model training. According to our experiments with about 120 million push notifications from 100,000 smartphone users, our system is able to extract personal knowledge accurately and efficiently.

ML · Sep 7, 2017
Transfer Learning for Performance Modeling of Configurable Systems: An Exploratory Analysis

Pooyan Jamshidi, Norbert Siegmund, Miguel Velez et al.

Modern software systems provide many configuration options which significantly influence their non-functional properties. To understand and predict the effect of configuration options, several sampling and learning strategies have been proposed, albeit often with significant cost to cover the high-dimensional configuration space. Recently, transfer learning has been applied to reduce the effort of constructing performance models by transferring knowledge about performance behavior across environments. While this line of research promises more accurate models at a lower cost, it is unclear why and when transfer learning works for performance modeling. To shed light on when it is beneficial to apply transfer learning, we conducted an empirical study on four popular software systems, varying software configurations and environmental conditions, such as hardware, workload, and software versions, to identify the key knowledge pieces that can be exploited for transfer learning. Our results show that for small environmental changes (e.g., homogeneous workload change), by applying a linear transformation to the performance model, we can understand the performance behavior of the target environment, while for severe environmental changes (e.g., drastic workload change) we can transfer only knowledge that makes sampling more efficient, e.g., by reducing the dimensionality of the configuration space.

CR · Aug 21, 2017
PrivacyProxy: Leveraging Crowdsourcing and In Situ Traffic Analysis to Detect and Mitigate Information Leakage

Gaurav Srivastava, Kunal Bhuwalka, Swarup Kumar Sahoo et al.

Many smartphone apps transmit personally identifiable information (PII), often without the users' knowledge. To address this issue, we present PrivacyProxy, a system that monitors outbound network traffic and generates app-specific signatures to represent sensitive data being shared. PrivacyProxy uses a crowd-based approach to detect likely PII in an adaptive and scalable manner by anonymously combining signatures from different users of the same app. Furthermore, we do not observe users' network traffic and instead rely on hashed signatures. We present the design and implementation of PrivacyProxy and evaluate it with a lab study, a field deployment, a user survey, and a comparison against prior work. Our field study shows PrivacyProxy can automatically detect PII with an F1 score of 0.885. PrivacyProxy also achieves an F1 score of 0.759 in our controlled experiment for the 500 most popular apps. The F1 score also improves to 0.866 with additional training data for 40 apps that initially had the most false positives. We also show that the performance overhead of using PrivacyProxy is between 8.6% and 14.2%, slightly more than using a standard unmodified VPN, and most users report no perceptible impact on battery life or the network.

CY · Dec 19, 2016
Managing Commercial HVAC Systems: What do Building Operators Really Need?

Bharathan Balaji, Nadir Weibel, Yuvraj Agarwal

Buildings form an essential part of modern life; people spend a significant amount of their time in them, and they consume large amounts of energy. A variety of systems provide services such as lighting, air conditioning, and security, which building operators manage using Building Management Systems (BMS). To better understand the capability of current BMS and characterize common practices of building operators, we investigated their use across five institutions in the US. We interviewed ten operators and discovered that BMS do not address a number of key concerns for the management of buildings. Our analysis is rooted in the everyday work of building operators and highlights a number of design suggestions to help improve the user experience and management of BMS, ultimately leading to improvements in productivity, as well as building comfort and energy efficiency.

HC · Jan 26, 2016
Genie: A Longitudinal Study Comparing Physical and Software-augmented Thermostats in Office Buildings

Bharathan Balaji, Jason Koh, Nadir Weibel et al.

Thermostats are primary interfaces for occupants of office buildings to express their comfort preferences. However, standard thermostats are often ineffective due to inaccessibility, lack of information, or limited responsiveness, leading to occupant discomfort. Software thermostats based on web or smartphone applications provide alternative interfaces to occupants with minimal deployment cost. However, their usage and effectiveness have not been studied extensively in real settings. In this paper we present Genie, a novel software-augmented thermostat that we deployed and studied at our university over a period of 21 months. Our data shows that providing wider thermal control to users does not lead to system abuse and that the effect on energy consumption is minimal while improving comfort and energy awareness. We believe that increased introduction of software thermostats in office buildings will have important effects on comfort and energy consumption and we provide key design recommendations for their implementation and deployment.