LGJul 20, 2023
Beyond Black-Box Advice: Learning-Augmented Algorithms for MDPs with Q-Value PredictionsTongxin Li, Yiheng Lin, Shaolei Ren et al.
We study the tradeoff between consistency and robustness in the context of a single-trajectory time-varying Markov Decision Process (MDP) with untrusted machine-learned advice. Our work departs from the typical approach of treating advice as coming from black-box sources by instead considering a setting where additional information about how the advice is generated is available. We prove a first-of-its-kind consistency and robustness tradeoff given Q-value advice under a general MDP model that includes both continuous and discrete state/action spaces. Our results highlight that utilizing Q-value advice enables dynamic pursuit of the better of machine-learned advice and a robust baseline, thus result in near-optimal performance guarantees, which provably improves what can be obtained solely with black-box advice.
LGJun 2, 2022
Equipping Black-Box Policies with Model-Based Advice for Stable Nonlinear ControlTongxin Li, Ruixiao Yang, Guannan Qu et al.
Machine-learned black-box policies are ubiquitous for nonlinear control problems. Meanwhile, crude model information is often available for these problems from, e.g., linear approximations of nonlinear dynamics. We study the problem of equipping a black-box control policy with model-based advice for nonlinear control on a single trajectory. We first show a general negative result that a naive convex combination of a black-box policy and a linear model-based policy can lead to instability, even if the two policies are both stabilizing. We then propose an adaptive $λ$-confident policy, with a coefficient $λ$ indicating the confidence in a black-box policy, and prove its stability. With bounded nonlinearity, in addition, we show that the adaptive $λ$-confident policy achieves a bounded competitive ratio when a black-box policy is near-optimal. Finally, we propose an online learning approach to implement the adaptive $λ$-confident policy and verify its efficacy in case studies about the CartPole problem and a real-world electric vehicle (EV) charging problem with data bias due to COVID-19.
SYApr 7Code
Bridging Natural Language and Microgrid Dynamics: A Context-Aware Simulator and DatasetTinko Sebastian Bartels, Ruixiang Wu, Xinyu Lu et al.
Addressing the critical need for intelligent, context-aware energy management in renewable systems, we introduce the \textbf{OpenCEM Simulator and Dataset}: the first open-source digital twin explicitly designed to integrate rich, unstructured contextual information with quantitative renewable energy dynamics. Traditional energy management relies heavily on numerical time series, thereby neglecting the significant predictive power embedded in human-generated context (e.g., event schedules, system logs, user intentions). OpenCEM bridges this gap by offering a unique platform comprising both a meticulously aligned, language-rich dataset from a real-world PV-and-battery microgrid installation and a modular simulator capable of natively processing this multi-modal context. The OpenCEM Simulator provides a high-fidelity environment for developing and validating novel control algorithms and prediction models, particularly those leveraging Large Language Models. We detail its component-based architecture, hybrid data-driven and physics-based modelling capabilities, and demonstrate its utility through practical examples, including context-aware load forecasting and the implementation of online optimal battery charging control strategies. By making this platform publicly available, OpenCEM aims to accelerate research into the next generation of intelligent, sustainable, and truly context-aware energy systems.
LGNov 2, 2023
Anytime-Competitive Reinforcement Learning with Policy PriorJianyi Yang, Pengfei Li, Tongxin Li et al.
This paper studies the problem of Anytime-Competitive Markov Decision Process (A-CMDP). Existing works on Constrained Markov Decision Processes (CMDPs) aim to optimize the expected reward while constraining the expected cost over random dynamics, but the cost in a specific episode can still be unsatisfactorily high. In contrast, the goal of A-CMDP is to optimize the expected reward while guaranteeing a bounded cost in each round of any episode against a policy prior. We propose a new algorithm, called Anytime-Competitive Reinforcement Learning (ACRL), which provably guarantees the anytime cost constraints. The regret analysis shows the policy asymptotically matches the optimal reward achievable under the anytime competitive constraints. Experiments on the application of carbon-intelligent computing verify the reward performance and cost constraint guarantee of ACRL.
SYNov 10, 2023
Out-of-Distribution-Aware Electric Vehicle ChargingTongxin Li, Chenxi Sun
We tackle the challenge of learning to charge Electric Vehicles (EVs) with Out-of-Distribution (OOD) data. Traditional scheduling algorithms typically fail to balance near-optimal average performance with worst-case guarantees, particularly with OOD data. Model Predictive Control (MPC) is often too conservative and data-independent, whereas Reinforcement Learning (RL) tends to be overly aggressive and fully trusts the data, hindering their ability to consistently achieve the best-of-both-worlds. To bridge this gap, we introduce a novel OOD-aware scheduling algorithm, denoted OOD-Charging. This algorithm employs a dynamic "awareness radius", which updates in real-time based on the Temporal Difference (TD)-error that reflects the severity of OOD. The OOD-Charging algorithm allows for a more effective balance between consistency and robustness in EV charging schedules, thereby significantly enhancing adaptability and efficiency in real-world charging environments. Our results demonstrate that this approach improves the scheduling reward reliably under real OOD scenarios with remarkable shifts of EV charging behaviors caused by COVID-19 in the Caltech ACN-Data.
LGJul 2, 2024
Beyond Numeric Rewards: In-Context Dueling Bandits with LLM AgentsFanzeng Xia, Hao Liu, Yisong Yue et al.
In-Context Reinforcement Learning (ICRL) is a frontier paradigm to solve Reinforcement Learning (RL) problems in the foundation model era. While ICRL capabilities have been demonstrated in transformers through task-specific training, the potential of Large Language Models (LLMs) out-of-the-box remains largely unexplored. This paper investigates whether LLMs can generalize cross-domain to perform ICRL under the problem of Dueling Bandits (DB), a stateless preference-based RL setting. We find that the top-performing LLMs exhibit a notable zero-shot capacity for relative decision-making, which translates to low short-term weak regret across all DB environment instances by quickly including the best arm in duels. However, an optimality gap still exists between LLMs and classic DB algorithms in terms of strong regret. LLMs struggle to converge and consistently exploit even when explicitly prompted to do so, and are sensitive to prompt variations. To bridge this gap, we propose an agentic flow framework: LLM with Enhanced Algorithmic Dueling (LEAD), which integrates off-the-shelf DB algorithm support with LLM agents through fine-grained adaptive interplay. We show that LEAD has theoretical guarantees inherited from classic DB algorithms on both weak and strong regret. We validate its efficacy and robustness even with noisy and adversarial prompts. The design of such an agentic framework sheds light on how to enhance the trustworthiness of general-purpose LLMs generalized to in-context decision-making tasks.
LGAug 20, 2025Code
DualNILM: Energy Injection Identification Enabled Disaggregation with Deep Multi-Task LearningXudong Wang, Guoming Tang, Junyu Xue et al.
Non-Intrusive Load Monitoring (NILM) offers a cost-effective method to obtain fine-grained appliance-level energy consumption in smart homes and building applications. However, the increasing adoption of behind-the-meter (BTM) energy sources such as solar panels and battery storage poses new challenges for conventional NILM methods that rely solely on at-the-meter data. The energy injected from the BTM sources can obscure the power signatures of individual appliances, leading to a significant decrease in NILM performance. To address this challenge, we present DualNILM, a deep multi-task learning framework designed for the dual tasks of appliance state recognition and injected energy identification. Using a Transformer-based architecture that integrates sequence-to-point and sequence-to-sequence strategies, DualNILM effectively captures multiscale temporal dependencies in the aggregate power consumption patterns, allowing for accurate appliance state recognition and energy injection identification. Extensive evaluation on self-collected and synthesized datasets demonstrates that DualNILM maintains an excellent performance for dual tasks in NILM, much outperforming conventional methods. Our work underscores the framework's potential for robust energy disaggregation in modern energy systems with renewable penetration. Synthetic photovoltaic augmented datasets with realistic injection simulation methodology will be open-sourced after review.
LGJun 4, 2024Code
Building Socially-Equitable Public ModelsYejia Liu, Jianyi Yang, Pengfei Li et al.
Public models offer predictions to a variety of downstream tasks and have played a crucial role in various AI applications, showcasing their proficiency in accurate predictions. However, the exclusive emphasis on prediction accuracy may not align with the diverse end objectives of downstream agents. Recognizing the public model's predictions as a service, we advocate for integrating the objectives of downstream agents into the optimization process. Concretely, to address performance disparities and foster fairness among heterogeneous agents in training, we propose a novel Equitable Objective. This objective, coupled with a policy gradient algorithm, is crafted to train the public model to produce a more equitable/uniform performance distribution across downstream agents, each with their unique concerns. Both theoretical analysis and empirical case studies have proven the effectiveness of our method in advancing performance equity across diverse downstream agents utilizing the public model for their decision-making. Codes and datasets are released at https://github.com/Ren-Research/Socially-Equitable-Public-Models.
SEMay 9, 2020Code
Building and Maintaining a Third-Party Library Supply Chain for Productive and Secure SGX Enclave DevelopmentPei Wang, Yu Ding, Mingshen Sun et al.
The big data industry is facing new challenges as concerns about privacy leakage soar. One of the remedies to privacy breach incidents is to encapsulate computations over sensitive data within hardware-assisted Trusted Execution Environments (TEE). Such TEE-powered software is called secure enclaves. Secure enclaves hold various advantages against competing for privacy-preserving computation solutions. However, enclaves are much more challenging to build compared with ordinary software. The reason is that the development of TEE software must follow a restrictive programming model to make effective use of strong memory encryption and segregation enforced by hardware. These constraints transitively apply to all third-party dependencies of the software. If these dependencies do not officially support TEE hardware, TEE developers have to spend additional engineering effort in porting them. High development and maintenance cost is one of the major obstacles against adopting TEE-based privacy protection solutions in production. In this paper, we present our experience and achievements with regard to constructing and continuously maintaining a third-party library supply chain for TEE developers. In particular, we port a large collection of Rust third-party libraries into Intel SGX, one of the most mature trusted computing platforms. Our supply chain accepts upstream patches in a timely manner with SGX-specific security auditing. We have been able to maintain the SGX ports of 159 open-source Rust libraries with reasonable operational costs. Our work can effectively reduce the engineering cost of developing SGX enclaves for privacy-preserving data processing and exchange.
SYApr 8
Context-Aware Model Predictive Control for Microgrid Energy Management via LLMsRuixiang Wu, Jiahao Ai, Tinko Sebastian Bartels et al.
The optimal operation of modern microgrids, particularly those integrating stochastic renewable generation and battery energy storage system (BESS), relies heavily on load and disturbances forecasting to minimize operational costs. However, in environments with uncertainties in both generation and consumption, traditional numerical forecasting methods often fail to capture generation shifts and event-driven load surges. While contextual information regarding event schedules, system logs, and computational task records is easily obtainable, classic control paradigms lack a formal interface to integrate the unstructured, semantic data into the physical operation loop. This paper addresses this gap by introducing the InstructMPC framework, which utilizes a Large Language Model (LLM) paired with a tunable last layer mapping to translate unstructured operational context into predictive disturbance trajectories for the MPC controller. Unlike conventional forecasting methods, the proposed approach treats the last layer mapping as a tunable component, refined online based on the realized control cost. We establish a theoretical foundation for this closed-loop tuning strategy, proving a regret bound of $O(\sqrt{T \log T})$ for linear systems under a tailored task-aware loss function, together with robustness guarantees against uninformative or noisy textual inputs. The control strategy is experimentally validated on OpenCEM, a real-world microgrid with highly fluctuating generation and consumption. Experimental results demonstrate that the LLM-driven MPC significantly reduces cumulative grid electricity costs compared to classical context-agnostic baselines, validating the efficacy of integrating semantic information directly into physical control loops.
SYApr 21
Safety-Critical Contextual Control via Online Riemannian Optimization with World ModelsTongxin Li
Modern world models are becoming too complex to admit explicit dynamical descriptions. We study safety-critical contextual control, where a Planner must optimize a task objective using only feasibility samples from a black-box Simulator, conditioned on a context signal $ξ_t$. We develop a sample-based Penalized Predictive Control (PPC) framework grounded in online Riemannian optimization, in which the Simulator compresses the feasibility manifold into a score-based density $\hat{p}(u \mid ξ_t)$ that endows the action space with a Riemannian geometry guiding the Planner's gradient descent. The barrier curvature $κ(ξ_t)$, the minimum curvature of the conditional log-density $-\ln\hat{p}(\cdot\midξ_t)$, governs both convergence rate and safety margin, replacing the Lipschitz constant of the unknown dynamics. Our main result is a contextual safety bound showing that the distance from the true feasibility manifold is controlled by the score estimation error and a ratio that depends on $κ(ξ_t)$, both of which improve with richer context. Simulations on a dynamic navigation task confirm that contextual PPC substantially outperforms marginal and frozen density models, with the advantage growing after environment shifts.
HCApr 7
Understanding Educators' Perceptions of AI-generated Non-consensual Intimate ImageryTongxin Li, Katelyn M Reyes, Liezeil Jimenez et al.
AI-generated non-consensual intimate imagery (AIG-NCII) is an emerging social problem due to the advancement of AI tools. While recent incidents in middle and high schools have highlighted the urgency of this issue, there is limited understanding of what concrete supports schools need to effectively address AIG-NCII. To fill this gap, we conducted an interview study with 20 educators in the U.S. and investigated their attitudes, experiences, and practices related to AIG-NCII. Educators expressed concerns about both students' and their own vulnerability, as AIG-NCII may cause moral decline among students, while educators themselves could become victims. Nevertheless, existing practices in schools are limited, and they lack both training and systematic policies. Challenges such as a lack of resources, unclear legal boundaries, and limited knowledge of AI make implementation difficult. The findings of this paper contribute to interactive educational tool design, curriculum design, and policy-making, especially regarding the need for multi-stakeholder strategies to address issues surrounding AIG-NCII.
LGAug 4, 2025
Adaptive Riemannian Graph Neural NetworksXudong Wang, Tongxin Li, Chris Ding et al.
Graph data often exhibits complex geometric heterogeneity, where structures with varying local curvature, such as tree-like hierarchies and dense communities, coexist within a single network. Existing geometric GNNs, which embed graphs into single fixed-curvature manifolds or discrete product spaces, struggle to capture this diversity. We introduce Adaptive Riemannian Graph Neural Networks (ARGNN), a novel framework that learns a continuous and anisotropic Riemannian metric tensor field over the graph. It allows each node to determine its optimal local geometry, enabling the model to fluidly adapt to the graph's structural landscape. Our core innovation is an efficient parameterization of the node-wise metric tensor, specializing to a learnable diagonal form that captures directional geometric information while maintaining computational tractability. To ensure geometric regularity and stable training, we integrate a Ricci flow-inspired regularization that smooths the learned manifold. Theoretically, we establish the rigorous geometric evolution convergence guarantee for ARGNN and provide a continuous generalization that unifies prior fixed or mixed-curvature GNNs. Empirically, our method demonstrates superior performance on both homophilic and heterophilic benchmark datasets with the ability to capture diverse structures adaptively. Moreover, the learned geometries both offer interpretable insights into the underlying graph structure and empirically corroborate our theoretical analysis.
LGOct 21, 2025
Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen ApproachChenbei Lu, Zaiwei Chen, Tongxin Li et al.
Traditional reinforcement learning (RL) assumes the agents make decisions based on Markov decision processes (MDPs) with one-step transition models. In many real-world applications, such as energy management and stock investment, agents can access multi-step predictions of future states, which provide additional advantages for decision making. However, multi-step predictions are inherently high-dimensional: naively embedding these predictions into an MDP leads to an exponential blow-up in state space and the curse of dimensionality. Moreover, existing RL theory provides few tools to analyze prediction-augmented MDPs, as it typically works on one-step transition kernels and cannot accommodate multi-step predictions with errors or partial action-coverage. We address these challenges with three key innovations: First, we propose the \emph{Bayesian value function} to characterize the optimal prediction-aware policy tractably. Second, we develop a novel \emph{Bellman-Jensen Gap} analysis on the Bayesian value function, which enables characterizing the value of imperfect predictions. Third, we introduce BOLA (Bayesian Offline Learning with Online Adaptation), a two-stage model-based RL algorithm that separates offline Bayesian value learning from lightweight online adaptation to real-time predictions. We prove that BOLA remains sample-efficient even under imperfect predictions. We validate our theory and algorithm on synthetic MDPs and a real-world wind energy storage control problem.
DSOct 16, 2025
Prediction-Specific Design of Learning-Augmented AlgorithmsSizhe Li, Nicolas Christianson, Tongxin Li
Algorithms with predictions} has emerged as a powerful framework to combine the robustness of traditional online algorithms with the data-driven performance benefits of machine-learned (ML) predictions. However, most existing approaches in this paradigm are overly conservative, {as they do not leverage problem structure to optimize performance in a prediction-specific manner}. In this paper, we show that such prediction-specific performance criteria can enable significant performance improvements over the coarser notions of consistency and robustness considered in prior work. Specifically, we propose a notion of \emph{strongly-optimal} algorithms with predictions, which obtain Pareto optimality not just in the worst-case tradeoff between robustness and consistency, but also in the prediction-specific tradeoff between these metrics. We develop a general bi-level optimization framework that enables systematically designing strongly-optimal algorithms in a wide variety of problem settings, and we propose explicit strongly-optimal algorithms for several classic online problems: deterministic and randomized ski rental, and one-max search. Our analysis reveals new structural insights into how predictions can be optimally integrated into online algorithms by leveraging a prediction-specific design. To validate the benefits of our proposed framework, we empirically evaluate our algorithms in case studies on problems including dynamic power management and volatility-based index trading. Our results demonstrate that prediction-specific, strongly-optimal algorithms can significantly improve performance across a variety of online decision-making settings.
OCJun 13, 2025
Quantum Learning and Estimation for Distribution Networks and Energy Communities CoordinationYingrui Zhuang, Lin Cheng, Yuji Cao et al.
Price signals from distribution networks (DNs) guide energy communities (ECs) to adjust energy usage, enabling effective coordination for reliable power system operation. However, this coordination faces significant challenges due to the limited availability of information (i.e., only the aggregated energy usage of ECs is available to DNs), and the high computational burden of accounting for uncertainties and the associated risks through numerous scenarios. To address these challenges, we propose a quantum learning and estimation approach to enhance coordination between DNs and ECs. Specifically, leveraging advanced quantum properties such as quantum superposition and entanglement, we develop a hybrid quantum temporal convolutional network-long short-term memory (Q-TCN-LSTM) model to establish an end-to-end mapping between ECs' responses and the price incentives from DNs. Moreover, we develop a quantum estimation method based on quantum amplitude estimation (QAE) and two phase-rotation circuits to significantly accelerate the optimization process under numerous uncertainty scenarios. Numerical experiments demonstrate that, compared to classical neural networks, the proposed Q-TCN-LSTM model improves the mapping accuracy by 69.2% while reducing the model size by 99.75% and the computation time by 93.9%. Compared to classical Monte Carlo simulation, QAE achieves comparable accuracy with a dramatic reduction in computational time (up to 99.99%) and requires significantly fewer computational resources.
AIMay 28, 2025
Rethinking the Unsolvable: When In-Context Search Meets Test-Time ScalingFanzeng Xia, Yidong Luo, Tinko Sebastian Bartels et al.
Recent research has highlighted that Large Language Models (LLMs), even when trained to generate extended long reasoning steps, still face significant challenges on hard reasoning problems. However, much of the existing literature relies on direct prompting with simple in-context learning examples for evaluation, which largely overlooks advanced techniques to elicit LLMs' deliberate reasoning before drawing conclusions that LLMs hit a performance ceiling. In this paper, we systematically explore the combined potential of in-context search and test-time scaling on super hard reasoning tasks. We find that by employing advanced in-context search prompting to LLMs augmented with internal scaling, one can achieve transformative performance breakthroughs on tasks previously deemed "unsolvable" (e.g., reported success rates below 5%). We provide both empirical results and theoretical analysis of how this combination can unleash LLM reasoning capabilities: i) Empirically, on controlled NP-hard tasks and complex real-world planning benchmarks, our approach achieves up to a 30x improvement in success rates compared to previously reported results without any external mechanisms; ii) Theoretically, we show that in-context search prompting, when combined with internal scaling, significantly extends the complexity class of solvable reasoning problems. These findings challenge prevailing assumptions about the limitations of LLMs on complex tasks, indicating that current evaluation paradigms systematically underestimate their true potential. Our work calls for a critical reassessment of how LLM reasoning is benchmarked and a more robust evaluation strategy that fully captures the true capabilities of contemporary LLMs, which can lead to a better understanding of their operational reasoning boundaries in real-world deployments.
LGNov 12, 2024
Safe Exploitative Play with Untrusted Type BeliefsTongxin Li, Tinashe Handina, Shaolei Ren et al.
The combination of the Bayesian game and learning has a rich history, with the idea of controlling a single agent in a system composed of multiple agents with unknown behaviors given a set of types, each specifying a possible behavior for the other agents. The idea is to plan an agent's own actions with respect to those types which it believes are most likely to maximize the payoff. However, the type beliefs are often learned from past actions and likely to be incorrect. With this perspective in mind, we consider an agent in a game with type predictions of other components, and investigate the impact of incorrect beliefs to the agent's payoff. In particular, we formally define a tradeoff between risk and opportunity by comparing the payoff obtained against the optimal payoff, which is represented by a gap caused by trusting or distrusting the learned beliefs. Our main results characterize the tradeoff by establishing upper and lower bounds on the Pareto front for both normal-form and stochastic Bayesian games, with numerical results provided.
DCMay 27, 2021
TENSILE: A Tensor granularity dynamic GPU memory scheduling method toward multiple dynamic workloads systemKaixin Zhang, Hongzhi Wang, Han Hu et al.
Recently, deep learning has been an area of intense research. However, as a kind of computing-intensive task, deep learning highly relies on the scale of GPU memory, which is usually prohibitive and scarce. Although some extensive works have been proposed for dynamic GPU memory management, they are hard to apply to systems with multiple dynamic workloads, such as in-database machine learning systems. In this paper, we demonstrated TENSILE, a method of managing GPU memory in tensor granularity to reduce the GPU memory peak, considering the multiple dynamic workloads. TENSILE tackled the cold-starting and across-iteration scheduling problem existing in previous works. We implemented TENSILE on a deep learning framework built by ourselves and evaluated its performance. The experiment results show that TENSILE can save more GPU memory with less extra overhead than prior works in single and multiple dynamic workloads scenarios.
LGJan 17, 2021
Disentangling Observed Causal Effects from Latent Confounders using Method of MomentsAnqi Liu, Hao Liu, Tongxin Li et al.
Discovering the complete set of causal relations among a group of variables is a challenging unsupervised learning problem. Often, this challenge is compounded by the fact that there are latent or hidden confounders. When only observational data is available, the problem is ill-posed, i.e. the causal relationships are non-identifiable unless strong modeling assumptions are made. When interventions are available, we provide guarantees on identifiability and learnability under mild assumptions. We assume a linear structural equation model (SEM) with independent latent factors and directed acyclic graph (DAG) relationships among the observables. Since the latent variable inference is based on independent component analysis (ICA), we call this model SEM-ICA. We use the method of moments principle to establish model identifiability. We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions. Thus, we provide a principled approach to tackling the joint problem of causal discovery and latent variable inference.
CRMay 12, 2020
Towards Memory Safe Python Enclave for Security Sensitive ComputationHuibo Wang, Mingshen Sun, Qian Feng et al.
Intel SGX Guard eXtensions (SGX), a hardware-supported trusted execution environment (TEE), is designed to protect security-sensitive applications. However, since enclave applications are developed with memory unsafe languages such as C/C++, traditional memory corruption is not eliminated in SGX. Rust-SGX is the first toolkit providing enclave developers with a memory-language. However, Rust is considered a Systems language and has become the right choice for concurrent applications and web browsers. Many application domains such as Big Data, Machine Learning, Robotics, Computer Vision are more commonly developed in the python programming language. Therefore, Python application developers cannot benefit from secure enclaves like Intel SGX and rust-SGX. To fill this gap, we propose Python-SGX, which is a memory-safe SGX SDK providing enclave developers a memory-safe Python development environment. The key idea is to enable memory-safe Python language in SGX by solving the following key challenges: (1) defining a memory-safe Python interpreter (2)replacing unsafe elements of Python interpreter with safe ones,(3) achieving comparable performance to non-enclave Python applications, and (4) not introducing any unsafe new code or libraries into SGX. We propose to build Python-SGX with PyPy, a Python interpreter written by RPython, which is a subset of Python, and tame unsafe parts in PyPy by formal verification, security hardening, and memory safe language. We have implemented python-SGX and tested it with a series of benchmarks programs. Our evaluation results show that Python-SGX does not cause significant overhead.
CRMay 26, 2015
Unauthorized Cross-App Resource Access on MAC OS X and iOSLuyi Xing, Xiaolong Bai, Tongxin Li et al.
On modern operating systems, applications under the same user are separated from each other, for the purpose of protecting them against malware and compromised programs. Given the complexity of today's OSes, less clear is whether such isolation is effective against different kind of cross-app resource access attacks (called XARA in our research). To better understand the problem, on the less-studied Apple platforms, we conducted a systematic security analysis on MAC OS~X and iOS. Our research leads to the discovery of a series of high-impact security weaknesses, which enable a sandboxed malicious app, approved by the Apple Stores, to gain unauthorized access to other apps' sensitive data. More specifically, we found that the inter-app interaction services, including the keychain, WebSocket and NSConnection on OS~X and URL Scheme on the MAC OS and iOS, can all be exploited by the malware to steal such confidential information as the passwords for iCloud, email and bank, and the secret token of Evernote. Further, the design of the app sandbox on OS~X was found to be vulnerable, exposing an app's private directory to the sandboxed malware that hijacks its Apple Bundle ID. As a result, sensitive user data, like the notes and user contacts under Evernote and photos under WeChat, have all been disclosed. Fundamentally, these problems are caused by the lack of app-to-app and app-to-OS authentications. To better understand their impacts, we developed a scanner that automatically analyzes the binaries of MAC OS and iOS apps to determine whether proper protection is missing in their code. Running it on hundreds of binaries, we confirmed the pervasiveness of the weaknesses among high-impact Apple apps. Since the issues may not be easily fixed, we built a simple program that detects exploit attempts on OS~X, helping protect vulnerable apps before the problems can be fully addressed.