DCMay 24Code
Efficient Distributed MLLM Training with CornstarchInsu Jang, Runyu Lu, Nikhil Bansal et al.
Multimodal large language models (MLLMs) extend the capabilities of large language models (LLMs) by combining heterogeneous model architectures to handle diverse modalities like images and audio. However, this inherent heterogeneity in MLLM model structure and data types makes makeshift extensions to existing LLM training frameworks unsuitable for efficient MLLM training. While there are a few works that have attempted to address the heterogeneity in MLLM training, their approaches are limited to only superficially considering the characteristics of MLLMs. In this paper, we present Cornstarch, an efficient distributed MLLM training framework that contemplates MLLM's unique characteristics in both model and data parallelization. Cornstarch introduces frozen-aware pipeline parallelism and token workload-balanced context parallelism to improve MLLM training throughput. Our extensive evaluation shows that Cornstarch outperforms state-of-the-art solutions by $2.26\times$ on average in terms of MLLM training throughput. Cornstarch is an open-source project available at https://github.com/cornstarch-org/Cornstarch.
LGOct 24, 2022Code
Symbolic Distillation for Learned TCP Congestion ControlS P Sharan, Wenqing Zheng, Kuo-Feng Hsu et al.
Recent advances in TCP congestion control (CC) have achieved tremendous success with deep reinforcement learning (RL) approaches, which use feedforward neural networks (NN) to learn complex environment conditions and make better decisions. However, such "black-box" policies lack interpretability and reliability, and often, they need to operate outside the traditional TCP datapath due to the use of complex NNs. This paper proposes a novel two-stage solution to achieve the best of both worlds: first to train a deep RL agent, then distill its (over-)parameterized NN policy into white-box, light-weight rules in the form of symbolic expressions that are much easier to understand and to implement in constrained environments. At the core of our proposal is a novel symbolic branching algorithm that enables the rule to be aware of the context in terms of various network conditions, eventually converting the NN policy into a symbolic tree. The distilled symbolic rules preserve and often improve performance over state-of-the-art NN policies while being faster and simpler than a standard neural network. We validate the performance of our distilled symbolic rules on both simulation and emulation environments. Our code is available at https://github.com/VITA-Group/SymbolicPCC.
SYMay 6
Experiment-as-Code Labs: A Declarative Stack for AI-Driven Scientific DiscoveryZhenning Yang, Yuhan Chen, Patrick Tser Jern Kon et al.
To unleash the full potential of AI for Science, we must untether the agents from a purely digital environment. The agent's ability to control and explore in real-world labs is essential because the physical lab remains foundational to scientific discovery. While some tasks can be performed on a computer (e.g., data analysis, running simulated experiments), Eureka moments could occur at any time while operating lab instruments (e.g., when a scientist notices unexpected clues, intuition may prompt a real-time course change). Although autonomous labs are on the rise, which expose programmable APIs to control scientific instruments via software, bridging the gap between increasingly powerful AI agents and automated lab equipment requires innovation that draws insights from computer systems. We propose a new paradigm called ``Experiment-as-Code (EaC) Labs,'' where a core concept is to encode experiments as declarative configurations that can be compiled down to device-level APIs. AI agents come up with hypotheses and experiments, written as an ensemble of declarative configurations. The systems layer performs program analysis, safety checks, resource assignment, and job orchestration. Finally, programmatic experimentation occurs via actuating the device APIs. This is a general stack that is science-, lab-, and instrument-independent, representing a novel synthesis across the physical, systems, and intelligence layers to unleash the next breakthrough in AI for Science.
SYMay 5
Grid Integration of AI Data Centers: A Critical Review of Energy Storage SolutionsSina Mohammadi, Wayne Wang, Marcus Chen I Wada et al.
Artificial intelligence (AI) is driving unprecedented growth in data center (DC) scale and power demand. AI workloads impose highly dynamic, difficult-to-forecast power profiles on the utility grid, creating reliability and stability challenges that conventional DC architectures are not designed to address. This paper provides a critical review of energy storage systems (ESSs) as the key enabling technology for reliable grid integration of AI DCs. We organize the review around a four-layer hierarchical taxonomy, namely chip-level buffering, rack/server-level ESSs, facility-level uninterruptible power supply (UPS) systems, and grid-scale battery energy storage systems (BESSs), supplemented by non-battery technologies including fuel cells (FCs) and thermal energy storage (TES). Each layer is analyzed with respect to response timescale, power and energy ratings, operational role, integration challenges, and coordination requirements. Key findings include: (i) AI DC load profiles differ fundamentally from traditional loads in their sub-second variability, making conventional ESS dispatch strategies insufficient; (ii) hierarchical, coordinated ESS deployment across all layers is necessary for effective load smoothing and grid support; and (iii) significant gaps remain in simulation tools, degradation modeling, load forecasting, and optimal multi-layer sizing. This review identifies open research challenges and future directions at the intersection of AI computing infrastructure and power system integration.
AIFeb 22, 2025Code
Curie: Toward Rigorous and Automated Scientific Experimentation with AI AgentsPatrick Tser Jern Kon, Jiachen Liu, Qiuyi Ding et al.
Scientific experimentation, a cornerstone of human progress, demands rigor in reliability, methodical control, and interpretability to yield meaningful results. Despite the growing capabilities of large language models (LLMs) in automating different aspects of the scientific process, automating rigorous experimentation remains a significant challenge. To address this gap, we propose Curie, an AI agent framework designed to embed rigor into the experimentation process through three key components: an intra-agent rigor module to enhance reliability, an inter-agent rigor module to maintain methodical control, and an experiment knowledge module to enhance interpretability. To evaluate Curie, we design a novel experimental benchmark composed of 46 questions across four computer science domains, derived from influential research papers, and widely adopted open-source projects. Compared to the strongest baseline tested, we achieve a 3.4$\times$ improvement in correctly answering experimental questions. Curie is open-sourced at https://github.com/Just-Curieous/Curie.
IRSep 28, 2024
Disaggregating Embedding Recommendation Systems with FlexEMRYibo Huang, Zhenning Yang, Jiarong Xing et al.
Efficiently serving embedding-based recommendation (EMR) models remains a significant challenge due to their increasingly large memory requirements. Today's practice splits the model across many monolithic servers, where a mix of GPUs, CPUs, and DRAM is provisioned in fixed proportions. This approach leads to suboptimal resource utilization and increased costs. Disaggregating embedding operations from neural network inference is a promising solution but raises novel networking challenges. In this paper, we discuss the design of FlexEMR for optimized EMR disaggregation. FlexEMR proposes two sets of techniques to tackle the networking challenges: Leveraging the temporal and spatial locality of embedding lookups to reduce data movement over the network, and designing an optimized multi-threaded RDMA engine for concurrent lookup subrequests. We outline the design space for each technique and present initial results from our early prototype.
AIMay 30, 2025Code
EXP-Bench: Can AI Conduct AI Research Experiments?Patrick Tser Jern Kon, Jiachen Liu, Xinyi Zhu et al.
Automating AI research holds immense potential for accelerating scientific progress, yet current AI agents struggle with the complexities of rigorous, end-to-end experimentation. We introduce EXP-Bench, a novel benchmark designed to systematically evaluate AI agents on complete research experiments sourced from influential AI publications. Given a research question and incomplete starter code, EXP-Bench challenges AI agents to formulate hypotheses, design and implement experimental procedures, execute them, and analyze results. To enable the creation of such intricate and authentic tasks with high-fidelity, we design a semi-autonomous pipeline to extract and structure crucial experimental details from these research papers and their associated open-source code. With the pipeline, EXP-Bench curated 461 AI research tasks from 51 top-tier AI research papers. Evaluations of leading LLM-based agents, such as OpenHands and IterativeAgent on EXP-Bench demonstrate partial capabilities: while scores on individual experimental aspects such as design or implementation correctness occasionally reach 20-35%, the success rate for complete, executable experiments was a mere 0.5%. By identifying these bottlenecks and providing realistic step-by-step experiment procedures, EXP-Bench serves as a vital tool for future AI agents to improve their ability to conduct AI research experiments. EXP-Bench is open-sourced at https://github.com/Just-Curieous/Curie/tree/main/benchmark/exp_bench.
SEApr 1
Ambig-IaC: Multi-level Disambiguation for Interactive Cloud Infrastructure-as-Code SynthesisZhenning Yang, Kaden Gruizenga, Tongyuan Miao et al.
The scale and complexity of modern cloud infrastructure have made Infrastructure-as-Code (IaC) essential for managing deployments. While large Language models (LLMs) are increasingly being used to generate IaC configurations from natural language, user requests are often underspecified. Unlike traditional code generation, IaC configurations cannot be executed cheaply or iteratively repaired, forcing the LLMs into an almost one-shot regime. We observe that ambiguity in IaC exhibits a tractable compositional structure: configurations decompose into three hierarchical axes (resources, topology, attributes) where higher-level decisions constrain lower-level ones. We propose a training-free, disagreement-driven framework that generates diverse candidate specifications, identifies structural disagreements across these axes, ranks them by informativeness, and produces targeted clarification questions that progressively narrow the configuration space. We introduce \textsc{Ambig-IaC}, a benchmark of 300 validated IaC tasks with ambiguous prompts, and an evaluation framework based on graph edit distance and embedding similarity. Our method outperforms the strongest baseline, achieving relative improvements of +18.4\% and +25.4\% on structure and attribute evaluations, respectively.
CVApr 10Code
EpiAgent: An Agent-Centric System for Ancient Inscription RestorationShipeng Zhu, Ang Chen, Na Nie et al.
Ancient inscriptions, as repositories of cultural memory, have suffered from centuries of environmental and human-induced degradation. Restoring their intertwined visual and textual integrity poses one of the most demanding challenges in digital heritage preservation. However, existing AI-based approaches often rely on rigid pipelines, struggling to generalize across such complex and heterogeneous real-world degradations. Inspired by the skill-coordinated workflow of human epigraphers, we propose EpiAgent, an agent-centric system that formulates inscription restoration as a hierarchical planning problem. Following an Observe-Conceive-Execute-Reevaluate paradigm, an LLM-based central planner orchestrates collaboration among multimodal analysis, historical experience, specialized restoration tools, and iterative self-refinement. This agent-centric coordination enables a flexible and adaptive restoration process beyond conventional single-pass methods. Across real-world degraded inscriptions, EpiAgent achieves superior restoration quality and stronger generalization compared to existing methods. Our work marks an important step toward expert-level agent-driven restoration of cultural heritage. The code is available at https://github.com/blackprotoss/EpiAgent.
SEFeb 25
SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering AgentsPatrick Tser Jern Kon, Archana Pradeep, Ang Chen et al.
Small language models (SLMs) offer compelling advantages in cost, latency, and adaptability, but have so far lagged behind larger models on long-horizon software engineering tasks such as SWE-bench, where they suffer from pervasive action looping and low resolution rates. We introduce SWE-Protégé, a post-training framework that reframes software repair as an expert-protégé collaboration problem. In SWE-Protégé, an SLM remains the sole decision-maker while learning to selectively seek guidance from a strong expert model, recognize stalled states, and follow through on expert feedback. Our approach combines supervised fine-tuning on expert-augmented trajectories with agentic reinforcement learning that explicitly discourages degenerative looping and unproductive expert collaboration. We lightly post-train Qwen2.5-Coder-7B-Instruct to achieve 42.4% Pass@1 on SWE-bench Verified, a +25.4% improvement over the prior SLM state of the art, while using expert assistance sparsely (~4 calls per task and 11% of total tokens).
LGApr 27
The Last Human-Written Paper: Agent-Native Research ArtifactsJiachen Liu, Jiaxin Pei, Jintao Huang et al.
Scientific publication compresses a branching, iterative research process into a linear narrative, discarding the majority of what was discovered along the way. This compilation imposes two structural costs: a Storytelling Tax, where failed experiments, rejected hypotheses, and the branching exploration process are discarded to fit a linear narrative; and an Engineering Tax, where the gap between reviewer-sufficient prose and agent-sufficient specification leaves critical implementation details unwritten. Tolerable for human readers, these costs become critical when AI agents must understand, reproduce, and extend published work. We introduce the Agent-Native Research Artifact (Ara), a protocol that replaces the narrative paper with a machine-executable research package structured around four layers: scientific logic, executable code with full specifications, an exploration graph that preserves the failures compilation discards, and evidence grounding every claim in raw outputs. Three mechanisms support the ecosystem: a Live Research Manager that captures decisions and dead ends during ordinary development; an Ara Compiler that translates legacy PDFs and repos into Aras; and an Ara-native review system that automates objective checks so human reviewers can focus on significance, novelty, and taste. On PaperBench and RE-Bench, Ara raises question-answering accuracy from 72.4% to 93.7% and reproduction success from 57.4% to 64.4%. On RE-Bench's five open-ended extension tasks, preserved failure traces in Ara accelerate progress, but can also constrain a capable agent from stepping outside the prior-run box depending on the agent's capabilities.
AIJun 13, 2025
Cloud Infrastructure Management in the Age of AI AgentsZhenning Yang, Archit Bhatnagar, Yiming Qiu et al.
Cloud infrastructure is the cornerstone of the modern IT industry. However, managing this infrastructure effectively requires considerable manual effort from the DevOps engineering team. We make a case for developing AI agents powered by large language models (LLMs) to automate cloud infrastructure management tasks. In a preliminary study, we investigate the potential for AI agents to use different cloud/user interfaces such as software development kits (SDK), command line interfaces (CLI), Infrastructure-as-Code (IaC) platforms, and web portals. We report takeaways on their effectiveness on different management tasks, and identify research challenges and potential solutions.
DBMay 25, 2025
SQUiD: Synthesizing Relational Databases from Unstructured TextMushtari Sadia, Zhenning Yang, Yunming Xiao et al.
Relational databases are central to modern data management, yet most data exists in unstructured forms like text documents. To bridge this gap, we leverage large language models (LLMs) to automatically synthesize a relational database by generating its schema and populating its tables from raw text. We introduce SQUiD, a novel neurosymbolic framework that decomposes this task into four stages, each with specialized techniques. Our experiments show that SQUiD consistently outperforms baselines across diverse datasets.
CVDec 15, 2023
Plasticine3D: 3D Non-Rigid Editing with Text Guidance by Multi-View Embedding OptimizationYige Chen, Teng Hu, Yizhe Tang et al.
With the help of Score Distillation Sampling (SDS) and the rapid development of neural 3D representations, some methods have been proposed to perform 3D editing such as adding additional geometries, or overwriting textures. However, generalized 3D non-rigid editing task, which requires changing both the structure (posture or composition) and appearance (texture) of the original object, remains to be challenging in 3D editing field. In this paper, we propose Plasticine3D, a novel text-guided fine-grained controlled 3D editing pipeline that can perform 3D non-rigid editing with large structure deformations. Our work divides the editing process into a geometry editing stage and a texture editing stage to achieve separate control of structure and appearance. In order to maintain the details of the original object from different viewpoints, we propose a Multi-View-Embedding (MVE) Optimization strategy to ensure that the guidance model learns the features of the original object from various viewpoints. For the purpose of fine-grained control, we propose Embedding-Fusion (EF) to blend the original characteristics with the editing objectives in the embedding space, and control the extent of editing by adjusting the fusion rate. Furthermore, in order to address the issue of gradual loss of details during the generation process under high editing intensity, as well as the problem of insignificant editing effects in some scenarios, we propose Score Projection Sampling (SPS) as a replacement of score distillation sampling, which introduces additional optimization phases for editing target enhancement and original detail maintenance, leading to better editing quality. Extensive experiments demonstrate the effectiveness of our method on 3D non-rigid editing tasks
SEOct 23, 2025
Automated Cloud Infrastructure-as-Code Reconciliation with AI AgentsZhenning Yang, Hui Guan, Victor Nicolet et al.
Cloud infrastructure is managed through a mix of interfaces -- traditionally, cloud consoles, command-line interfaces (CLI), and SDKs are the tools of choice. Recently, Infrastructure-as-Code/IaC frameworks (e.g., Terraform) have quickly gained popularity. Unlike conventional tools, IaC~frameworks encode the infrastructure in a "source-of-truth" configuration. They are capable of automatically carrying out modifications to the cloud -- deploying, updating, or destroying resources -- to bring the actual infrastructure into alignment with the IaC configuration. However, when IaC is used alongside consoles, CLIs, or SDKs, it loses visibility into external changes, causing infrastructure drift, where the configuration becomes outdated, and later IaC operations may undo valid updates or trigger errors. We present NSync, an automated system for IaC reconciliation that propagates out-of-band changes back into the IaC program. Our key insight is that infrastructure changes eventually all occur via cloud API invocations -- the lowest layer for cloud management operations. NSync gleans insights from API traces to detect drift (i.e., non-IaC changes) and reconcile it (i.e., update the IaC configuration to capture the changes). It employs an agentic architecture that leverages LLMs to infer high-level intents from noisy API sequences, synthesize targeted IaC updates using specialized tools, and continually improve through a self-evolving knowledge base of past reconciliations. We further introduce a novel evaluation pipeline for injecting realistic drifts into cloud infrastructure and assessing reconciliation performance. Experiments across five real-world Terraform projects and 372 drift scenarios show that NSync outperforms the baseline both in terms of accuracy (from 0.71 to 0.97 pass@3) and token efficiency (1.47$\times$ improvement).
LGOct 2, 2025
TetriServe: Efficient DiT Serving for Heterogeneous Image GenerationRunyu Lu, Shiqi He, Wenxuan Tan et al.
Diffusion Transformer (DiT) models excel at generating highquality images through iterative denoising steps, but serving them under strict Service Level Objectives (SLOs) is challenging due to their high computational cost, particularly at large resolutions. Existing serving systems use fixed degree sequence parallelism, which is inefficient for heterogeneous workloads with mixed resolutions and deadlines, leading to poor GPU utilization and low SLO attainment. In this paper, we propose step-level sequence parallelism to dynamically adjust the parallel degree of individual requests according to their deadlines. We present TetriServe, a DiT serving system that implements this strategy for highly efficient image generation. Specifically, TetriServe introduces a novel round-based scheduling mechanism that improves SLO attainment: (1) discretizing time into fixed rounds to make deadline-aware scheduling tractable, (2) adapting parallelism at the step level and minimize GPU hour consumption, and (3) jointly packing requests to minimize late completions. Extensive evaluation on state-of-the-art DiT models shows that TetriServe achieves up to 32% higher SLO attainment compared to existing solutions without degrading image quality.
DBSep 18, 2025
A Case for Computing on Unstructured DataMushtari Sadia, Amrita Roy Chowdhury, Ang Chen
Unstructured data, such as text, images, audio, and video, comprises the vast majority of the world's information, yet it remains poorly supported by traditional data systems that rely on structured formats for computation. We argue for a new paradigm, which we call computing on unstructured data, built around three stages: extraction of latent structure, transformation of this structure through data processing techniques, and projection back into unstructured formats. This bi-directional pipeline allows unstructured data to benefit from the analytical power of structured computation, while preserving the richness and accessibility of unstructured representations for human and AI consumption. We illustrate this paradigm through two use cases and present the research components that need to be developed in a new data system called MXFlow.
RONov 3, 2021
A Novel Actuation Strategy for an Agile Bio-inspired FWAV Performing a Morphing-coupled Wingbeat PatternAng Chen, Bifeng Song, Zhihe Wang et al.
Flying vertebrates exhibit sophisticated wingbeat kinematics. Their specialized forelimbs allow for the wing morphing motion to couple with the flapping motion during their level flight, Previous flyable bionic platforms have successfully applied bio-inspired wing morphing but cannot yet be propelled by the morphing-coupled wingbeat pattern. Spurred by this, we develop a bio-inspired flapping-wing aerial vehicle (FWAV) entitled RoboFalcon, which is equipped with a novel mechanism to drive the bat-style morphing wings, performs a morphing-coupled wingbeat pattern, and overall manages an appealing flight. The novel mechanism of RoboFalcon allows coupling the morphing and flapping during level flight and decoupling these when maneuvering is required, producing a bilateral asymmetric downstroke affording high rolling agility. The bat-style morphing wing is designed with a tilted mounting angle around the radius at the wrist joint to mimic the wrist supination and pronation effect of flying vertebrates' forelimbs. The agility of RoboFalcon is assessed through several rolling maneuver flight tests, and we demonstrate its well-performing agility capability compared to flying creatures and current flapping-wing platforms. Wind tunnel tests indicate that the roll moment of the asymmetric downstroke is correlated with the flapping frequency, and the wrist mounting angle can be used for tuning the angle of attack and lift-thrust configuration of the equilibrium flight state. We believe that this work yields a well-performing bionic platform and provides a new actuation strategy for the morphing-coupled flapping flight.
DCOct 25, 2021
Bolt: Bridging the Gap between Auto-tuners and Hardware-native PerformanceJiarong Xing, Leyuan Wang, Shang Zhang et al.
Today's auto-tuners (e.g., AutoTVM, Ansor) generate efficient tensor programs by navigating a large search space to identify effective implementations, but they do so with opaque hardware details. Thus, their performance could fall behind that of hardware-native libraries (e.g., cuBLAS, cuDNN), which are hand-optimized by device vendors to extract high performance. On the other hand, these vendor libraries have a fixed set of supported functions and lack the customization and automation support afforded by auto-tuners. Bolt is based on the recent trend that vendor libraries are increasingly modularized and reconfigurable via declarative control (e.g., CUTLASS). It enables a novel approach that bridges this gap and achieves the best of both worlds, via hardware-native templated search. Bolt provides new opportunities to rethink end-to-end tensor optimizations at the graph, operator, and model levels. Bolt demonstrates this concept by prototyping on a popular auto-tuner in TVM and a class of widely-used platforms (i.e., NVIDIA GPUs) -- both in large deployment in our production environment. Bolt improves the inference speed of common convolutional neural networks by 2.5x on average over the state of the art, and it auto-tunes these models within 20 minutes.
CRDec 14, 2020
The Design and Implementation of a Verified File System with End-to-End Data IntegrityDaniel W. Song, Konstantinos Mamouras, Ang Chen et al.
Despite significant research and engineering efforts, many of today's important computer systems suffer from bugs. To increase the reliability of software systems, recent work has applied formal verification to certify the correctness of such systems, with recent successes including certified file systems and certified cryptographic protocols, albeit using quite different proof tactics and toolchains. Unifying these concepts, we present the first certified file system that uses cryptographic primitives to protect itself against tampering. Our certified file system defends against adversaries that might wish to tamper with the raw disk. Such an "untrusted storage" threat model captures the behavior of storage devices that might silently return erroneous bits as well as adversaries who might have limited access to a disk, perhaps while in transit. In this paper, we present IFSCQ, a certified cryptographic file system with strong integrity guarantees. IFSCQ combines and extends work on cryptographic file systems and formally certified file systems to prove that our design is correct. It is the first certified file system that is secure against strong adversaries that can maliciously corrupt on-disk data and metadata, including attempting to roll back the disk to earlier versions of valid data. IFSCQ achieves this by constructing a Merkle hash tree of the whole disk, and by proving that tampered disk blocks will always be detected if they ever occur. We demonstrate that IFSCQ runs with reasonable overhead while detecting several kinds of attacks.
NIAug 4, 2019
Programmable In-Network Security for Context-aware BYOD PoliciesQiao Kang, Lei Xue, Adam Morrison et al.
Bring Your Own Device (BYOD) has become the new norm in enterprise networks, but BYOD security remains a top concern. Context-aware security, which enforces access control based on dynamic runtime context, holds much promise. Recent work has developed SDN solutions to collect device context for network-wide access control in a central controller. However, the central controller poses a bottleneck that can become an attack target, and processing context changes at remote software has low agility. We present a new paradigm, programmable in-network security (Poise), which is enabled by the emergence of programmable switches. At the heart of Poise is a novel switch primitive, which can be programmed to support a wide range of context-aware policies in hardware. Users of Poise specify concise policies, and Poise compiles them into different instantiations of the security primitive in P4. Compared to centralized SDN defenses, Poise is resilient to control plane saturation attacks, and it dramatically increases defense agility.
CRDec 3, 2018
An Historical Analysis of the SEAndroid Policy EvolutionBumjin Im, Ang Chen, Dan Wallach
Android adopted SELinux's mandatory access control (MAC) mechanisms in 2013. Since then, billions of Android devices have benefited from mandatory access control security policies. These policies are expressed in a variety of rules, maintained by Google and extended by Android OEMs. Over the years, the rules have grown to be quite complex, making it challenging to properly understand or configure these policies. In this paper, we perform a measurement study on the SEAndroid repository to understand the evolution of these policies. We propose a new metric to measure the complexity of the policy by expanding policy rules, with their abstraction features such as macros and groups, into primitive "boxes", which we then use to show that the complexity of the SEAndroid policies has been growing exponentially over time. By analyzing the Git commits, snapshot by snapshot, we are also able to analyze the "age" of policy rules, the trend of changes, and the contributor composition. We also look at hallmark events in Android's history, such as the "Stagefright" vulnerability in Android's media facilities, pointing out how these events led to changes in the MAC policies. The growing complexity of Android's mandatory policies suggests that we will eventually hit the limits of our ability to understand these policies, requiring new tools and techniques.