AIFeb 18
OpenSage: Self-programming Agent Generation EngineHongwei Li, Zhun Wang, Qinrun Dai et al.
Agent development kits (ADKs) provide effective platforms and tooling for constructing agents, and their designs are critical to the constructed agents' performance, especially the functionality for agent topology, tools, and memory. However, current ADKs either lack sufficient functional support or rely on humans to manually design these components, limiting agents' generalizability and overall performance. We propose OpenSage, the first ADK that enables LLMs to automatically create agents with self-generated topology and toolsets while providing comprehensive and structured memory support. OpenSage offers effective functionality for agents to create and manage their own sub-agents and toolkits. It also features a hierarchical, graph-based memory system for efficient management and a specialized toolkit tailored to software engineering tasks. Extensive experiments across three state-of-the-art benchmarks with various backbone models demonstrate the advantages of OpenSage over existing ADKs. We also conduct rigorous ablation studies to demonstrate the effectiveness of our design for each component. We believe OpenSage can pave the way for the next generation of agent development, shifting the focus from human-centered to AI-centered paradigms.
CRNov 4, 2019Code
SpecuSym: Speculative Symbolic Execution for Cache Timing Leak DetectionShengjian Guo, Yueqi Chen, Peng Li et al.
CPU cache is a limited but crucial storage component in modern processors, whereas the cache timing side-channel may inadvertently leak information through the physically measurable timing variance. Speculative execution, an essential processor optimization, and a source of such variances, can cause severe detriment on deliberate branch mispredictions. Despite static analysis could qualitatively verify the timing-leakage-free property under speculative execution, it is incapable of producing endorsements including inputs and speculated flows to diagnose leaks in depth. This work proposes a new symbolic execution based method, SpecuSym, for precisely detecting cache timing leaks introduced by speculative execution. Given a program (leakage-free in non-speculative execution), SpecuSymsystematically explores the program state space, models speculative behavior at conditional branches, and accumulates the cache side effects along with subsequent path explorations. During the dynamic execution, SpecuSymconstructs leak predicates for memory visits according to the specified cache model and conducts a constraint-solving based cache behavior analysis to inspect the new cache behaviors. We have implementedSpecuSymatop KLEE and evaluated it against 15 open-source benchmarks. Experimental results show thatSpecuSymsuccessfully detected from 2 to 61 leaks in 6 programs under 3 different cache settings and identified false positives in 2 programs reported by recent work.
OSJan 11, 2024
When eBPF Meets Machine Learning: On-the-fly OS Kernel CompartmentalizationZicheng Wang, Tiejin Chen, Qinrun Dai et al.
Compartmentalization effectively prevents initial corruption from turning into a successful attack. This paper presents O2C, a pioneering system designed to enforce OS kernel compartmentalization on the fly. It not only provides immediate remediation for sudden threats but also maintains consistent system availability through the enforcement process. O2C is empowered by the newest advancements of the eBPF ecosystem which allows to instrument eBPF programs that perform enforcement actions into the kernel at runtime. O2C takes the lead in embedding a machine learning model into eBPF programs, addressing unique challenges in on-the-fly compartmentalization. Our comprehensive evaluation shows that O2C effectively confines damage within the compartment. Further, we validate that decision tree is optimally suited for O2C owing to its advantages in processing tabular data, its explainable nature, and its compliance with the eBPF ecosystem. Last but not least, O2C is lightweight, showing negligible overhead and excellent sacalability system-wide.