HCAug 24, 2023
Project Aria: A New Tool for Egocentric Multi-Modal AI ResearchJakob Engel, Kiran Somasundaram, Michael Goesele et al. · mit
Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.
CVFeb 20, 2024Code
Aria Everyday Activities DatasetZhaoyang Lv, Nicholas Charron, Pierre Moulon et al.
We present Aria Everyday Activities (AEA) Dataset, an egocentric multimodal open dataset recorded using Project Aria glasses. AEA contains 143 daily activity sequences recorded by multiple wearers in five geographically diverse indoor locations. Each of the recording contains multimodal sensor data recorded through the Project Aria glasses. In addition, AEA provides machine perception data including high frequency globally aligned 3D trajectories, scene point cloud, per-frame 3D eye gaze vector and time aligned speech transcription. In this paper, we demonstrate a few exemplar research applications enabled by this dataset, including neural scene reconstruction and prompted segmentation. AEA is an open source dataset that can be downloaded from https://www.projectaria.com/datasets/aea/. We are also providing open-source implementations and examples of how to use the dataset in Project Aria Tools https://github.com/facebookresearch/projectaria_tools.
53.6ARApr 18
Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIRYuyang Zou, Youwei Xiao, Chenyun Yin et al.
Application-Specific Instruction-Set Processors (ASIPs) built on the RISC-V architecture offer specialization opportunities for various applications. Existing frameworks are largely designed around fixed instruction extension interfaces and rely on manual software adaptation. However, as emerging domains scale up in complexity, two major challenges arise. First, memory access remains a primary bottleneck as existing design flows lack architectural awareness of memory interfaces, leading to suboptimal interface selection and orchestration. Second, the semantic complexity of custom instruction extensions, characterized by non-trivial control logic and irregular memory behaviors, hinders the ability of conventional compilers to perform automated and comprehensive offloading. We present Aquas, a holistic hardware-software co-design framework built upon MLIR. Aquas proposes a memory interface model that jointly considers interface characteristics and cache effects, along with an interface-aware synthesis flow guided by this model that progressively optimizes the input specification and generates efficient hardware implementations. We also propose an e-graph-based retargetable compiler approach with a novel matching engine for efficient instruction mapping and offloading, enabling robust and effective utilization of custom instruction capabilities. Case studies across four diverse domains show that Aquas delivers substantial acceleration, achieving up to 15.61x speedup with 14.5% area overhead and zero frequency degradation, proving highly competitive in domain acceleration against more powerful general-purpose cores and vector extensions.
27.5AIApr 19
LLM-Guided Strategy Synthesis for Scalable Equality SaturationChenyun Yin, Youwei Xiao, Yuze Luo et al.
Equality saturation (EqSat) is a powerful optimization paradigm that compactly represents many equivalent programs in an e-graph and delays commitment until extraction selects a lowest-cost program. Making EqSat effective, therefore, requires not only domain-specific rewrite rules but also domain-specific strategies. Today, much of this strategy design is still manual, making it a major obstacle to automating e-graph-based compilers. Recent rule-synthesis frameworks can automatically infer large rewrite vocabularies from semantic specifications, but they also enlarge the rewrite space and further exacerbate e-graph explosion. Although large language models (LLMs) make automated strategy synthesis plausible, directly evolving backend code remains ineffective in practice. The search lacks reusable strategy abstractions and actionable feedback, and can easily trigger e-graph explosion or converge to poor designs. We present EggMind, an LLM-guided, end-to-end framework for synthesizing reusable EqSat strategies. At its core, EggMind introduces a domain-specific language, EqSatL, to represent EqSat strategies as explicit and inspectable artifacts. It then proposes an LLM-guided agentic workflow, equipped with novel techniques including proof-derived rewrite motif caching and tractability guidance, to search efficiently for high-quality strategies while keeping synthesis stable under e-graph growth. Evaluation shows that EggMind substantially improves the resource-quality trade-off on vectorization benchmarks, reducing final cost by 45.1% and peak RAM by 69.1% relative to full EqSat. We further show that the same methodology transfers effectively to an XLA-based tensor compiler, and demonstrate its practical potential in a logic-synthesis case study with augmented rewrite spaces.
PLNov 19, 2025
SkyEgg: Joint Implementation Selection and Scheduling for Hardware Synthesis using E-graphsYouwei Xiao, Yuyang Zou, Yun Liang
Hardware synthesis from high-level descriptions remains fundamentally limited by the sequential optimization of interdependent design decisions. Current methodologies, including state-of-the-art high-level synthesis (HLS) tools, artificially separate implementation selection from scheduling, leading to suboptimal designs that cannot fully exploit modern FPGA heterogeneous architectures. Implementation selection is typically performed by ad-hoc pattern matching on operations, a process that does not consider the impact on scheduling. Subsequently, scheduling algorithms operate on fixed selection solutions with inaccurate delay estimates, which misses critical optimization opportunities from appropriately configured FPGA blocks like DSP slices. We present SkyEgg, a novel hardware synthesis framework that jointly optimizes implementation selection and scheduling using the e-graph data structure. Our key insight is that both algebraic transformations and hardware implementation choices can be uniformly represented as rewrite rules within an e-graph, modeling the complete design space of implementation candidates to be selected and scheduled together. First, SkyEgg constructs an e-graph from the input program. It then applies both algebraic and implementation rewrites through equality saturation. Finally, it formulates the joint optimization as a mixed-integer linear programming (MILP) problem on the saturated e-graph. We provide both exact MILP solving and an efficient ASAP heuristic for scalable synthesis. Our evaluation on benchmarks from diverse applications targeting Xilinx Kintex UltraScale+ FPGAs demonstrates that SkyEgg achieves an average speedup of 3.01x over Vitis HLS, with improvements up to 5.22x for complex expressions.
CVJun 13, 2019
The Replica Dataset: A Digital Replica of Indoor SpacesJulian Straub, Thomas Whelan, Lingni Ma et al.
We introduce Replica, a dataset of 18 highly photo-realistic 3D indoor scene reconstructions at room and building scale. Each scene consists of a dense mesh, high-resolution high-dynamic-range (HDR) textures, per-primitive semantic class and instance information, and planar mirror and glass reflectors. The goal of Replica is to enable machine learning (ML) research that relies on visually, geometrically, and semantically realistic generative models of the world - for instance, egocentric computer vision, semantic segmentation in 2D and 3D, geometric inference, and the development of embodied agents (virtual robots) performing navigation, instruction following, and question answering. Due to the high level of realism of the renderings from Replica, there is hope that ML systems trained on Replica may transfer directly to real world image and video data. Together with the data, we are releasing a minimal C++ SDK as a starting point for working with the Replica dataset. In addition, Replica is `Habitat-compatible', i.e. can be natively used with AI Habitat for training and testing embodied agents.