14.4HEApr 9
High-dimensional inference for the $γ$-ray sky with differentiable programmingSiddharth Mishra-Sharma, Tracy R. Slatyer, Yitian Sun et al.
We motivate the use of differentiable probabilistic programming techniques in order to account for the large model-space inherent to astrophysical $γ$-ray analyses. Targeting the longstanding Galactic Center $γ$-ray Excess (GCE) puzzle, we construct differentiable forward model and likelihood that make liberal use of GPU acceleration and vectorization in order to simultaneously account for a continuum of possible spatial morphologies consistent with the GCE emission in a fully probabilistic manner. Our setup allows for efficient inference over the large model space using variational methods. Beyond application to $γ$-ray data, a goal of this work is to showcase how differentiable probabilistic programming can be used as a tool to enable flexible analyses of astrophysical datasets.
3.8ARApr 18
Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIRYuyang Zou, Youwei Xiao, Chenyun Yin et al.
Application-Specific Instruction-Set Processors (ASIPs) built on the RISC-V architecture offer specialization opportunities for various applications. Existing frameworks are largely designed around fixed instruction extension interfaces and rely on manual software adaptation. However, as emerging domains scale up in complexity, two major challenges arise. First, memory access remains a primary bottleneck as existing design flows lack architectural awareness of memory interfaces, leading to suboptimal interface selection and orchestration. Second, the semantic complexity of custom instruction extensions, characterized by non-trivial control logic and irregular memory behaviors, hinders the ability of conventional compilers to perform automated and comprehensive offloading. We present Aquas, a holistic hardware-software co-design framework built upon MLIR. Aquas proposes a memory interface model that jointly considers interface characteristics and cache effects, along with an interface-aware synthesis flow guided by this model that progressively optimizes the input specification and generates efficient hardware implementations. We also propose an e-graph-based retargetable compiler approach with a novel matching engine for efficient instruction mapping and offloading, enabling robust and effective utilization of custom instruction capabilities. Case studies across four diverse domains show that Aquas delivers substantial acceleration, achieving up to 15.61x speedup with 14.5% area overhead and zero frequency degradation, proving highly competitive in domain acceleration against more powerful general-purpose cores and vector extensions.