DC · May 18
EPIC: Abstraction and Polymorphism of In-Network Collectives on Ethernet
Yitao Yuan, Jianglong Nie, Tianyu Bai et al.
In-Network Collective (INC) acceleration holds immense potential for optimizing AI training and inference; however, its cross-layer nature has historically hindered investment and adoption within the open Ethernet ecosystem. To bridge this gap, we propose EPIC (Ethernet Polymorphic In-network Collective), an INC protocol specification and reference system built on the principle of "Unified Abstraction, Polymorphic Realization." EPIC introduces an abstraction compatible with standard Ethernet that aligns functional boundaries with participant roles, while offering polymorphic realizations tailored to varying hardware capabilities. We address three fundamental challenges: first, we employ a modular design that enables an evolutionary path from simple to complex implementations, allowing vendors to iterate on their hardware incrementally; second, we apply formal verification methodologies to prove the correctness of all proposed polymorphic modes; and third, we develop a unified resource management model versatile enough for diverse INC scenarios. Extensive validation -- spanning model checking, packet/flow simulations, VM emulation, a Tofino testbed, and FPGA/RTL verification -- confirms EPIC's correctness, performance gains, and feasibility.
CV · Dec 8, 2023
SiCP: Simultaneous Individual and Cooperative Perception for 3D Object Detection in Connected and Automated Vehicles
Deyuan Qu, Qi Chen, Tianyu Bai et al.
Cooperative perception for connected and automated vehicles is traditionally achieved through the fusion of feature maps from two or more vehicles. However, when feature maps from other vehicles are unavailable, the 3D object detection performance of cooperative perception models can decline significantly compared to standalone 3D detection models. This drawback impedes the adoption of cooperative perception, as vehicle resources are often insufficient to run two perception models concurrently. To tackle this issue, we present Simultaneous Individual and Cooperative Perception (SiCP), a generic framework that supports a wide range of state-of-the-art standalone perception backbones and enhances them with a novel Dual-Perception Network (DP-Net) designed to facilitate both individual and cooperative perception. In addition to being lightweight, with only 0.13M parameters, DP-Net is robust and retains crucial gradient information during feature map fusion. As demonstrated in a comprehensive evaluation on the V2V4Real and OPV2V datasets, thanks to DP-Net, SiCP surpasses state-of-the-art cooperative perception solutions while preserving the performance of standalone perception solutions.
DC · Apr 21
Verifying In-Network Computing Systems for Design Risks
Tianyu Bai, Ying Zhang, Xiaoxi Zhang et al.
The emergence of programmable switches has brought in-network computing (INC) into the spotlight in recent years. By offloading computation directly onto the data transmission process, INC improves network utilization, reduces latency to sub-RTT levels, saves link bandwidth, and maintains throughput. However, INC disrupts the transparency of traditional networks, forcing developers to consider network exceptions such as packet loss and out-of-order delivery. If not properly handled, these exceptions can lead to violations of application properties, such as cache consistency and lock exclusion. Conventional testing cannot exhaustively cover these exceptions, raising doubts about the correctness of INC systems and hindering their deployment in industry. This paper presents INCGuard, the first general-purpose tool for verifying INC systems. INCGuard provides a high-level specification language that saves developers 67.2% of the lines of code on average. To help developers better understand the behavior of the system, INCGuard offers configurable network environments and enables developers to express INC-specific correctness properties. INCGuard translates developer-specified systems into state transition representations, performs model checking to detect potential design risks, and reports violation traces to developers. We propose optimizations for INC-specific scenarios to address the challenge of state space explosion. We modeled seven INC systems and identified their risks with INCGuard in seconds. We further reproduced these risks in real systems to confirm the validity of our verification results.
CV · Jul 18, 2025
GRAM-MAMBA: Holistic Feature Alignment for Wireless Perception with Adaptive Low-Rank Compensation
Weiqi Yang, Xu Zhou, Jingfu Guan et al.
Multi-modal fusion is crucial for Internet of Things (IoT) perception, widely deployed in smart homes, intelligent transport, industrial automation, and healthcare. However, existing systems often face challenges: high model complexity hinders deployment in resource-constrained environments, unidirectional modal alignment neglects inter-modal relationships, and robustness suffers when sensor data is missing. These issues impede efficient and robust multimodal perception in real-world IoT settings. To overcome these limitations, we propose GRAM-MAMBA. This framework utilizes the linear-complexity Mamba model for efficient sensor time-series processing, combined with an optimized GRAM matrix strategy for pairwise alignment among modalities, addressing the shortcomings of traditional single-modality alignment. Inspired by Low-Rank Adaptation (LoRA), we introduce an adaptive low-rank layer compensation strategy to handle missing modalities post-training. This strategy freezes the pre-trained model core and irrelevant adaptive layers, fine-tuning only those related to available modalities and the fusion process. Extensive experiments validate GRAM-MAMBA's effectiveness. On the SPAWC2021 indoor positioning dataset, the pre-trained model shows lower error than baselines; adapting to missing modalities yields a 24.5% performance boost by training less than 0.2% of parameters. On the USC-HAD human activity recognition dataset, it achieves 93.55% F1 and 93.81% Overall Accuracy (OA), outperforming prior work; the update strategy increases F1 by 23% while training less than 0.3% of parameters. These results highlight GRAM-MAMBA's potential for achieving efficient and robust multimodal perception in resource-constrained environments.