Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference
For designers of distributed CPS architectures, this work provides evidence that cloud inference can be a viable and sometimes preferable alternative to on-device inference, potentially reducing local computational demands.
The paper challenges the assumption that cloud-based inference is unsuitable for latency-sensitive control tasks in cyber-physical systems. It develops an analytical model and shows through simulation that, under certain conditions, cloud inference can match or surpass on-device performance for real-time decision-making, using autonomous emergency braking as the case study.
The increasing deployment of deep neural networks (DNNs) in cyber-physical systems (CPS) enhances perception fidelity, but imposes substantial computational demands on execution platforms, posing challenges to real-time control deadlines. Traditional distributed CPS architectures typically favor on-device inference to avoid network variability and contention-induced delays on remote platforms. However, this design choice places significant energy and computational demands on the local hardware. In this work, we revisit the assumption that cloud-based inference is intrinsically unsuitable for latency-sensitive control tasks. We demonstrate that, when provisioned with high-throughput compute resources, cloud platforms can effectively amortize network and queueing delays, enabling them to match or surpass on-device performance for real-time decision-making. Specifically, we develop a formal analytical model that characterizes distributed inference latency as a function of the sensing frequency, platform throughput, network delay, and task-specific safety constraints. We instantiate this model in the context of emergency braking for autonomous driving and validate it through extensive simulations using real-time vehicular dynamics. Our empirical results identify concrete conditions under which cloud-based inference adheres to safety margins more reliably than its on-device counterpart. These findings challenge prevailing design strategies and suggest that the cloud is not merely a feasible option, but often the preferred inference location for distributed CPS architectures. In this light, the cloud is not as distant as traditionally perceived; in fact, it is closer than it appears.
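The abstract names the inputs of the latency model (sensing frequency, platform throughput, network delay, and a safety constraint) but does not give its equations. The sketch below is therefore not the paper's model. It is a minimal illustration under simple assumptions: the worst-case reaction latency is one sensing period plus the network round trip plus one inference service time, queueing is ignored, and the safety constraint is that the vehicle stops before reaching the obstacle under constant deceleration. The `Platform` class, all function names, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    """Hypothetical inference platform description (not taken from the paper)."""
    name: str
    throughput_fps: float        # frames the platform can process per second
    network_rtt_s: float = 0.0   # round-trip network delay; 0 for on-device


def reaction_latency(p: Platform, sensing_hz: float) -> float:
    """Worst-case delay from obstacle appearance to brake command, in seconds.

    Assumed decomposition: wait up to one sensing period for the next frame,
    then the network round trip, then one inference service time. Queueing is
    ignored, which is only reasonable when the platform keeps up with the sensor.
    """
    if p.throughput_fps < sensing_hz:
        raise ValueError("platform cannot keep up with the sensing rate")
    return 1.0 / sensing_hz + p.network_rtt_s + 1.0 / p.throughput_fps


def stopping_distance(speed_mps: float, latency_s: float, decel_mps2: float) -> float:
    """Distance covered during the reaction latency plus the braking distance."""
    return speed_mps * latency_s + speed_mps ** 2 / (2.0 * decel_mps2)


def is_safe(p: Platform, sensing_hz: float, speed_mps: float,
            decel_mps2: float, obstacle_m: float) -> bool:
    """Safety constraint: the vehicle must stop before reaching the obstacle."""
    latency = reaction_latency(p, sensing_hz)
    return stopping_distance(speed_mps, latency, decel_mps2) <= obstacle_m


# Illustrative numbers only: a slow on-device accelerator versus a fast
# cloud platform that pays a 40 ms network round trip.
on_device = Platform("on-device", throughput_fps=10.0)
cloud = Platform("cloud", throughput_fps=200.0, network_rtt_s=0.040)

for platform in (on_device, cloud):
    latency = reaction_latency(platform, sensing_hz=10.0)
    safe = is_safe(platform, sensing_hz=10.0, speed_mps=20.0,
                   decel_mps2=8.0, obstacle_m=28.5)
    print(f"{platform.name}: latency={latency:.3f} s, safe={safe}")
```

With these made-up values the cloud platform's higher throughput more than offsets its network delay, which is the kind of condition the abstract describes. A more faithful version would need the queueing and network-variability terms that the paper's model is said to capture.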