CV · NI · Nov 30, 2021

Large-Scale Video Analytics through Object-Level Consolidation

arXiv:2111.15451v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses the high compute costs and latency that service providers face in applications such as smart cities and autonomous driving, offering a novel optimization for multi-camera deployments.

The paper tackles the challenge of scaling video analytics across distributed, resource-constrained cameras by introducing FoMO, a method that consolidates object-level data from multiple cameras to reduce compute requirements. Results show an 8x increase in system performance and a 40% improvement in accuracy using an off-the-shelf model without additional training.

As the number of installed cameras grows, so do the compute resources required to process and analyze the images they capture. Video analytics enables new use cases, such as smart cities or autonomous driving. At the same time, it pushes service providers to install additional compute resources to cope with demand, while strict latency requirements push compute towards the end of the network, forming a geographically distributed and heterogeneous set of compute locations that are shared and resource-constrained. Such a landscape of shared and distributed locations forces us to design new techniques that can optimize and distribute work among all available locations and, ideally, make compute requirements grow sublinearly with respect to the number of cameras installed. In this paper, we present FoMO (Focus on Moving Objects), a method that optimizes multi-camera deployments by preprocessing images for scenes, filtering out the empty regions, and composing regions of interest from multiple cameras into a single image that serves as input for a pre-trained object detection model. Results show that overall system performance can be increased by 8x while accuracy improves by 40% as a by-product of the methodology, all using an off-the-shelf pre-trained model with no additional training or fine-tuning.
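The consolidation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how regions of interest are found or how they are laid out, so this sketch assumes simple frame differencing against a static background and a naive left-to-right "shelf" layout. All names (`Placement`, `moving_region`, `compose`, `to_source`) are hypothetical, and frames are plain lists of grayscale pixel rows to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image: rows of pixel intensities


@dataclass
class Placement:
    """Records where a camera's region ended up in the composite (hypothetical)."""
    camera: int  # index of the source camera
    src_x: int   # top-left corner of the region in the source frame
    src_y: int
    dst_x: int   # top-left corner of the region in the composite
    dst_y: int
    width: int
    height: int


def moving_region(background: Frame, frame: Frame,
                  threshold: int = 25) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (x, y, w, h) of pixels that differ from the background.

    Assumption: frame differencing stands in for whatever motion
    detection the paper actually uses. Returns None for an empty scene.
    """
    xs, ys = [], []
    for y, (bg_row, row) in enumerate(zip(background, frame)):
        for x, (bg, px) in enumerate(zip(bg_row, row)):
            if abs(px - bg) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    x0, y0 = min(xs), min(ys)
    return x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1


def compose(backgrounds: List[Frame], frames: List[Frame], canvas_width: int,
            threshold: int = 25) -> Tuple[Frame, List[Placement]]:
    """Pack the moving region of each camera into one composite image."""
    composite: Frame = []
    placements: List[Placement] = []
    cur_x = shelf_y = shelf_h = 0
    for cam, (bg, frame) in enumerate(zip(backgrounds, frames)):
        box = moving_region(bg, frame, threshold)
        if box is None:
            continue  # filter out cameras with nothing moving
        x, y, w, h = box
        if w > canvas_width:
            raise ValueError("region is wider than the composite canvas")
        if cur_x + w > canvas_width:  # start a new shelf below the current one
            shelf_y += shelf_h
            cur_x = shelf_h = 0
        while len(composite) < shelf_y + h:  # grow the canvas as needed
            composite.append([0] * canvas_width)
        for dy in range(h):  # copy the crop into the composite
            composite[shelf_y + dy][cur_x:cur_x + w] = frame[y + dy][x:x + w]
        placements.append(Placement(cam, x, y, cur_x, shelf_y, w, h))
        cur_x += w
        shelf_h = max(shelf_h, h)
    return composite, placements


def to_source(placements: List[Placement], cx: int,
              cy: int) -> Optional[Tuple[int, int, int]]:
    """Map a composite coordinate back to (camera, x, y), or None if unused."""
    for p in placements:
        if p.dst_x <= cx < p.dst_x + p.width and p.dst_y <= cy < p.dst_y + p.height:
            return p.camera, p.src_x + cx - p.dst_x, p.src_y + cy - p.dst_y
    return None
```

In this sketch the composite would be passed to an object detector once per batch of cameras, and `to_source` would translate each detection back to the camera and position it came from. Keeping the placement records separate from the pixels means the detector itself needs no changes, which matches the abstract's point about using an off-the-shelf model.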
