LG NI Sep 23, 2021

Federated Feature Selection for Cyber-Physical Systems of Systems

arXiv:2109.11323v2 (41 citations)
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficient data processing for autonomous vehicles in cyber-physical systems, offering a distributed solution to reduce resource usage, though it is incremental as it builds on existing feature selection and federated learning concepts.

The paper tackles the problem of selecting informative features from multi-modal data in autonomous vehicles to reduce computation and communication workloads at the Edge. It proposes a federated feature selection algorithm that converges to a consensus on a minimal feature subset, keeping 24 of 2166 features in the MAV dataset (a 99% reduction) and 4 of 8 in WESAD (a 50% reduction) while preserving data informativeness.

Autonomous vehicles (AVs) generate a massive amount of multi-modal data that, once collected and processed through Machine Learning algorithms, enables AI-based services at the Edge. However, not all of these data carry valuable, informative content; only a subset of the attributes should be exploited at the Edge. Enabling AVs to extract such a subset locally is therefore of utmost importance to limit computation and communication workloads. Obtaining a consistent subset in a distributed manner requires the AVs to cooperate and agree on which attributes should be sent to the Edge. In this work, we address this problem by proposing a federated feature selection algorithm in which all the AVs collaborate to iteratively filter out redundant or irrelevant attributes in a distributed manner, without any exchange of raw data. The solution builds on two components: a Mutual-Information-based feature selection algorithm run by the AVs and a novel aggregation function based on Bayes' theorem executed on the Edge. Our federated feature selection algorithm provably converges to a solution in a finite number of steps. The algorithm has been tested on two reference datasets: MAV, with images and inertial measurements of a monitored vehicle, and WESAD, with a collection of samples from biophysical sensors that monitor a passenger. The numerical results show that, on both datasets, the fleet reaches a consensus on the minimum achievable subset of features, i.e., 24 out of 2166 (a 99% reduction) in MAV and 4 out of 8 (a 50% reduction) in WESAD, while preserving the informative content of the data.
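The two-component structure described in the abstract (local Mutual-Information scoring on each vehicle, Bayes-rule aggregation on the Edge) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the aggregation function, so the version below assumes that each client sends a binary vote per feature and that the server treats votes as independent evidence with a fixed, hypothetical `reliability`. The MI `threshold`, the synthetic data generator, and all function names are likewise assumptions made for illustration. Only votes, never raw samples, leave a client.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in MI estimate (in nats) between two non-negative integer arrays."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1.0)              # joint histogram of (x, y)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)      # marginal of x, shape (a, 1)
    py = joint.sum(axis=0, keepdims=True)      # marginal of y, shape (1, b)
    nz = joint > 0                             # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def client_votes(X, y, threshold=0.05):
    """Client side: vote 1 for features whose MI with the label exceeds
    a (hypothetical) threshold. Raw data stays on the client."""
    mi = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    return (mi > threshold).astype(int)

def bayes_aggregate(prior, votes, reliability=0.8):
    """Server side: posterior P(feature relevant | votes) via Bayes' rule,
    assuming independent votes that are correct with prob. `reliability`."""
    votes = np.asarray(votes)                  # shape (n_clients, n_features)
    like_rel = np.prod(np.where(votes == 1, reliability, 1 - reliability), axis=0)
    like_irr = np.prod(np.where(votes == 1, 1 - reliability, reliability), axis=0)
    return prior * like_rel / (prior * like_rel + (1 - prior) * like_irr)

def make_client_data(rng, n=500):
    """Synthetic local batch: features 0 and 1 are informative, 2..5 are noise."""
    y = rng.integers(0, 2, size=n)
    X = rng.integers(0, 4, size=(n, 6))
    X[:, 0] = y                                # exact copy of the label
    flip = rng.random(n) < 0.1
    X[:, 1] = np.where(flip, 1 - y, y)         # label with 10% of entries flipped
    return X, y

def run_federation(n_clients=5, max_rounds=10, seed=0):
    """Iterate rounds until the selected feature set stops changing."""
    rng = np.random.default_rng(seed)
    posterior = np.full(6, 0.5)                # uninformative prior per feature
    selected = None
    for _ in range(max_rounds):
        votes = [client_votes(*make_client_data(rng)) for _ in range(n_clients)]
        posterior = bayes_aggregate(posterior, votes)  # posterior becomes next prior
        new_selected = [int(j) for j in np.flatnonzero(posterior > 0.5)]
        if new_selected == selected:
            break
        selected = new_selected
    return selected, posterior

if __name__ == "__main__":
    selected, posterior = run_federation()
    print("selected features:", selected)
    print("posterior:", np.round(posterior, 4))
```

On this synthetic data the fleet should settle on the two informative features while the noise features are driven toward zero posterior. The stopping rule here (selected set unchanged between rounds) is a simple stand-in and says nothing about the convergence proof mentioned in the abstract.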

