Age-Aware Edge-Blind Federated Learning via Over-the-Air Aggregation

arXiv:2602.02469v1

h-index: 13

AI Analysis

This work addresses latency and efficiency issues in wireless federated learning for edge computing applications. Its contribution is incremental, since it builds on existing over-the-air aggregation methods.

The paper tackles federated learning over wireless fading channels with a limited number of orthogonal subcarriers. It proposes an age-aware edge-blind approach in which the parameter server selects a subset of model coordinates each round to reduce latency. Experimental results show that more antennas improve accuracy and convergence speed, and that AgeTop-k outperforms random selection under relatively good channel conditions.

We study federated learning (FL) over wireless fading channels where multiple devices simultaneously send their model updates. We propose an efficient age-aware edge-blind over-the-air FL approach that does not require channel state information (CSI) at the devices. Instead, the parameter server (PS) uses multiple antennas and applies maximum-ratio combining (MRC) based on its estimated sum of the channel gains to detect the parameter updates. A key challenge is that the number of orthogonal subcarriers is limited; thus, transmitting many parameters requires multiple Orthogonal Frequency Division Multiplexing (OFDM) symbols, which increases latency. To address this, the PS selects only a small subset of model coordinates each round using AgeTop-\(k\), which first picks the largest-magnitude entries and then chooses the \(k\) coordinates with the longest waiting times since they were last selected. This ensures that all selected parameters fit into a single OFDM symbol, reducing latency. We provide a convergence bound that highlights the advantages of using a higher number of antenna array elements and demonstrates a key trade-off: increasing \(k\) decreases compression error at the cost of increasing the effect of channel noise. Experimental results show that (i) more PS antennas greatly improve accuracy and convergence speed; (ii) AgeTop-\(k\) outperforms random selection under relatively good channel conditions; and (iii) the optimum \(k\) depends on the channel, with smaller \(k\) being better in noisy settings.
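The two-stage selection described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative reading of the abstract, not the authors' implementation. The function name `age_top_k`, the candidate-pool size `r`, the choice of which vector is ranked by magnitude, and the rule for updating ages are all assumptions introduced here.

```python
import numpy as np

def age_top_k(update, ages, r, k):
    """Hypothetical sketch of AgeTop-k coordinate selection.

    update: 1-D float array whose magnitudes are ranked (assumed input)
    ages:   1-D int array, rounds since each coordinate was last selected
    r:      size of the magnitude-based candidate pool (assumed parameter)
    k:      number of coordinates kept, e.g. what fits in one OFDM symbol
    Returns (selected indices in ascending order, updated ages).
    """
    if not (0 < k <= r <= update.size):
        raise ValueError("need 0 < k <= r <= len(update)")

    # Stage 1: indices of the r largest-magnitude entries.
    candidates = np.argpartition(np.abs(update), -r)[-r:]
    candidates = np.sort(candidates)  # fixed order so ties break by index

    # Stage 2: among the candidates, keep the k with the largest age.
    order = np.argsort(-ages[candidates], kind="stable")
    selected = np.sort(candidates[order[:k]])

    # Assumed age rule: every coordinate ages by one round,
    # and the selected coordinates reset to zero.
    new_ages = ages + 1
    new_ages[selected] = 0
    return selected, new_ages
```

For example, with `update = np.array([0.1, -5.0, 3.0, 0.2, -4.0, 1.0])`, `ages = np.array([9, 0, 2, 7, 5, 1])`, `r=3` and `k=2`, the candidate pool is coordinates 1, 2 and 4, and the two oldest of those, 2 and 4, are selected. Coordinate 0 has the largest age but is never considered because its magnitude is too small, which is the effect of ranking by magnitude first.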
