Hardware-Aware Federated Learning for Speech Emotion Recognition
For practitioners deploying federated learning on heterogeneous edge devices, this work offers a method to reduce training time and communication cost while maintaining accuracy.
This paper proposes a hardware-aware federated learning framework for speech emotion recognition that integrates hardware profiling, top-K client selection, and adaptive local epochs. The method achieves competitive validation accuracy (0.352), reduces training time by 36.5%, and lowers communication cost by 40% compared to FedAvg.
Federated learning (FL) enables privacy-preserving collaborative training across distributed edge devices, but real deployments involve heterogeneous clients that differ in processing power, memory capacity, and communication latency, which often increases round duration and system cost. This paper proposes a hardware-aware federated learning framework for speech emotion recognition on session-partitioned IEMOCAP that integrates hardware profiling, top-K client selection, and adaptive local epochs within a unified training loop. We compare the method against FedAvg, FedProx, and random top-K selection under a non-IID setup. Across 50 federated rounds and 5 independent trials, the proposed approach achieves competitive validation accuracy (0.352), reduces total training time by about 36.5%, and lowers cumulative communication cost by 40%, both relative to FedAvg.
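To make the three components named in the abstract concrete, the following is a minimal sketch of how hardware profiling, top-K client selection, and adaptive local epochs could fit into one round of planning, followed by standard sample-weighted averaging. It is not the paper's implementation: the `ClientProfile` fields, the scoring weights in `hardware_score`, the 8 GB and 100 ms normalisation constants, and the epoch bounds are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class ClientProfile:
    """Hypothetical hardware profile reported by an edge client."""
    client_id: int
    compute: float     # relative processing power in [0, 1], higher is better
    memory_gb: float   # available memory in GB
    latency_ms: float  # communication latency in ms, lower is better


def hardware_score(p, w_compute=0.5, w_memory=0.2, w_latency=0.3):
    """Assumed scoring rule: weighted sum of normalised capabilities."""
    memory_term = min(p.memory_gb / 8.0, 1.0)          # saturate at 8 GB
    latency_term = 1.0 / (1.0 + p.latency_ms / 100.0)  # 1.0 at zero latency
    return (w_compute * p.compute
            + w_memory * memory_term
            + w_latency * latency_term)


def select_top_k(profiles, k):
    """Keep the k clients with the highest hardware score."""
    return sorted(profiles, key=hardware_score, reverse=True)[:k]


def adaptive_epochs(p, best_score, min_epochs=1, max_epochs=5):
    """Scale local epochs with the client's score relative to the best one."""
    ratio = hardware_score(p) / best_score
    return max(min_epochs, min(max_epochs, round(max_epochs * ratio)))


def plan_round(profiles, k):
    """Return {client_id: local_epochs} for the selected clients."""
    selected = select_top_k(profiles, k)
    best = hardware_score(selected[0])
    return {p.client_id: adaptive_epochs(p, best) for p in selected}


def fedavg(updates):
    """Sample-weighted average of (weights, num_samples) pairs, as in FedAvg."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
```

Selecting only K clients per round reduces the number of model uploads and downloads, and giving slower devices fewer local epochs shortens the time the server waits for the slowest participant. Those are the two effects the abstract reports as lower communication cost and shorter training time, although the exact mechanism and constants in the paper may differ from this sketch.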