Training Speech Recognition Models with Federated Learning: A Quality/Cost Framework
This work addresses the challenge of non-IID data in federated learning for speech recognition and offers a practical approach to decentralized training, though the contribution appears incremental because it builds on existing federated learning methods.
The paper tackles the problem of training speech recognition models with federated learning. It proposes a framework for varying how non-IID the data distributions are, shows a trade-off between model quality and computational cost, and demonstrates that hyper-parameter optimization and variational noise can compensate for the quality impact while reducing cost.
We propose using federated learning, a decentralized on-device learning paradigm, to train speech recognition models. Because training epochs are performed on a per-user basis, federated learning must contend with non-IID data distributions, which are expected to degrade the quality of the trained model. We propose a framework by which the degree of non-IID-ness can be varied, and use it to illustrate a trade-off between model quality and the computational cost of federated training, which we capture through a novel metric. Finally, we demonstrate that hyper-parameter optimization and appropriate use of variational noise are sufficient to compensate for the quality impact of non-IID distributions, while decreasing the cost.
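To make the idea of a tunable degree of non-IID-ness concrete, the following is a minimal sketch and not the framework described in the paper. It assumes a hypothetical knob, `iid_fraction`, that reassigns each utterance to a random client with the given probability; the function names, the knob, and the `speaker_purity` measure are all illustrative assumptions.

```python
import random
from collections import defaultdict


def make_client_batches(examples, iid_fraction, seed=0):
    """Assign (speaker_id, utterance) pairs to clients.

    Hypothetical knob (an assumption, not the paper's method):
      iid_fraction = 0.0 -> every client holds only its own speaker's
                            utterances (fully per-user, non-IID).
      iid_fraction = 1.0 -> every utterance goes to a uniformly random
                            client (approximately IID).
    """
    rng = random.Random(seed)
    speakers = sorted({speaker for speaker, _ in examples})
    clients = defaultdict(list)
    for speaker, utterance in examples:
        if rng.random() < iid_fraction:
            target = rng.choice(speakers)  # shuffle across clients
        else:
            target = speaker               # keep data on its own client
        clients[target].append((speaker, utterance))
    return dict(clients)


def speaker_purity(clients):
    """Fraction of utterances stored on the client of their own speaker."""
    total = sum(len(items) for items in clients.values())
    own = sum(
        1
        for client_id, items in clients.items()
        for speaker, _ in items
        if speaker == client_id
    )
    return own / total


# Toy data: 4 speakers, 100 utterances each.
examples = [(f"spk{i % 4}", f"utt{i}") for i in range(400)]
non_iid = make_client_batches(examples, iid_fraction=0.0)
mixed = make_client_batches(examples, iid_fraction=1.0)
```

Sweeping `iid_fraction` between 0 and 1 would give a family of client partitions on which model quality and training cost could be compared, in the spirit of the trade-off described above.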