CAFL-L: Constraint-Aware Federated Learning with Lagrangian Dual Optimization for On-Device Language Models
This work addresses the challenge of deploying language models on edge devices with limited resources. It is an incremental improvement over standard federated learning methods.
The paper tackles the problem of training language models on resource-constrained edge devices by introducing CAFL-L, a federated learning method that incorporates device-level constraints such as energy and memory. Compared with standard FedAvg, it reduces memory usage by 20% and communication by 95% while maintaining competitive validation performance.
We introduce Constraint-Aware Federated Learning with Lagrangian Dual Optimization (CAFL-L), a principled extension of FedAvg that explicitly incorporates device-level resource constraints including energy, communication, memory, and thermal budgets. CAFL-L employs Lagrangian dual optimization to dynamically adapt training hyperparameters -- freezing depth, local steps, batch size, and communication compression -- while preserving training stability through token-budget preservation via gradient accumulation. Experiments on a character-level language model demonstrate that CAFL-L achieves superior constraint satisfaction compared to standard FedAvg (reducing memory usage by 20% and communication by 95%) while maintaining competitive validation performance, making it practical for deployment on resource-constrained edge devices.
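To make the two mechanisms named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a projected dual-ascent update on one multiplier per resource, driven by the normalized budget violation, and a simple ceiling rule for choosing gradient-accumulation steps so that the number of tokens per optimizer update does not drop when the batch size is reduced. The function names, the step size, and the example usage and budget values are all hypothetical; the exact update rules used by CAFL-L are not stated in this section.

```python
import math


def dual_ascent_step(lambdas, usage, budgets, step_size=0.5):
    """One projected dual-ascent update (assumed form, not taken from the paper).

    A multiplier grows when its resource exceeds the budget and shrinks
    toward zero otherwise; projection keeps every multiplier non-negative.
    """
    updated = {}
    for name, lam in lambdas.items():
        violation = usage[name] / budgets[name] - 1.0  # > 0 means over budget
        updated[name] = max(0.0, lam + step_size * violation)
    return updated


def accumulation_steps(base_batch, batch):
    """Gradient-accumulation steps that keep tokens per optimizer update at or
    above the baseline when the per-step batch size is reduced (sequence
    length is assumed fixed)."""
    return max(1, math.ceil(base_batch / batch))


# Hypothetical measurements for one round, each normalized to a budget of 1.0.
lambdas = {"energy": 0.0, "comm": 0.0, "memory": 0.0, "thermal": 0.0}
usage = {"energy": 1.2, "comm": 3.0, "memory": 0.9, "thermal": 1.0}
budgets = {"energy": 1.0, "comm": 1.0, "memory": 1.0, "thermal": 1.0}

lambdas = dual_ascent_step(lambdas, usage, budgets)

# A server could then map large multipliers to cheaper settings, e.g. a
# smaller batch size, and compensate with gradient accumulation.
steps = accumulation_steps(base_batch=32, batch=8)
```

In this sketch the communication multiplier rises fastest because communication is furthest over budget, while the memory multiplier stays at zero because memory is within budget. How multipliers translate into freezing depth, local steps, batch size, and compression level is left out, since that mapping is specific to the method.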