CL AI DB LG
May 26, 2023

Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms

Tianshu Zhang, Changchang Liu, Wei-Han Lee, Yu Su, Huan Sun

arXiv:2305.17221v1 26.4224 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of training data-hungry neural semantic parsers for clients with limited data in a federated setting, though it is incremental as it builds on existing FL algorithms.

The paper studies federated learning for semantic parsing, proposing an evaluation setup and a re-weighting mechanism (Lorar) that addresses client heterogeneity. Applied to FedAvg, FedOPT and FedProx, Lorar yields absolute gains of 4%-20% on average under MacroAvg and faster convergence for almost all clients.

This paper studies a new task of federated learning (FL) for semantic parsing, where multiple clients collaboratively train one global model without sharing their semantic parsing data. By leveraging data from multiple clients, the FL paradigm can be especially beneficial for clients that have little training data to develop a data-hungry neural semantic parser on their own. We propose an evaluation setup to study this task, where we re-purpose widely-used single-domain text-to-SQL datasets as clients to form a realistic heterogeneous FL setting and collaboratively train a global model. As standard FL algorithms suffer from the high client heterogeneity in our realistic setup, we further propose a novel LOss Reduction Adjusted Re-weighting (Lorar) mechanism to mitigate the performance degradation, which adjusts each client's contribution to the global model update based on its training loss reduction during each round. Our intuition is that the larger the loss reduction, the further away the current global model is from the client's local optimum, and the larger weight the client should get. By applying Lorar to three widely adopted FL algorithms (FedAvg, FedOPT and FedProx), we observe that their performance can be improved substantially on average (4%-20% absolute gain under MacroAvg) and that clients with smaller datasets enjoy larger performance gains. In addition, the global model converges faster for almost all the clients.
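The re-weighting idea in the abstract can be illustrated with a minimal sketch. The abstract gives only the intuition (a larger loss reduction earns a larger weight), so the formula here is an assumption: each client's weight is taken as proportional to its dataset size times its training loss reduction in the round. The function names `loss_reduction_weights` and `aggregate`, the clipping of negative reductions to zero, and the fallback to size-based weights are all hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import Dict, List


def loss_reduction_weights(
    sizes: List[int],
    loss_before: List[float],
    loss_after: List[float],
) -> List[float]:
    """Return one aggregation weight per client, summing to 1.

    Assumption: weight_i is proportional to size_i * (loss_before_i - loss_after_i).
    """
    # Loss reduction achieved by each client during this round's local training.
    # Negative reductions are clipped to zero (an illustrative choice).
    reductions = [max(b - a, 0.0) for b, a in zip(loss_before, loss_after)]
    scores = [n * r for n, r in zip(sizes, reductions)]
    total = sum(scores)
    if total <= 0.0:
        # No client reduced its loss: fall back to size-proportional weights.
        total_size = sum(sizes)
        return [n / total_size for n in sizes]
    return [s / total for s in scores]


def aggregate(
    client_params: List[Dict[str, List[float]]],
    weights: List[float],
) -> Dict[str, List[float]]:
    """Weighted average of client parameters, stored as name -> flat list."""
    merged: Dict[str, List[float]] = {}
    for name, values in client_params[0].items():
        merged[name] = [
            sum(w * params[name][i] for w, params in zip(weights, client_params))
            for i in range(len(values))
        ]
    return merged


# Two equally sized clients; the first reduced its loss twice as much,
# so it receives twice the weight in the global update.
weights = loss_reduction_weights([100, 100], [2.0, 2.0], [1.0, 1.5])
global_params = aggregate([{"w": [0.0, 3.0]}, {"w": [3.0, 0.0]}], weights)
```

In this sketch the re-weighting only changes the aggregation step, which is consistent with the abstract's statement that Lorar is applied on top of existing FL algorithms; how it interacts with the server optimizer in FedOPT or the proximal term in FedProx is not covered here.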
