CL, Aug 7, 2023

Adapter-based Selective Knowledge Distillation for Federated Multi-domain Meeting Summarization

arXiv:2308.03275v1, 7 citations, h-index: 47
Originality: Incremental advance
AI Analysis

The paper addresses the problem of training summarization models on sensitive, distributed meeting data in real-world scenarios, but its advance is incremental because it builds on existing federated learning and adapter techniques.

The paper tackles federated learning for meeting summarization, addressing bandwidth costs and non-IID data, and proposes AdaFedSelecKD, which achieves performance comparable to centralized methods on the QMSum benchmark.

Meeting summarization has emerged as a promising technique for providing users with condensed summaries. However, existing work has focused on training models on centralized data, neglecting real-world scenarios where meeting data are infeasible to collect centrally because of their sensitive nature. This gap motivates us to explore federated learning for meeting summarization. Two critical challenges impede progress. First, state-of-the-art summarizers are based on parameter-heavy pre-trained models, and exchanging such a model's parameters across clients imposes large bandwidth costs. Second, because real-world meeting data belong to various domains and are distributed across clients, they are not independent and identically distributed (non-IID). Since the IID assumption does not hold, the learning algorithms that work best also change. To address these challenges, we propose Adapter-based Federated Selective Knowledge Distillation (AdaFedSelecKD) for training performant client models. Specifically, we develop an adapter-based summarization model in which two adapters cooperatively facilitate learning using fewer parameters, reducing communication costs. Then, we devise a selective knowledge distillation strategy that helps clients robustly handle domain-focused modelling on their own data while leveraging global parameters learned from non-IID data. Extensive experiments on the QMSum benchmark demonstrate that AdaFedSelecKD achieves performance comparable to powerful centralized training methods and show its generalizability and robustness.
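
The two ideas in the abstract, exchanging only adapter weights and distilling selectively from the global model, can be illustrated with a small sketch. This is not the paper's implementation. The function names, the entropy-based selection rule, the `max_entropy` and `alpha` parameters, and the FedAvg-style weighted average are all assumptions made for illustration; the abstract does not specify the selection criterion or the aggregation rule.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def selective_kd_loss(local_logits, global_logits, target,
                      max_entropy=0.5, alpha=0.5):
    """Loss for one token on a client (hypothetical selection rule).

    Always includes cross-entropy on the gold token. The distillation term
    KL(global || local) is added only when the global model's prediction is
    confident, i.e. its entropy is at most `max_entropy` (an assumption).
    """
    p_local = softmax(local_logits)
    p_global = softmax(global_logits)
    ce = -math.log(p_local[target])
    if entropy(p_global) > max_entropy:
        # Global teacher is uncertain for this token: skip distillation.
        return ce
    kl = sum(g * math.log(g / l)
             for g, l in zip(p_global, p_local) if g > 0.0)
    return (1.0 - alpha) * ce + alpha * kl


def average_adapters(client_adapters, client_sizes):
    """Weighted mean of adapter weights only (FedAvg-style, assumed).

    Each client uploads a small dict of adapter weights; the frozen
    backbone is never transmitted, which is what keeps bandwidth low.
    """
    total = sum(client_sizes)
    first = client_adapters[0]
    return {
        name: [
            sum(adapter[name][i] * n
                for adapter, n in zip(client_adapters, client_sizes)) / total
            for i in range(len(first[name]))
        ]
        for name in first
    }
```

In this sketch, a client would compute `selective_kd_loss` per target token during local training and upload only its adapter dictionary, which the server combines with `average_adapters`. Gating the distillation term on teacher confidence is one plausible way to keep a client from absorbing global knowledge that does not fit its own domain.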
