CVDec 11, 2024Code
Benchmarking Federated Learning for Semantic Datasets: Federated Scene Graph GenerationSeungBum Ha, Taehwan Lee, Jiyoun Lim et al.
Federated learning (FL) enables decentralized training while preserving data privacy, yet existing FL benchmarks address relatively simple classification tasks, where each sample is annotated with a one-hot label. However, little attention has been paid to demonstrating an FL benchmark that handles complicated semantics, where each sample encompasses diverse semantic information, such as relations between objects. Because the existing benchmarks are designed to distribute data in a narrow view of a single semantic, managing the complicated semantic heterogeneity across clients when formalizing FL benchmarks is non-trivial. In this paper, we propose a benchmark process to establish an FL benchmark with controllable semantic heterogeneity across clients: two key steps are (i) data clustering with semantics and (ii) data distributing via controllable semantic heterogeneity across clients. As a proof of concept, we construct a federated PSG benchmark, demonstrating the efficacy of the existing PSG methods in an FL setting with controllable semantic heterogeneity of scene graphs. We also present the effectiveness of our benchmark by applying robust federated learning algorithms to data heterogeneity to show increased performance. To our knowledge, this is the first benchmark framework that enables federated learning and its evaluation for multi-semantic vision tasks under the controlled semantic heterogeneity. Our code is available at https://github.com/Seung-B/FL-PSG.
LGJun 2, 2025
Unlearning's Blind Spots: Over-Unlearning and Prototypical Relearning AttackSeungBum Ha, Saerom Park, Sung Whan Yoon
Machine unlearning (MU) aims to expunge a designated forget set from a trained model without costly retraining, yet the existing techniques overlook two critical blind spots: "over-unlearning" that deteriorates retained data near the forget set, and post-hoc "relearning" attacks that aim to resurrect the forgotten knowledge. We first derive the over-unlearning metric OU@ε, which represents the collateral damage to the nearby region of the forget set, where the over-unlearning mainly appears. Next, we expose an unforeseen relearning threat on MU, i.e., the Prototypical Relearning Attack, which exploits the per-class prototype of the forget class with just a few samples, and easily restores the pre-unlearning performance. To counter both blind spots, we introduce Spotter, a plug-and-play objective that combines (i) a masked knowledge-distillation penalty on the nearby region of forget set to suppress OU@ε, and (ii) an intra-class dispersion loss that scatters forget-class embeddings, neutralizing prototypical relearning attacks. On CIFAR-10, as one of validations, Spotter reduces OU@εby below the 0.05X of the baseline, drives forget accuracy to 0%, preserves accuracy of the retain set within 1% of difference with the original, and denies the prototype-attack by keeping the forget set accuracy within <1%, without accessing retained data. It confirms that Spotter is a practical remedy of the unlearning's blind spots.