Jinu Gong

LG
h-index9
15papers
93citations
Novelty39%
AI Score47

15 Papers

LGSep 14, 2022
Compressed Particle-Based Federated Bayesian Learning and Unlearning

Jinu Gong, Osvaldo Simeone, Joonhyuk Kang

Conventional frequentist FL schemes are known to yield overconfident decisions. Bayesian FL addresses this issue by allowing agents to process and exchange uncertainty information encoded in distributions over the model parameters. However, this comes at the cost of a larger per-iteration communication overhead. This letter investigates whether Bayesian FL can still provide advantages in terms of calibration when constraining communication bandwidth. We present compressed particle-based Bayesian FL protocols for FL and federated "unlearning" that apply quantization and sparsification across multiple particles. The experimental results confirm that the benefits of Bayesian FL are robust to bandwidth constraints.

LGJan 30
Beyond Fixed Rounds: Data-Free Early Stopping for Practical Federated Learning

Youngjoon Lee, Hyukjoon Lee, Seungrok Jung et al.

Federated Learning (FL) facilitates decentralized collaborative learning without transmitting raw data. However, reliance on fixed global rounds or validation data for hyperparameter tuning hinders practical deployment by incurring high computational costs and privacy risks. To address this, we propose a data-free early stopping framework that determines the optimal stopping point by monitoring the task vector's growth rate using solely server-side parameters. The numerical results on skin lesion/blood cell classification demonstrate that our approach is comparable to validation-based early stopping across various state-of-the-art FL methods. In particular, the proposed framework requires an average of 45/12 (skin lesion/blood cell) additional rounds to achieve over 12.3%/8.9% higher performance than early stopping based on validation data. To the best of our knowledge, this is the first work to propose an data-free early stopping framework for FL methods.

LGNov 14, 2025
When to Stop Federated Learning: Zero-Shot Generation of Synthetic Validation Data with Generative AI for Early Stopping

Youngjoon Lee, Hyukjoon Lee, Jinu Gong et al.

Federated Learning (FL) enables collaborative model training across decentralized devices while preserving data privacy. However, FL methods typically run for a predefined number of global rounds, often leading to unnecessary computation when optimal performance is reached earlier. In addition, training may continue even when the model fails to achieve meaningful performance. To address this inefficiency, we introduce a zero-shot synthetic validation framework that leverages generative AI to monitor model performance and determine early stopping points. Our approach adaptively stops training near the optimal round, thereby conserving computational resources and enabling rapid hyperparameter adjustments. Numerical results on multi-label chest X-ray classification demonstrate that our method reduces training rounds by up to 74% while maintaining accuracy within 1% of the optimal.

LGNov 3, 2025
CG-FKAN: Compressed-Grid Federated Kolmogorov-Arnold Networks for Communication Constrained Environment

Seunghun Yu, Youngjoon Lee, Jinu Gong et al.

Federated learning (FL), widely used in privacy-critical applications, suffers from limited interpretability, whereas Kolmogorov-Arnold Networks (KAN) address this limitation via learnable spline functions. However, existing FL studies applying KAN overlook the communication overhead introduced by grid extension, which is essential for modeling complex functions. In this letter, we propose CG-FKAN, which compresses extended grids by sparsifying and transmitting only essential coefficients under a communication budget. Experiments show that CG-FKAN achieves up to 13.6% lower RMSE than fixed-grid KAN in communication-constrained settings. In addition, we derive a theoretical upper bound on its approximation error.

LGOct 6, 2025Code
Forecasting-Based Biomedical Time-series Data Synthesis for Open Data and Robust AI

Youngjoon Lee, Seongmin Cho, Yehhyun Jo et al.

The limited data availability due to strict privacy regulations and significant resource demands severely constrains biomedical time-series AI development, which creates a critical gap between data requirements and accessibility. Synthetic data generation presents a promising solution by producing artificial datasets that maintain the statistical properties of real biomedical time-series data without compromising patient confidentiality. We propose a framework for synthetic biomedical time-series data generation based on advanced forecasting models that accurately replicates complex electrophysiological signals such as EEG and EMG with high fidelity. These synthetic datasets preserve essential temporal and spectral properties of real data, which enables robust analysis while effectively addressing data scarcity and privacy challenges. Our evaluations across multiple subjects demonstrate that the generated synthetic data can serve as an effective substitute for real data and also significantly boost AI model performance. The approach maintains critical biomedical features while provides high scalability for various applications and integrates seamlessly into open-source repositories, substantially expanding resources for AI-driven biomedical research.

LGJan 30, 2025
Exploring Potential Prompt Injection Attacks in Federated Military LLMs and Their Mitigation

Youngjoon Lee, Taehyun Park, Yunho Lee et al.

Federated Learning (FL) is increasingly being adopted in military collaborations to develop Large Language Models (LLMs) while preserving data sovereignty. However, prompt injection attacks-malicious manipulations of input prompts-pose new threats that may undermine operational security, disrupt decision-making, and erode trust among allies. This perspective paper highlights four potential vulnerabilities in federated military LLMs: secret data leakage, free-rider exploitation, system disruption, and misinformation spread. To address these potential risks, we propose a human-AI collaborative framework that introduces both technical and policy countermeasures. On the technical side, our framework uses red/blue team wargaming and quality assurance to detect and mitigate adversarial behaviors of shared LLM weights. On the policy side, it promotes joint AI-human policy development and verification of security protocols. Our findings will guide future research and emphasize proactive strategies for emerging military contexts.

LGFeb 27, 2025
Revisit the Stability of Vanilla Federated Learning Under Diverse Conditions

Youngjoon Lee, Jinu Gong, Sun Choi et al.

Federated Learning (FL) is a distributed machine learning paradigm enabling collaborative model training across decentralized clients while preserving data privacy. In this paper, we revisit the stability of the vanilla FedAvg algorithm under diverse conditions. Despite its conceptual simplicity, FedAvg exhibits remarkably stable performance compared to more advanced FL techniques. Our experiments assess the performance of various FL methods on blood cell and skin lesion classification tasks using Vision Transformer (ViT). Additionally, we evaluate the impact of different representative classification models and analyze sensitivity to hyperparameter variations. The results consistently demonstrate that, regardless of dataset, classification model employed, or hyperparameter settings, FedAvg maintains robust performance. Given its stability, robust performance without the need for extensive hyperparameter tuning, FedAvg is a safe and efficient choice for FL deployments in resource-constrained hospitals handling medical data. These findings underscore the enduring value of the vanilla FedAvg approach as a trusted baseline for clinical practice.

LGApr 28, 2025
A Unified Benchmark of Federated Learning with Kolmogorov-Arnold Networks for Medical Imaging

Youngjoon Lee, Jinu Gong, Joonhyuk Kang

Federated Learning (FL) enables model training across decentralized devices without sharing raw data, thereby preserving privacy in sensitive domains like healthcare. In this paper, we evaluate Kolmogorov-Arnold Networks (KAN) architectures against traditional MLP across six state-of-the-art FL algorithms on a blood cell classification dataset. Notably, our experiments demonstrate that KAN can effectively replace MLP in federated environments, achieving superior performance with simpler architectures. Furthermore, we analyze the impact of key hyperparameters-grid size and network architecture-on KAN performance under varying degrees of Non-IID data distribution. In addition, our ablation studies reveal that optimizing KAN width while maintaining minimal depth yields the best performance in federated settings. As a result, these findings establish KAN as a promising alternative for privacy-preserving medical imaging applications in distributed healthcare. To the best of our knowledge, this is the first comprehensive benchmark of KAN in FL settings for medical imaging task.

LGJul 26, 2025
Debunking Optimization Myths in Federated Learning for Medical Image Classification

Youngjoon Lee, Hyukjoon Lee, Jinu Gong et al.

Federated Learning (FL) is a collaborative learning method that enables decentralized model training while preserving data privacy. Despite its promise in medical imaging, recent FL methods are often sensitive to local factors such as optimizers and learning rates, limiting their robustness in practical deployments. In this work, we revisit vanilla FL to clarify the impact of edge device configurations, benchmarking recent FL methods on colorectal pathology and blood cell classification task. We numerically show that the choice of local optimizer and learning rate has a greater effect on performance than the specific FL method. Moreover, we find that increasing local training epochs can either enhance or impair convergence, depending on the FL method. These findings indicate that appropriate edge-specific configuration is more crucial than algorithmic complexity for achieving effective FL.

LGAug 12, 2025
Resource-Aware Aggregation and Sparsification in Heterogeneous Ensemble Federated Learning

Keumseo Ryum, Jinu Gong, Joonhyuk Kang

Federated learning (FL) enables distributed training with private client data, but its convergence is hindered by system heterogeneity under realistic communication scenarios. Most FL schemes addressing system heterogeneity utilize global pruning or ensemble distillation, yet often overlook typical constraints required for communication efficiency. Meanwhile, deep ensembles can aggregate predictions from individually trained models to improve performance, but current ensemble-based FL methods fall short in fully capturing diversity of model predictions. In this work, we propose \textbf{SHEFL}, a global ensemble-based FL framework suited for clients with diverse computational capacities. We allocate different numbers of global models to clients based on their available resources. We introduce a novel aggregation scheme that mitigates the training bias between clients and dynamically adjusts the sparsification ratio across clients to reduce the computational burden of training deep ensembles. Extensive experiments demonstrate that our method effectively addresses computational heterogeneity, significantly improving accuracy and stability compared to existing approaches.

LGMay 9, 2025
Improving Generalizability of Kolmogorov-Arnold Networks via Error-Correcting Output Codes

Youngjoon Lee, Jinu Gong, Joonhyuk Kang

Kolmogorov-Arnold Networks (KAN) offer universal function approximation using univariate spline compositions without nonlinear activations. In this work, we integrate Error-Correcting Output Codes (ECOC) into the KAN framework to transform multi-class classification into multiple binary tasks, improving robustness via Hamming distance decoding. Our proposed KAN with ECOC framework outperforms vanilla KAN on a challenging blood cell classification dataset, achieving higher accuracy across diverse hyperparameter settings. Ablation studies further confirm that ECOC consistently enhances performance across FastKAN and FasterKAN variants. These results demonstrate that ECOC integration significantly boosts KAN generalizability in critical healthcare AI applications. To the best of our knowledge, this is the first work of ECOC with KAN for enhancing multi-class medical image classification performance.

LGNov 15, 2024
Embedding Byzantine Fault Tolerance into Federated Learning via Consistency Scoring

Youngjoon Lee, Jinu Gong, Joonhyuk Kang

Given sufficient data from multiple edge devices, federated learning (FL) enables training a shared model without transmitting private data to the central server. However, FL is generally vulnerable to Byzantine attacks from compromised edge devices, which can significantly degrade the model performance. In this work, we propose an intuitive plugin that seamlessly embeds Byzantine resilience into existing FL methods. The key idea is to generate virtual data samples and evaluate model consistency scores across local updates to effectively filter out compromised updates. By utilizing this scoring mechanism before the aggregation phase, the proposed plugin enables existing FL methods to become robust against Byzantine attacks while maintaining their original benefits. Numerical results on blood cell classification task demonstrate that the proposed plugin provides strong Byzantine resilience. In detail, plugin-attached FedAvg achieves over 89.6% test accuracy under 30% targeted attacks (vs.19.5% w/o plugin) and maintains 65-70% test accuracy under untargeted attacks (vs.17-19% w/o plugin).

LGOct 31, 2024
Generative AI-Powered Plugin for Robust Federated Learning in Heterogeneous IoT Networks

Youngjoon Lee, Jinu Gong, Joonhyuk Kang

Federated learning enables edge devices to collaboratively train a global model while maintaining data privacy by keeping data localized. However, the Non-IID nature of data distribution across devices often hinders model convergence and reduces performance. In this paper, we propose a novel plugin for federated optimization techniques that approximates Non-IID data distributions to IID through generative AI-enhanced data augmentation and balanced sampling strategy. Key idea is to synthesize additional data for underrepresented classes on each edge device, leveraging generative AI to create a more balanced dataset across the FL network. Additionally, a balanced sampling approach at the central server selectively includes only the most IID-like devices, accelerating convergence while maximizing the global model's performance. Experimental results validate that our approach significantly improves convergence speed and robustness against data imbalance, establishing a flexible, privacy-preserving FL plugin that is applicable even in data-scarce environments.

LGNov 23, 2021
Forget-SVGD: Particle-Based Bayesian Federated Unlearning

Jinu Gong, Osvaldo Simeone, Rahif Kassab et al.

Variational particle-based Bayesian learning methods have the advantage of not being limited by the bias affecting more conventional parametric techniques. This paper proposes to leverage the flexibility of non-parametric Bayesian approximate inference to develop a novel Bayesian federated unlearning method, referred to as Forget-Stein Variational Gradient Descent (Forget-SVGD). Forget-SVGD builds on SVGD - a particle-based approximate Bayesian inference scheme using gradient-based deterministic updates - and on its distributed (federated) extension known as Distributed SVGD (DSVGD). Upon the completion of federated learning, as one or more participating agents request for their data to be "forgotten", Forget-SVGD carries out local SVGD updates at the agents whose data need to be "unlearned", which are interleaved with communication rounds with a parameter server. The proposed method is validated via performance comparisons with non-parametric schemes that train from scratch by excluding data to be forgotten, as well as with existing parametric Bayesian unlearning methods.

LGApr 8, 2021
Bayesian Variational Federated Learning and Unlearning in Decentralized Networks

Jinu Gong, Osvaldo Simeone, Joonhyuk Kang

Federated Bayesian learning offers a principled framework for the definition of collaborative training algorithms that are able to quantify epistemic uncertainty and to produce trustworthy decisions. Upon the completion of collaborative training, an agent may decide to exercise her legal "right to be forgotten", which calls for her contribution to the jointly trained model to be deleted and discarded. This paper studies federated learning and unlearning in a decentralized network within a Bayesian framework. It specifically develops federated variational inference (VI) solutions based on the decentralized solution of local free energy minimization problems within exponential-family models and on local gossip-driven communication. The proposed protocols are demonstrated to yield efficient unlearning mechanisms.