Daeyoung Choi

LG
6 papers
9 citations
Novelty 47%
AI Score 36

6 Papers

LG · May 23, 2019 · Code
DEEP-BO for Hyperparameter Optimization of Deep Networks

Hyunghun Cho, Yongjin Kim, Eunjung Lee et al.

The performance of deep neural networks (DNNs) is very sensitive to the particular choice of hyperparameters. To make matters worse, the shape of the learning curve can be significantly affected when a technique like batchnorm is used. As a result, hyperparameter optimization of deep networks can be much more challenging than that of traditional machine learning models. In this work, we start from well-known Bayesian Optimization solutions and provide enhancement strategies specifically designed for hyperparameter optimization of deep networks. The resulting algorithm is named DEEP-BO (Diversified, Early-termination-Enabled, and Parallel Bayesian Optimization). When evaluated over six DNN benchmarks, DEEP-BO outperforms or performs comparably to some of the well-known solutions, including GP-Hedge, Hyperband, BOHB, Median Stopping Rule, and Learning Curve Extrapolation. The code used is made publicly available at https://github.com/snu-adsl/DEEP-BO.

LG · Jul 6, 2025
TinyProto: Communication-Efficient Federated Learning with Sparse Prototypes in Resource-Constrained Environments

Gyuejeong Lee, Daeyoung Choi

Communication efficiency in federated learning (FL) remains a critical challenge for resource-constrained environments. While prototype-based FL reduces communication overhead by sharing class prototypes (mean activations in the penultimate layer) instead of model parameters, its efficiency decreases with larger feature dimensions and class counts. We propose TinyProto, which addresses these limitations through Class-wise Prototype Sparsification (CPS) and adaptive prototype scaling. CPS enables structured sparsity by allocating specific dimensions to class prototypes and transmitting only non-zero elements, while adaptive scaling adjusts prototypes based on class distributions. Our experiments show that TinyProto reduces communication costs by up to 4x compared to existing methods while maintaining performance. Beyond its communication efficiency, TinyProto offers two further advantages: it achieves compression without client-side computational overhead and it supports heterogeneous architectures, making it well suited to resource-constrained heterogeneous FL.
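The idea of sharing per-class prototypes and transmitting only the dimensions allocated to each class can be illustrated with a minimal sketch. The allocation rule used here (each class owns one contiguous, disjoint block of dimensions) and all function names are assumptions made for illustration; the abstract does not specify how TinyProto allocates dimensions, and adaptive scaling is not shown.

```python
def class_prototypes(features, labels):
    """Return the mean feature vector (prototype) of each class."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], f)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def allocated_dims(class_id, num_classes, dim):
    """Hypothetical allocation: class c owns one contiguous block of dimensions."""
    block = dim // num_classes
    return range(class_id * block, (class_id + 1) * block)


def sparsify(protos, num_classes, dim):
    """Keep only each class's allocated entries; this is the transmitted payload."""
    return {c: [p[i] for i in allocated_dims(c, num_classes, dim)]
            for c, p in protos.items()}


def densify(payload, num_classes, dim):
    """Receiver side: rebuild full-length prototypes, zero outside the block."""
    out = {}
    for c, vals in payload.items():
        full = [0.0] * dim
        for i, v in zip(allocated_dims(c, num_classes, dim), vals):
            full[i] = v
        out[c] = full
    return out


# Toy example: 3 samples, 4 feature dimensions, 2 classes.
features = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [0.0, 0.0, 2.0, 6.0]]
labels = [0, 0, 1]
protos = class_prototypes(features, labels)
payload = sparsify(protos, num_classes=2, dim=4)
restored = densify(payload, num_classes=2, dim=4)
values_sent = sum(len(v) for v in payload.values())
```

In this toy case the payload holds 4 values instead of the 8 in the dense prototypes. The sketch masks prototypes after the fact; a real system would presumably train the model so that each class's prototype is concentrated in its own dimensions.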

LG · Jul 6, 2025
Heterogeneous Federated Learning with Prototype Alignment and Upscaling

Gyuejeong Lee, Jihwan Shin, Daeyoung Choi

Heterogeneity in data distributions and model architectures remains a significant challenge in federated learning (FL). Various heterogeneous FL (HtFL) approaches have recently been proposed to address this challenge. Among them, prototype-based FL (PBFL) has emerged as a practical framework that shares only per-class mean activations from the penultimate layer. However, PBFL approaches often suffer from suboptimal prototype separation, limiting their discriminative power. We propose Prototype Normalization (ProtoNorm), a novel PBFL framework that addresses this limitation through two key components: Prototype Alignment (PA) and Prototype Upscaling (PU). The PA method draws inspiration from the Thomson problem in classical physics, optimizing global prototype configurations on a unit sphere to maximize angular separation; subsequently, the PU method increases prototype magnitudes to enhance separation in Euclidean space. Extensive evaluations on benchmark datasets show that our approach better separates prototypes and thus consistently outperforms existing HtFL approaches. Notably, since ProtoNorm inherits the communication efficiency of PBFL and PA is performed server-side, it is particularly suitable for resource-constrained environments.
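A Thomson-style alignment followed by upscaling can be sketched as follows. This is only an illustration under stated assumptions: the Coulomb-like energy (sum of inverse pairwise distances), the projected gradient descent loop, the step size, the iteration count, and the scale factor of 10 are all choices made here, not details taken from the abstract.

```python
import math


def normalize(v):
    """Project a vector onto the unit sphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points."""
    return min(math.dist(p, q)
               for i, p in enumerate(points) for q in points[i + 1:])


def align_on_sphere(points, steps=500, lr=0.05):
    """Spread points on the unit sphere by descending the energy sum(1 / d_ij).

    Each step moves every point along the repulsive force from the others,
    then projects it back onto the sphere (assumed optimizer, for illustration).
    """
    pts = [normalize(p) for p in points]
    for _ in range(steps):
        new_pts = []
        for i, p in enumerate(pts):
            force = [0.0] * len(p)
            for j, q in enumerate(pts):
                if i == j:
                    continue
                d = math.dist(p, q)
                for k in range(len(p)):
                    force[k] += (p[k] - q[k]) / d ** 3
            new_pts.append(normalize([pk + lr * fk for pk, fk in zip(p, force)]))
        pts = new_pts
    return pts


def upscale(points, factor):
    """Multiply every prototype by a common factor to enlarge Euclidean gaps."""
    return [[factor * x for x in p] for p in points]


# Four toy prototypes in 3 dimensions (hypothetical starting configuration).
start = [normalize(p) for p in [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]]
aligned = align_on_sphere(start)
scaled = upscale(aligned, factor=10.0)
```

Alignment changes only directions, so it affects angular separation; upscaling leaves the angles unchanged and multiplies every pairwise Euclidean distance by the same factor.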

LG · Jun 12, 2024
Class-Wise Federated Averaging for Efficient Personalization

Gyuejeong Lee, Daeyoung Choi

Federated learning (FL) enables collaborative model training across distributed clients without centralizing data. However, existing approaches such as Federated Averaging (FedAvg) often perform poorly with heterogeneous data distributions, failing to achieve personalization owing to their inability to capture class-specific information effectively. We propose Class-wise Federated Averaging (cwFedAvg), a novel personalized FL (PFL) framework that performs Federated Averaging for each class, to overcome the personalization limitations of FedAvg. cwFedAvg creates class-specific global models via weighted aggregation of local models using class distributions, and subsequently combines them to generate personalized local models. We further propose Weight Distribution Regularizer (WDR), which encourages deep networks to encode class-specific information efficiently by aligning empirical and approximated class distributions derived from output layer weights, to facilitate effective class-wise aggregation. Our experiments demonstrate the superior performance of cwFedAvg with WDR over existing PFL methods through efficient personalization while maintaining the communication cost of FedAvg and avoiding additional local training and pairwise computations.
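The two aggregation steps described above (one global model per class, then a per-client combination) can be sketched with flat parameter lists. The weighting rules are assumptions: here each client's weight for class c is its share of all class-c samples, and each client mixes the class-specific models by its own class proportions. The sketch uses true class counts, whereas the abstract describes distributions approximated from output layer weights via WDR, which is not shown.

```python
def class_wise_fedavg(local_weights, class_counts):
    """Build one global model per class.

    local_weights[k]   : flat list of parameters from client k
    class_counts[k][c] : number of class-c samples on client k
    Assumes every class appears on at least one client.
    """
    num_clients = len(local_weights)
    num_classes = len(class_counts[0])
    dim = len(local_weights[0])
    global_models = []
    for c in range(num_classes):
        total = sum(class_counts[k][c] for k in range(num_clients))
        model = [0.0] * dim
        for k in range(num_clients):
            share = class_counts[k][c] / total  # client k's share of class c
            for i in range(dim):
                model[i] += share * local_weights[k][i]
        global_models.append(model)
    return global_models


def personalize(global_models, client_counts):
    """Mix class-specific models using one client's own class proportions."""
    n = sum(client_counts)
    dim = len(global_models[0])
    model = [0.0] * dim
    for c, gm in enumerate(global_models):
        for i in range(dim):
            model[i] += (client_counts[c] / n) * gm[i]
    return model


# Two clients with mirrored class imbalance and 2-parameter toy models.
local_weights = [[1.0, 0.0], [0.0, 1.0]]
class_counts = [[90, 10], [10, 90]]
global_models = class_wise_fedavg(local_weights, class_counts)
personal_0 = personalize(global_models, class_counts[0])
```

In this toy case client 0's personalized model stays closer to its own local parameters than a plain average of the two clients would, because the class-0 global model is dominated by client 0.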

LG · Nov 8, 2018
Statistical Characteristics of Deep Representations: An Empirical Investigation

Daeyoung Choi, Kyungeun Lee, Duhun Hwang et al.

In this study, the effects of eight representation regularization methods are investigated, including two newly developed rank regularizers (RR). The investigation shows that the statistical characteristics of representations such as correlation, sparsity, and rank can be manipulated as intended, during training. Furthermore, it is possible to improve the baseline performance simply by trying all the representation regularizers and fine-tuning the strength of their effects. In contrast to performance improvement, no consistent relationship between performance and statistical characteristics was observable. The results indicate that manipulation of statistical characteristics can be helpful for improving performance, but only indirectly through its influence on learning dynamics or its tuning effects.

LG · Sep 25, 2018
Utilizing Class Information for Deep Network Representation Shaping

Daeyoung Choi, Wonjong Rhee

Statistical characteristics of deep network representations, such as sparsity and correlation, are known to be relevant to the performance and interpretability of deep learning. When a statistical characteristic is desired, often an adequate regularizer can be designed and applied during the training phase. Typically, such a regularizer aims to manipulate a statistical characteristic over all classes together. For classification tasks, however, it might be advantageous to enforce the desired characteristic per class such that different classes can be better distinguished. Motivated by this idea, we design two class-wise regularizers that explicitly utilize class information: class-wise Covariance Regularizer (cw-CR) and class-wise Variance Regularizer (cw-VR). cw-CR aims to reduce the covariance of representations calculated from same-class samples to encourage feature independence. cw-VR is similar, but it targets variance instead of covariance to improve feature compactness. For the sake of completeness, their counterparts that do not use class information, Covariance Regularizer (CR) and Variance Regularizer (VR), are considered together. The four regularizers are conceptually simple and computationally very efficient, and the visualization shows that the regularizers indeed perform distinct representation shaping. In terms of classification performance, significant improvements over the baseline and L1/L2 weight regularization methods were found for 21 out of 22 tasks over popular benchmark datasets. In particular, cw-VR achieved the best performance for 13 tasks including ResNet-32/110.
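The difference between a variance penalty over all samples and a class-wise one can be seen in a small sketch. The normalization used here (mean per-unit variance, averaged over classes) and the function names are assumptions; the abstract does not give the exact formulas. In training, such a term would be added to the loss with a tunable coefficient inside an autodiff framework.

```python
def unit_variances(rows):
    """Population variance of each unit (column) across the given samples."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(dim)]
    return [sum((r[i] - means[i]) ** 2 for r in rows) / n for i in range(dim)]


def variance_penalty(activations):
    """VR-style term: mean per-unit variance over all samples together."""
    v = unit_variances(activations)
    return sum(v) / len(v)


def class_wise_variance_penalty(activations, labels):
    """cw-VR-style term: the same penalty within each class, averaged over classes."""
    groups = {}
    for a, y in zip(activations, labels):
        groups.setdefault(y, []).append(a)
    return sum(variance_penalty(g) for g in groups.values()) / len(groups)


# Two tight, well-separated clusters of 2-unit activations.
activations = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]]
labels = [0, 0, 1, 1]
vr = variance_penalty(activations)
cw_vr = class_wise_variance_penalty(activations, labels)
```

Here the all-sample penalty is large because it also counts the gap between the two classes, while the class-wise penalty is small because each class is already compact. Minimizing the class-wise term therefore tightens each class without pulling different classes together.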