DATA-ANJan 29
Comparison of Image Processing Models in Quark Gluon Jet ClassificationDaeun Kim, Jiwon Lee, Wonjun Jeong et al.
We present a comprehensive comparison of convolutional and transformer-based models for distinguishing quark and gluon jets using simulated jet images from Pythia 8. By encoding jet substructure into a three-channel representation of particle kinematics, we evaluate the performance of convolutional neural networks (CNNs), Vision Transformers (ViTs), and Swin Transformers (Swin-Tiny) under both supervised and self-supervised learning setups. Our results show that fine-tuning only the final two transformer blocks of the Swin-Tiny model achieves the best trade-off between efficiency and accuracy, reaching 81.4% accuracy and an AUC (area under the ROC curve) of 88.9%. Self-supervised pretraining with Momentum Contrast (MoCo) further enhances feature robustness and reduces the number of trainable parameters. These findings highlight the potential of hierarchical attention-based models for jet substructure studies and for domain transfer to real collision data.
LGSep 13, 2025
FACTORS: Factorial Approximation for Complementary Two-factor Optimization with Risk-aware ScoringDongseok Kim, Wonjun Jeong, Gisung Oh
We propose FACTORS, a framework that combines design of experiments with Shapley decomposition to address performance and stability issues that are sensitive to combinations of training factors. Our approach consistently estimates main effects and two-factor interactions, then integrates them into a risk-adjusted objective function that jointly accounts for uncertainty and cost, enabling reliable selection of configurations under a fixed budget. Effect estimation is implemented through two complementary paths: a plug-in path based on conditional means, and a least-squares path that reconstructs Shapley contributions from samples. These paths are designed to work complementarily even when design density and bias levels differ. By incorporating standardization of estimates, bias correction, and uncertainty quantification, our procedure ensures comparability across heterogeneous factor spaces and designs, while a lightweight search routine yields configurations within practical time even for large factor spaces. On the theoretical side, we provide error decompositions, sample complexity analysis, and upper bounds on optimality gaps. On the interpretive side, we summarize main effects and interactions in map form, highlighting adjustment priorities and safe improvement pathways. Across diverse datasets and design conditions, our approach improves rank preservation and optimal configuration identification, reduces decision-making risks, and offers a tuning foundation that delivers interpretable justification alongside stable performance gains even under budget constraints.
LGSep 2, 2025
Gaming and Cooperation in Federated Learning: What Can Happen and How to Monitor ItDongseok Kim, Wonjun Jeong, Gisung Oh
The success of Federated Learning depends on the actions that participants take out of sight. We model Federated Learning not as a mere optimization task but as a strategic system entangled with rules and incentives. From this perspective, we present an analytical framework that makes it possible to clearly identify where behaviors that genuinely improve performance diverge from those that merely target metrics. We introduce two indices that respectively quantify behavioral incentives and collective performance loss, and we use them as the basis for consistently interpreting the impact of operational choices such as rule design, the level of information disclosure, evaluation methods, and aggregator switching. We further summarize thresholds, auto-switch rules, and early warning signals into a checklist that can be applied directly in practice, and we provide both a practical algorithm for allocating limited audit resources and a performance guarantee. Simulations conducted across diverse environments consistently validate the patterns predicted by our framework, and we release all procedures for full reproducibility. While our approach operates most strongly under several assumptions, combining periodic recalibration, randomization, and connectivity-based alarms enables robust application under the variability of real-world operations. We present both design principles and operational guidelines that lower the incentives for metric gaming while sustaining and expanding stable cooperation.
LGAug 24, 2025
Convergence and Generalization of Anti-Regularization for Parametric ModelsDongseok Kim, Wonjun Jeong, Gisung Oh
Anti-regularization introduces a reward term with a reversed sign into the loss function, deliberately amplifying model expressivity in small-sample regimes while ensuring that the intervention gradually vanishes as the sample size grows through a power-law decay schedule. We formalize spectral safety conditions and trust-region constraints, and we design a lightweight safeguard that combines a projection operator with gradient clipping to guarantee stable intervention. Theoretical analysis extends to linear smoothers and the Neural Tangent Kernel regime, providing practical guidance on the choice of decay exponents through the balance between empirical risk and variance. Empirical results show that Anti-regularization mitigates underfitting in both regression and classification while preserving generalization and improving calibration. Ablation studies confirm that the decay schedule and safeguards are essential to avoiding overfitting and instability. As an alternative, we also propose a degrees-of-freedom targeting schedule that maintains constant per-sample complexity. Anti-regularization constitutes a simple and reproducible procedure that integrates seamlessly into standard empirical risk minimization pipelines, enabling robust learning under limited data and resource constraints by intervening only when necessary and vanishing otherwise.
CLJul 24, 2025
SCOPE: Stochastic and Counterbiased Option Placement for Evaluating Large Language ModelsWonjun Jeong, Dongseok Kim, Taegkeun Whangbo
Large Language Models (LLMs) can achieve inflated scores on multiple-choice tasks by exploiting inherent biases in option positions or labels, rather than demonstrating genuine understanding. This study introduces SCOPE, an evaluation framework designed to measure and mitigate such selection bias in a dataset-independent manner. By repeatedly invoking a null prompt that lacks semantic content, SCOPE estimates each model's unique position-bias distribution. It then redistributes the answer slot according to the inverse-bias distribution, thereby equalizing the lucky-rate, the probability of selecting the correct answer by chance. Furthermore, it prevents semantically similar distractors from being placed adjacent to the answer, thereby blocking near-miss guesses based on superficial proximity cues. Across multiple benchmark experiments, SCOPE consistently outperformed existing debiasing methods in terms of stable performance improvements and showed clearer confidence distributions over correct options. This framework thus offers a new standard for enhancing the fairness and reliability of LLM evaluations.