66.9 | CY | May 18 | Code
Diagnosing Korean-Language LLM Political Bias via Census-Grounded Agent Simulation
Sungwoo Kang
Large language models (LLMs) exhibit systematic political biases in voter simulations, but the underlying mechanisms and their cross-lingual generalization remain poorly understood. We introduce Dynamo-K, a census-grounded simulation framework that evaluates Korean-language LLM political behavior across four models on six Korean elections (2017-2025). Using this framework, we identify three systematic failure modes: (1) progressive bias in moderate agents, where explicit mitigation reduces Mean Absolute Error (MAE) by a factor of 5.2; (2) model-dependent third-party salience collapse, distinguishing between salience failure and decision bias; and (3) regional polarization collapse, where models bidirectionally under-predict historical party strongholds. To address these failures, we demonstrate that scenario reframing recovers 62% of 2017 MAE by restoring third-party visibility. Furthermore, we introduce a learned reweighting adapter that successfully calibrates opposing-valence models without relying on candidate names at train or test time. Validating our diagnostic framework, Dynamo-K accurately predicts 3/3 presidential winners, including a 2.1%p MAE on the highly contested 2022 race decided by a 0.73%p margin, and correctly identifies the dominant party in a held-out local election. The pipeline is open-source and provides a scalable, cost-effective method for diagnosing LLM political behavior.
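As a minimal illustration of the MAE metric quoted in this abstract, the sketch below computes the mean absolute error, in percentage points, between simulated and official vote shares. The function name, candidate labels, and numbers are hypothetical placeholders rather than Dynamo-K code or results.

```python
def vote_share_mae(predicted, actual):
    """Mean absolute error in percentage points over all candidates."""
    if set(predicted) != set(actual):
        raise ValueError("predicted and actual must cover the same candidates")
    return sum(abs(predicted[c] - actual[c]) for c in actual) / len(actual)


# Hypothetical vote shares in percent; labels and values are illustrative only.
simulated = {"A": 50.0, "B": 45.0, "C": 5.0}
official = {"A": 48.0, "B": 47.0, "C": 5.0}

mae = vote_share_mae(simulated, official)
print(f"MAE: {mae:.2f} percentage points")
```

Averaging per-candidate absolute errors is one common convention; the paper may aggregate across regions or elections differently.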
LG | Dec 28, 2025
Multi-Task Learning for Metal Alloy Property Prediction: An Empirical Study of Negative Transfer and Mitigation Strategies
Sungwoo Kang
Multi-task learning (MTL) in materials science relies on the assumption that physically related properties share learnable representations. We challenge this assumption using a 54,028-sample metal alloy dataset exhibiting extreme task-level imbalance. Our results reveal a striking dichotomy: MTL significantly degrades regression performance for resistivity and hardness but improves classification recall for amorphous-forming ability. We trace this divergence to mismatched functional forms--such as resistivity's polynomial dependence versus hardness's complex interactions--which cause severe gradient misalignment during optimization. Evaluating Deep Imbalanced Regression techniques, we find that projecting conflicting gradients (PCGrad) recovers single-task performance, while combining label distribution smoothing with gradient normalization achieves the best overall balance. Consequently, we propose a strategic framework: utilize independent models for high-precision characterization, but employ MTL for high-throughput screening where recall is paramount. These findings support a "materials property clustering" hypothesis, suggesting that distinct physical mechanisms require specialized optimization strategies to overcome negative transfer.
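The gradient projection step behind PCGrad can be sketched in a few lines. This is a simplified, framework-free illustration on plain Python lists, assuming each task gradient has already been flattened into a vector; the helper names are hypothetical and this is not the paper's training code.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcgrad(task_grads, rng=None):
    """Project away pairwise conflicts between task gradients, then sum them."""
    rng = rng or random.Random(0)
    projected = [list(g) for g in task_grads]
    for i, g_i in enumerate(projected):
        others = [j for j in range(len(task_grads)) if j != i]
        rng.shuffle(others)  # PCGrad visits the other tasks in random order
        for j in others:
            g_j = task_grads[j]  # project against the original gradient of task j
            overlap = dot(g_i, g_j)
            norm_sq = dot(g_j, g_j)
            if overlap < 0 and norm_sq > 0:
                # Conflict: remove the component of g_i that points against g_j.
                scale = overlap / norm_sq
                for k in range(len(g_i)):
                    g_i[k] -= scale * g_j[k]
    return [sum(column) for column in zip(*projected)]


# Two toy task gradients with a negative inner product (a conflict).
g_resistivity = [1.0, 0.0]
g_hardness = [-1.0, 1.0]
combined = pcgrad([g_resistivity, g_hardness])
print(combined)
```

After projection, the combined update no longer has a negative inner product with either toy task gradient, which is the property that lets conflicting tasks share an optimizer step.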
ST | Apr 28, 2023
Using a Deep Learning Model to Simulate Human Stock Trader's Methods of Chart Analysis
Sungwoo Kang, Jong-Kook Kim
Despite the efficient market hypothesis, many studies suggest that inefficiencies exist in the stock market, which has led to the development of techniques for earning above-market returns. Systematic trading has advanced significantly in recent decades, with deep learning emerging as a powerful tool for analyzing and predicting market behavior. This paper proposes a method inspired by how professional technical analysts trade. The scheme looks at stock prices from the previous 600 days and predicts whether the price will rise or fall by 10% or 20% within the next D days. The proposed method uses the skip connections of ResNet (a deep learning model) and its logits to increase the probability of the prediction. The model was trained and tested on historical data from both the Korean and US stock markets, and the backtest uses data from 2020 to 2022. On the Korean market, the method returned 75.36% with a Sharpe ratio of 1.57, exceeding the market return and Sharpe ratio by 36% and 0.61, respectively. On the US market, it returned 27.17% in total with a Sharpe ratio of 0.61, outperforming benchmarks such as the NASDAQ, S&P 500, and Dow Jones indices by 17.69% and 0.27, respectively.
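One plausible reading of the prediction target is a "which threshold is hit first" label over the next D days. The sketch below implements that reading; the first-hit rule, the function name, and the toy prices are assumptions for illustration, since the abstract does not spell out the labeling rule.

```python
def label_window(prices, start, horizon, threshold=0.10):
    """Return 1 if the price first rises by `threshold` within `horizon` days,
    -1 if it first falls by `threshold`, and 0 if neither happens."""
    base = prices[start]
    for price in prices[start + 1 : start + 1 + horizon]:
        change = price / base - 1.0
        if change >= threshold:
            return 1
        if change <= -threshold:
            return -1
    return 0


# Toy daily closing prices (hypothetical values).
rising = [100.0, 104.0, 111.0, 95.0]
falling = [100.0, 95.0, 89.0, 120.0]
flat = [100.0, 101.0, 102.0]

labels = [label_window(series, 0, 3) for series in (rising, falling, flat)]
print(labels)
```

Under this reading, a 20% variant only changes `threshold`, and D corresponds to `horizon`.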
IV | Dec 26, 2025
The Color-Clinical Decoupling: Why Perceptual Calibration Fails Clinical Biomarkers in Smartphone Dermatology
Sungwoo Kang
Smartphone-based tele-dermatology assumes that colorimetric calibration ensures clinical reliability, yet this remains untested for underrepresented skin phototypes. We investigated whether standard calibration translates to reliable clinical biomarkers using 43,425 images from 965 Korean subjects (Fitzpatrick III-IV) across DSLR, tablet, and smartphone devices. While Linear Color Correction Matrix (CCM) normalization reduced color error by 67-77% -- achieving near-clinical accuracy (Delta E < 2.3) -- this success did not translate to biomarker reliability. We identify a phenomenon termed "color-clinical decoupling": despite perceptual accuracy, the Individual Typology Angle (ITA) showed poor inter-device agreement (ICC = 0.40), while the Melanin Index achieved good agreement (ICC = 0.77). This decoupling is driven by the ITA formula's sensitivity to b* channel noise and is further compounded by anatomical variance. Facial region accounts for 25.2% of color variance -- 3.6x greater than device effects (7.0%) -- challenging the efficacy of single-patch calibration. Our results demonstrate that current colorimetric standards are insufficient for clinical-grade biomarker extraction, necessitating region-aware protocols for mobile dermatology.
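The sensitivity of ITA to the b* channel can be seen directly from its standard definition, ITA = arctan((L* - 50) / b*) × 180/π. The sketch below evaluates that formula for a fixed L* while b* shifts slightly; the CIELAB values are hypothetical and only illustrate the effect, they are not measurements from the study.

```python
import math


def ita_degrees(l_star, b_star):
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    atan2 matches arctan((L* - 50) / b*) whenever b* > 0, which is the
    usual case for skin, and avoids a division by zero at b* = 0.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


# Hypothetical skin patch: same lightness, b* differing by 2 units across devices.
ita_device_a = ita_degrees(60.0, 15.0)
ita_device_b = ita_degrees(60.0, 17.0)
shift = ita_device_a - ita_device_b
print(f"ITA shift from a 2-unit b* change: {shift:.2f} degrees")
```

Because b* sits in the denominator of the ratio, small device-dependent offsets in b* move the angle by several degrees even when lightness agrees exactly.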
10.5 | SY | Mar 17
Asymmetry-Aware Routing for Industrial Multimodal Monitoring: A Diagnostic Framework
Sungwoo Kang
Multimodal fusion is the default approach for combining heterogeneous sensor streams in industrial monitoring, yet no systematic method exists for determining when fusion degrades rather than improves detection performance. We present an Asymmetry-Aware Routing Framework, a three-step diagnostic procedure (unimodal performance gap, gate weight attribution, modality corruption testing) with formal decision criteria, that routes multimodal systems toward the appropriate fusion strategy before deployment. We validate the framework on three datasets spanning two routing outcomes: (1) the OHT/AGV industrial dataset (thermal + sensors, 13,121 samples), where the framework correctly identifies severe asymmetry (gap ratio 3.1×) and recommends CASCADE; (2) a chain conveyor fault detection scenario (audio + vibration), where moderate asymmetry leads to a FUSE recommendation with positive fusion benefit; and (3) the CWRU bearing dataset, providing controlled validation in both directions. Threshold sensitivity analysis across all three datasets shows that the framework's recommendations are robust to threshold perturbation, with correct routing maintained over a wide parameter plateau. Comparison against simpler diagnostics (gap ratio alone) reveals that Step 1 alone is ambiguous for moderate-asymmetry cases, demonstrating the necessity of the full protocol for reliable routing decisions.
2.0 | LG | Mar 15
STAG-CN: Spatio-Temporal Apiary Graph Convolutional Network for Disease Onset Prediction in Beehive Sensor Networks
Sungwoo Kang
Honey bee colony losses threaten global pollination services, yet current monitoring systems treat each hive as an isolated unit, ignoring the spatial pathways through which diseases spread across apiaries. This paper introduces the Spatio-Temporal Apiary Graph Convolutional Network (STAG-CN), a graph neural network that models inter-hive relationships for disease onset prediction. STAG-CN operates on a dual adjacency graph combining physical co-location and climatic sensor correlation among hive sessions, and processes multivariate IoT sensor streams through a temporal-spatial-temporal sandwich architecture built on causal dilated convolutions and Chebyshev spectral graph convolutions. Evaluated on the Korean AI Hub apiculture dataset (dataset #71488) with expanding-window temporal cross-validation, STAG-CN achieves an F1 score of 0.607 at a three-day forecast horizon. An ablation study reveals that the climatic adjacency matrix alone matches full-model performance (F1 = 0.607), while the physical adjacency alone yields F1 = 0.274, indicating that shared environmental response patterns carry stronger predictive signal than spatial proximity for disease onset. These results establish a proof-of-concept for graph-based biosecurity monitoring in precision apiculture, demonstrating that inter-hive sensor correlations encode disease-relevant information invisible to single-hive approaches.
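Expanding-window temporal cross-validation, as mentioned in this abstract, trains on all data up to a cutoff and tests on the block that follows, then moves the cutoff forward. The sketch below shows one simple way to generate such splits over time-ordered sample indices; the equal-sized fold layout and the function name are assumptions, not the paper's exact protocol.

```python
def expanding_window_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) with a growing training window.

    Samples are assumed to be sorted by time. Each fold trains on every
    sample before the cutoff and tests on the next block; the last fold
    also absorbs any leftover samples.
    """
    if n_folds < 1:
        raise ValueError("n_folds must be at least 1")
    block = n_samples // (n_folds + 1)
    if block < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(1, n_folds + 1):
        train_end = k * block
        test_end = n_samples if k == n_folds else train_end + block
        yield range(0, train_end), range(train_end, test_end)


splits = list(expanding_window_splits(n_samples=10, n_folds=3))
for train_idx, test_idx in splits:
    print(list(train_idx), list(test_idx))
```

Every test block lies strictly after its training window, so no future sensor readings leak into training.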
LG | Feb 16
Learning Structural Hardness for Combinatorial Auctions: Instance-Dependent Algorithm Selection via Graph Neural Networks
Sungwoo Kang
The Winner Determination Problem (WDP) in combinatorial auctions is NP-hard, and no existing method reliably predicts which instances will defeat fast greedy heuristics. The ML-for-combinatorial-optimization community has focused on learning to replace solvers, yet recent evidence shows that graph neural networks (GNNs) rarely outperform well-tuned classical methods on standard benchmarks. We pursue a different objective: learning to predict when a given instance is hard for greedy allocation, enabling instance-dependent algorithm selection. We design a 20-dimensional structural feature vector and train a lightweight MLP hardness classifier that predicts the greedy optimality gap with mean absolute error 0.033, Pearson correlation 0.937, and binary classification accuracy 94.7% across three random seeds. For instances identified as hard (those exhibiting "whale-fish" trap structure where greedy provably fails) we deploy a heterogeneous GNN specialist that achieves ≈0% optimality gap on all six adversarial configurations tested (vs. 3.75-59.24% for greedy). A hybrid allocator combining the hardness classifier with GNN and greedy solvers achieves 0.51% overall gap on mixed distributions. Our honest evaluation on CATS benchmarks confirms that GNNs do not outperform Gurobi (0.45-0.71 vs. 0.20 gap), motivating the algorithm selection framing. Learning when to deploy expensive solvers is more tractable than learning to replace them.
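A toy instance makes the greedy optimality gap concrete. The sketch below assumes a price-sorted greedy heuristic and interprets the "whale-fish" trap as one large bundle bid (the whale) that blocks several small bids (the fish) whose combined value is higher; both the interpretation and the bid values are illustrative assumptions, not the paper's benchmark configurations.

```python
from itertools import combinations


def greedy_value(bids):
    """Accept bids in descending price order while their items stay disjoint."""
    taken, total = set(), 0.0
    for items, price in sorted(bids, key=lambda bid: bid[1], reverse=True):
        if taken.isdisjoint(items):
            taken |= items
            total += price
    return total


def optimal_value(bids):
    """Brute-force optimum over all subsets of bids (tiny instances only)."""
    best = 0.0
    for size in range(1, len(bids) + 1):
        for subset in combinations(bids, size):
            items = [item for bundle, _ in subset for item in bundle]
            if len(items) == len(set(items)):  # bundles must not overlap
                best = max(best, sum(price for _, price in subset))
    return best


# Hypothetical instance: one whale bid on all items, three fish bids on single items.
bids = [
    (frozenset({"a", "b", "c"}), 10.0),
    (frozenset({"a"}), 6.0),
    (frozenset({"b"}), 6.0),
    (frozenset({"c"}), 6.0),
]
gap = 1.0 - greedy_value(bids) / optimal_value(bids)
print(f"greedy optimality gap: {gap:.1%}")
```

Greedy takes the whale first and locks out every fish, so the gap depends only on how the bids are structured; that structure is the kind of signal a hardness classifier could pick up before any solver is run.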
CV | Oct 10, 2025
Modern Deep Learning Approaches for Cricket Shot Classification: A Comprehensive Baseline Study
Sungwoo Kang
Cricket shot classification from video sequences remains a challenging problem in sports video analysis, requiring effective modeling of both spatial and temporal features. This paper presents the first comprehensive baseline study comparing seven different deep learning approaches across four distinct research paradigms for cricket shot classification. We implement and systematically evaluate traditional CNN-LSTM architectures, attention-based models, vision transformers, transfer learning approaches, and modern EfficientNet-GRU combinations on a unified benchmark. A critical finding of our study is the significant performance gap between claims in academic literature and practical implementation results. While previous papers reported accuracies of 96% (Balaji LRCN), 99.2% (IJERCSE), and 93% (Sensors), our standardized re-implementations achieve 46.0%, 55.6%, and 57.7% respectively. Our modern SOTA approach, combining EfficientNet-B0 with a GRU-based temporal model, achieves 92.25% accuracy, demonstrating that substantial improvements are possible with modern architectures and systematic optimization. All implementations follow modern MLOps practices with PyTorch Lightning, providing a reproducible research platform that exposes the critical importance of standardized evaluation protocols in sports video analysis research.