Jaesik Min

h-index: 9

5 papers

2,770 citations

Novelty: 45%

AI Score: 41

Ranked #67,120 of 194,257 authors (top 35%)

Ranked #22,905 in CV (top 39%)

5 Papers

2.6 | CV | Aug 12, 2022 | Code

Character decomposition to resolve class imbalance problem in Hangul OCR

Geonuk Kim, Jaemin Son, Kanghyu Lee et al.

We present a novel approach to optical character recognition (OCR) of the Korean script, Hangul. As a phonogram, Hangul can represent 11,172 different characters with only 52 graphemes, by describing each character as a combination of graphemes. Because the total number of characters could overwhelm the capacity of a neural network, existing OCR encoding methods pre-define a smaller set of frequently used characters. This design choice naturally compromises performance on the long-tail characters of the distribution. In this work, we demonstrate that grapheme encoding is not only efficient but also performant for Hangul OCR. Benchmark tests show that our approach resolves two main problems of Hangul OCR: class imbalance and target class selection.
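The count of 11,172 characters follows from how precomposed Hangul syllables are laid out in Unicode: 19 initial consonants, 21 medial vowels, and 28 final slots (27 final consonants plus "no final"). The sketch below decomposes a syllable into those three indices using that standard Unicode arithmetic. It is only an illustration of grapheme-level decomposition, not the encoding used in the paper (which counts 52 graphemes); the function names `decompose` and `compose` are hypothetical.

```python
# Standard Unicode layout of precomposed Hangul syllables (U+AC00..U+D7A3).
# This is NOT the paper's encoding; it only illustrates grapheme decomposition.
HANGUL_BASE = 0xAC00
NUM_INITIAL = 19   # initial consonants
NUM_MEDIAL = 21    # medial vowels
NUM_FINAL = 28     # final consonants, where index 0 means "no final consonant"
NUM_SYLLABLES = NUM_INITIAL * NUM_MEDIAL * NUM_FINAL  # 11,172


def decompose(syllable):
    """Return (initial, medial, final) indices of one precomposed syllable."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < NUM_SYLLABLES:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    initial, rest = divmod(offset, NUM_MEDIAL * NUM_FINAL)
    medial, final = divmod(rest, NUM_FINAL)
    return initial, medial, final


def compose(initial, medial, final):
    """Inverse of decompose: rebuild the syllable from its three indices."""
    return chr(HANGUL_BASE + (initial * NUM_MEDIAL + medial) * NUM_FINAL + final)


if __name__ == "__main__":
    print(NUM_SYLLABLES)
    print(decompose("한"))
    print(compose(*decompose("글")))
```

With this kind of decomposition, a classifier could predict three small label sets (19, 21, and 28 classes) instead of one label set with thousands of classes, which is the intuition behind addressing class imbalance at the grapheme level.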

IV | Jun 26

Anatomy-Grounded Synthetic Coronary Angiography for Geometry-Informed Multi-View Matching

In Kyu Lee, Sumin Seo, Jaesik Min

Accurate correspondence matching across multiple angiographic views is a prerequisite for 3D coronary reconstruction and interventional guidance. However, the development of robust deep learning models for this task has been stifled by a fundamental data bottleneck: obtaining ground truth for matching tasks in angiography pairs is prohibitively expensive and hard to scale. To overcome this barrier, we introduce a physically grounded data generation framework that synthesizes high-fidelity Digitally Reconstructed Radiographs (DRRs) from 3D Coronary CT Angiography (CCTA) volumes. Our framework generates dense, highly accurate 3D-to-2D projection labels by simulating realistic C-arm acquisition geometry on patient anatomy at zero human cost. Leveraging this dense supervision, we propose a Geometry-Informed Matching Module (GIMM) that integrates global features and anatomical structure into correspondence learning. Unlike real angiography, where assessment relies on subjective human annotation, our dataset provides 2D correspondence labels with paired images, allowing human-free evaluation. We comprehensively evaluate our method on the proposed CT-derived DRR dataset and demonstrate improvements over baseline matching models.
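To illustrate why projecting known 3D anatomy yields correspondence labels for free, the sketch below projects the same 3D points into two views with an idealized cone-beam (pinhole) model. It is an assumption-laden toy, not the framework's DRR renderer: the function names, the single rotation axis, and the default distances (`source_to_iso`, `source_to_detector`, in millimetres) are all hypothetical.

```python
import math


def project(point, angle_deg, source_to_iso=750.0, source_to_detector=1100.0):
    """Project a 3D point (isocenter-centred, mm) onto a detector plane.

    Idealized model (an assumption): the C-arm rotates about the y-axis only,
    the X-ray source lies on the rotated z-axis at distance source_to_iso from
    the isocenter, and the detector is perpendicular to that axis.
    """
    x, y, z = point
    t = math.radians(angle_deg)
    # Rotate the point into the frame of the current view.
    xr = math.cos(t) * x + math.sin(t) * z
    zr = -math.sin(t) * x + math.cos(t) * z
    depth = zr + source_to_iso  # distance from the source along the beam axis
    if depth <= 0:
        raise ValueError("point lies behind the X-ray source")
    scale = source_to_detector / depth  # perspective magnification
    return (scale * xr, scale * y)


def correspondence_labels(points, angle_a, angle_b):
    """Pair each point's 2D location in view A with its location in view B."""
    return [(project(p, angle_a), project(p, angle_b)) for p in points]


if __name__ == "__main__":
    centerline = [(10.0, 20.0, 0.0), (12.0, 18.0, 5.0), (15.0, 15.0, 9.0)]
    for uv_a, uv_b in correspondence_labels(centerline, 0.0, 30.0):
        print(uv_a, uv_b)
```

Because both 2D locations come from the same 3D point, each pair is a ground-truth match by construction, with no human annotation involved.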

8.6 | IV | Aug 1, 2025

Diffusion-Based User-Guided Data Augmentation for Coronary Stenosis Detection

Sumin Seo, In Kyu Lee, Hyun-Woo Kim et al.

Coronary stenosis is a major risk factor for ischemic heart events leading to increased mortality, and medical treatments for this condition require meticulous, labor-intensive analysis. Coronary angiography provides critical visual cues for assessing stenosis, supporting clinicians in making informed decisions for diagnosis and treatment. Recent advances in deep learning have shown great potential for automated localization and severity measurement of stenosis. In real-world scenarios, however, the success of these approaches is often hindered by challenges such as limited labeled data and class imbalance. In this study, we propose a novel data augmentation approach that uses a diffusion-based inpainting method to generate realistic lesions, allowing user-guided control of severity. Extensive evaluation on lesion detection and severity classification across various synthetic dataset sizes shows superior performance of our method on both a large-scale in-house dataset and a public coronary angiography dataset. Furthermore, our approach maintains high detection and classification performance even when trained with limited data, highlighting its clinical importance in improving the assessment of stenosis severity and optimizing data utilization for more reliable decision support.

22.5 | LG | Oct 8, 2021 | Code

Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning

Sungyong Baik, Janghoon Choi, Heewon Kim et al.

In few-shot learning scenarios, the challenge is to generalize and perform well on new unseen examples when only very few labeled examples are available for each task. Model-agnostic meta-learning (MAML) has gained popularity as one of the representative few-shot learning methods for its flexibility and applicability to diverse problems. However, MAML and its variants often resort to a simple loss function without any auxiliary loss function or regularization terms that can help achieve better generalization. The problem is that each application and task may require a different auxiliary loss function, especially when tasks are diverse and distinct. Instead of attempting to hand-design an auxiliary loss function for each application and task, we introduce a new meta-learning framework with a loss function that adapts to each task. Our proposed framework, named Meta-Learning with Task-Adaptive Loss Function (MeTAL), demonstrates its effectiveness and flexibility across various domains, such as few-shot classification and few-shot regression.

2.4 | CV | Jun 26, 2017 | Code

End-to-end Learning of Image based Lane-Change Decision

Seong-Gyun Jeong, Jiwon Kim, Sujung Kim et al.

We propose an image-based end-to-end learning framework that helps lane-change decisions for human drivers and autonomous vehicles. The proposed system, Safe Lane-Change Aid Network (SLCAN), trains a deep convolutional neural network to classify the status of adjacent lanes from rear-view images acquired by cameras mounted on both sides of the vehicle. Rather than depending on any explicit object detection or tracking scheme, SLCAN reads the whole input image and directly decides whether initiating a lane change at that moment is safe. We collected and annotated 77,273 rear-side-view images to train and test SLCAN. Experimental results show that the proposed framework achieves 96.98% classification accuracy even though the test images are from unseen roadways. We also visualize the saliency map to understand which part of the image SLCAN looks at to make correct decisions.