Dae Sin Kim

h-index16

4papers

170citations

Novelty53%

AI Score26

Ranked #161,197 of 194,257 authors (top 83%)#35,304 in LG (top 88%)

4 Papers

7.8LGJun 12, 2022

PAC-Net: A Model Pruning Approach to Inductive Transfer Learning

Sanghoon Myung, In Huh, Wonik Jang et al.

Inductive transfer learning aims to learn from a small amount of training data for the target task by utilizing a pre-trained model from the source task. Most strategies that involve large-scale deep learning models adopt initialization with the pre-trained model and fine-tuning for the target task. However, when using over-parameterized models, we can often prune the model without sacrificing the accuracy of the source task. This motivates us to adopt model pruning for transfer learning with deep learning models. In this paper, we propose PAC-Net, a simple yet effective approach for transfer learning based on pruning. PAC-Net consists of three steps: Prune, Allocate, and Calibrate (PAC). The main idea behind these steps is to identify essential weights for the source task, fine-tune on the source task by updating the essential weights, and then calibrate on the target task by updating the remaining redundant weights. Under the various and extensive set of inductive transfer learning experiments, we show that our method achieves state-of-the-art performance by a large margin.

3.3SPApr 19, 2022

Restructuring TCAD System: Teaching Traditional TCAD New Tricks

Sanghoon Myung, Wonik Jang, Seonghoon Jin et al.

Traditional TCAD simulation has succeeded in predicting and optimizing the device performance; however, it still faces a massive challenge - a high computational cost. There have been many attempts to replace TCAD with deep learning, but it has not yet been completely replaced. This paper presents a novel algorithm restructuring the traditional TCAD system. The proposed algorithm predicts three-dimensional (3-D) TCAD simulation in real-time while capturing a variance, enables deep learning and TCAD to complement each other, and fully resolves convergence errors.

5.0MLApr 6, 2021

A Novel Approach for Semiconductor Etching Process with Inductive Biases

Sanghoon Myung, Hyunjae Jang, Byungseon Choi et al.

The etching process is one of the most important processes in semiconductor manufacturing. We have introduced the state-of-the-art deep learning model to predict the etching profiles. However, the significant problems violating physics have been found through various techniques such as explainable artificial intelligence and representation of prediction uncertainty. To address this problem, this paper presents a novel approach to apply the inductive biases for etching process. We demonstrate that our approach fits the measurement faster than physical simulator while following the physical behavior. Our approach would bring a new opportunity for better etching process with higher accuracy and lower cost.

24.8LGFeb 12, 2021

Learning Student-Friendly Teacher Networks for Knowledge Distillation

Dae Young Park, Moon-Hyun Cha, Changwook Jeong et al.

We propose a novel knowledge distillation approach to facilitate the transfer of dark knowledge from a teacher to a student. Contrary to most of the existing methods that rely on effective training of student models given pretrained teachers, we aim to learn the teacher models that are friendly to students and, consequently, more appropriate for knowledge transfer. In other words, at the time of optimizing a teacher model, the proposed algorithm learns the student branches jointly to obtain student-friendly representations. Since the main goal of our approach lies in training teacher models and the subsequent knowledge distillation procedure is straightforward, most of the existing knowledge distillation methods can adopt this technique to improve the performance of diverse student models in terms of accuracy and convergence speed. The proposed algorithm demonstrates outstanding accuracy in several well-known knowledge distillation techniques with various combinations of teacher and student models even in the case that their architectures are heterogeneous and there is no prior knowledge about student models at the time of training teacher networks.