Jae-Han Lee

CV
h-index5
6papers
216citations
Novelty47%
AI Score31

6 Papers

CVAug 23, 2022
Depth Map Decomposition for Monocular Depth Estimation

Jinyoung Jun, Jae-Han Lee, Chul Lee et al.

We propose a novel algorithm for monocular depth estimation that decomposes a metric depth map into a normalized depth map and scale features. The proposed network is composed of a shared encoder and three decoders, called G-Net, N-Net, and M-Net, which estimate gradient maps, a normalized depth map, and a metric depth map, respectively. M-Net learns to estimate metric depths more accurately using relative depth features extracted by G-Net and N-Net. The proposed algorithm has the advantage that it can use datasets without metric depth labels to improve the performance of metric depth estimation. Experimental results on various datasets demonstrate that the proposed algorithm not only provides competitive performance to state-of-the-art algorithms but also yields acceptable results even when only a small amount of metric depth data is available for its training.

CVNov 22, 2022
PNI : Industrial Anomaly Detection using Position and Neighborhood Information

Jaehyeok Bae, Jae-Han Lee, Seyun Kim

Because anomalous samples cannot be used for training, many anomaly detection and localization methods use pre-trained networks and non-parametric modeling to estimate encoded feature distribution. However, these methods neglect the impact of position and neighborhood information on the distribution of normal features. To overcome this, we propose a new algorithm, \textbf{PNI}, which estimates the normal distribution using conditional probability given neighborhood features, modeled with a multi-layer perceptron network. Moreover, position information is utilized by creating a histogram of representative features at each position. Instead of simply resizing the anomaly map, the proposed method employs an additional refine network trained on synthetic anomaly images to better interpolate and account for the shape and edge of the input image. We conducted experiments on the MVTec AD benchmark dataset and achieved state-of-the-art performance, with \textbf{99.56\%} and \textbf{98.98\%} AUROC scores in anomaly detection and localization, respectively.

IVMar 25, 2022
RD-Optimized Trit-Plane Coding of Deep Compressed Image Latent Tensors

Seungmin Jeon, Jae-Han Lee, Chang-Su Kim

DPICT is the first learning-based image codec supporting fine granular scalability. In this paper, we describe how to implement two key components of DPICT efficiently: trit-plane slicing and rate-distortion-optimized (RD-optimized) coding. In DPICT, we transform an image into a latent tensor, represent the tensor in ternary digits (trits), and encode the trits in the decreasing order of significance. For entropy encoding, it is necessary to compute the probability of each trit, which demands high time complexity in both the encoder and the decoder. To reduce the complexity, we develop a parallel computing scheme for the probabilities, which is described in detail with pseudo-codes. Moreover, we compare the trit-plane slicing in DPICT with the alternative bit-plane slicing. Experimental results show that the time complexity is reduced significantly by the parallel computing and that the trit-plane slicing provides better RD performances than the bit-plane slicing.

CVMar 20, 2023
Versatile Depth Estimator Based on Common Relative Depth Estimation and Camera-Specific Relative-to-Metric Depth Conversion

Jinyoung Jun, Jae-Han Lee, Chang-Su Kim

A typical monocular depth estimator is trained for a single camera, so its performance drops severely on images taken with different cameras. To address this issue, we propose a versatile depth estimator (VDE), composed of a common relative depth estimator (CRDE) and multiple relative-to-metric converters (R2MCs). The CRDE extracts relative depth information, and each R2MC converts the relative information to predict metric depths for a specific camera. The proposed VDE can cope with diverse scenes, including both indoor and outdoor scenes, with only a 1.12\% parameter increase per camera. Experimental results demonstrate that VDE supports multiple cameras effectively and efficiently and also achieves state-of-the-art performance in the conventional single-camera scenario.

IVDec 12, 2021Code
DPICT: Deep Progressive Image Compression Using Trit-Planes

Jae-Han Lee, Seungmin Jeon, Kwang Pyo Choi et al.

We propose the deep progressive image compression using trit-planes (DPICT) algorithm, which is the first learning-based codec supporting fine granular scalability (FGS). First, we transform an image into a latent tensor using an analysis network. Then, we represent the latent tensor in ternary digits (trits) and encode it into a compressed bitstream trit-plane by trit-plane in the decreasing order of significance. Moreover, within each trit-plane, we sort the trits according to their rate-distortion priorities and transmit more important information first. Since the compression network is less optimized for the cases of using fewer trit-planes, we develop a postprocessing network for refining reconstructed images at low rates. Experimental results show that DPICT outperforms conventional progressive codecs significantly, while enabling FGS transmission. Codes are available at https://github.com/jaehanlee-mcl/DPICT.

CVApr 30, 2024
Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement

Jinyoung Jun, Jae-Han Lee, Chang-Su Kim

The main function of depth completion is to compensate for an insufficient and unpredictable number of sparse depth measurements of hardware sensors. However, existing research on depth completion assumes that the sparsity -- the number of points or LiDAR lines -- is fixed for training and testing. Hence, the completion performance drops severely when the number of sparse depths changes significantly. To address this issue, we propose the sparsity-adaptive depth refinement (SDR) framework, which refines monocular depth estimates using sparse depth points. For SDR, we propose the masked spatial propagation network (MSPN) to perform SDR with a varying number of sparse depths effectively by gradually propagating sparse depth information throughout the entire depth map. Experimental results demonstrate that MPSN achieves state-of-the-art performance on both SDR and conventional depth completion scenarios.