Wonsook Lee

CV
h-index: 2
Papers: 6
Citations: 21
Novelty: 44%
AI Score: 26

6 Papers

CV | Mar 27, 2023
Comparison between layer-to-layer network training and conventional network training using Deep Convolutional Neural Networks

Kiran Kumar Ashish Bhyravabhottla, WonSook Lee

Convolutional neural networks (CNNs) are widely used in various applications due to their effectiveness in extracting features from data. However, the performance of a CNN heavily depends on its architecture and training process. In this study, we propose a layer-to-layer training method and compare its performance with the conventional training method. In the layer-to-layer training approach, we treat a portion of the early layers as a student network and the later layers as a teacher network. During each training step, we incrementally train the student network to learn from the output of the teacher network, and vice versa. We evaluate this approach on VGG16, ResNeXt, and DenseNet networks without pre-trained ImageNet weights and on a regular CNN model. Our experiments show that the layer-to-layer training method outperforms the conventional training method for all of these models. Specifically, layer-to-layer training achieves higher test-set accuracy than conventional training for the VGG16, ResNeXt, and DenseNet networks and for the regular CNN model. Overall, our study highlights the importance of layer-wise training in CNNs and suggests that layer-to-layer training can be a promising approach for improving the accuracy of CNNs.

CV | Feb 3, 2025
Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection

Reza Sadeghian, Niloofar Hooshyaripour, Chris Joslin et al.

Accurate and robust 3D object detection is essential for autonomous driving, where fusing data from sensors like LiDAR and camera enhances detection accuracy. However, sensor malfunctions such as corruption or disconnection can degrade performance, and existing fusion models often struggle to maintain reliability when one modality fails. To address this, we propose ReliFusion, a novel LiDAR-camera fusion framework operating in the bird's-eye view (BEV) space. ReliFusion integrates three key components: the Spatio-Temporal Feature Aggregation (STFA) module, which captures dependencies across frames to stabilize predictions over time; the Reliability module, which assigns confidence scores to quantify the dependability of each modality under challenging conditions; and the Confidence-Weighted Mutual Cross-Attention (CW-MCA) module, which dynamically balances information from LiDAR and camera modalities based on these confidence scores. Experiments on the nuScenes dataset show that ReliFusion significantly outperforms state-of-the-art methods, achieving superior robustness and accuracy in scenarios with limited LiDAR fields of view and severe sensor malfunctions.
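The abstract does not give the formula used by the CW-MCA module, so the following is only a generic illustration of confidence-weighted fusion, not ReliFusion's actual mechanism. It assumes each modality provides a feature vector and a scalar confidence score, turns the two scores into weights with a softmax, and blends the features. The function name `fuse_by_confidence` and all of its parameters are hypothetical.

```python
import math


def fuse_by_confidence(lidar_feat, camera_feat, lidar_conf, camera_conf):
    """Blend two feature vectors using softmax-normalized confidence scores.

    Illustrative assumption only: the source abstract does not specify how
    confidence scores enter the fusion step.
    """
    if len(lidar_feat) != len(camera_feat):
        raise ValueError("feature vectors must have the same length")

    # Softmax over the two scores; subtracting the max avoids overflow.
    top = max(lidar_conf, camera_conf)
    e_lidar = math.exp(lidar_conf - top)
    e_camera = math.exp(camera_conf - top)
    w_lidar = e_lidar / (e_lidar + e_camera)
    w_camera = 1.0 - w_lidar

    # A low-confidence modality contributes little to the fused feature.
    return [w_lidar * l + w_camera * c for l, c in zip(lidar_feat, camera_feat)]


if __name__ == "__main__":
    # Equal confidence gives a plain average of the two modalities.
    print(fuse_by_confidence([1.0, 3.0], [3.0, 1.0], 0.0, 0.0))
    # A failing camera (very low confidence) leaves the LiDAR features dominant.
    print(fuse_by_confidence([1.0, 3.0], [3.0, 1.0], 5.0, -5.0))
```

With this weighting, a modality whose confidence collapses (for example after a sensor disconnection) is effectively ignored, which is the behavior the abstract attributes to confidence-based balancing.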

CV | Mar 14, 2025
Advancing 3D Gaussian Splatting Editing with Complementary and Consensus Information

Xuanqi Zhang, Jieun Lee, Chris Joslin et al.

We present a novel framework for enhancing the visual fidelity and consistency of text-guided 3D Gaussian Splatting (3DGS) editing. Existing editing approaches face two critical challenges: inconsistent geometric reconstructions across multiple viewpoints, particularly in challenging camera positions, and ineffective utilization of depth information during image manipulation, resulting in over-texture artifacts and degraded object boundaries. To address these limitations, we introduce: 1) A complementary information mutual learning network that enhances depth map estimation from 3DGS, enabling precise depth-conditioned 3D editing while preserving geometric structures. 2) A wavelet consensus attention mechanism that effectively aligns latent codes during the diffusion denoising process, ensuring multi-view consistency in the edited results. Through extensive experimentation, our method demonstrates superior performance in rendering quality and view consistency compared to state-of-the-art approaches. The results validate our framework as an effective solution for text-guided editing of 3D scenes.

IV | May 14, 2021
A Frequency Domain Constraint for Synthetic and Real X-ray Image Super Resolution

Qing Ma, Jae Chul Koh, WonSook Lee

Synthetic X-ray images are simulated X-ray images projected from CT data. High-quality synthetic X-ray images can facilitate various applications such as surgical image guidance systems and VR training simulations. However, it is difficult to produce high-quality arbitrary-view synthetic X-ray images in real time due to varying CT slice thickness, high computational cost, and the complexity of the algorithms. Our goal is to generate high-resolution synthetic X-ray images in real time by upsampling low-resolution images with deep learning-based super-resolution methods. Reference-based Super Resolution (RefSR) has been well studied in recent years and has shown higher performance than traditional Single Image Super-Resolution (SISR). It can produce fine details by utilizing the reference image but still inevitably generates some artifacts and noise. In this paper, we introduce a frequency domain loss as a constraint to further improve the quality of the RefSR results, yielding fine details without obvious artifacts. To the best of our knowledge, this is the first paper utilizing the frequency domain for the loss functions in the field of super-resolution. We achieved good results in evaluating our method on both synthetic and real X-ray image datasets.
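The abstract does not state the exact form of the frequency domain loss, so the sketch below shows one common variant as an assumption: the mean absolute difference between the 2D discrete Fourier transforms of the predicted and target images. It uses a naive DFT from the standard library so that it runs without dependencies; a real training pipeline would use an FFT from a tensor library. The names `dft2` and `frequency_l1_loss` are hypothetical.

```python
import cmath


def dft2(image):
    """Naive 2D discrete Fourier transform of a list-of-lists image."""
    rows, cols = len(image), len(image[0])
    spectrum = []
    for u in range(rows):
        spectrum_row = []
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * cmath.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(1j * angle)
            spectrum_row.append(total)
        spectrum.append(spectrum_row)
    return spectrum


def frequency_l1_loss(pred, target):
    """Mean absolute difference between the spectra of two same-sized images.

    Assumed form of a frequency-domain constraint, not the paper's definition.
    """
    if len(pred) != len(target) or len(pred[0]) != len(target[0]):
        raise ValueError("images must have the same shape")
    spec_pred, spec_target = dft2(pred), dft2(target)
    count = len(pred) * len(pred[0])
    total = sum(
        abs(a - b)
        for row_p, row_t in zip(spec_pred, spec_target)
        for a, b in zip(row_p, row_t)
    )
    return total / count


if __name__ == "__main__":
    target = [[0.0, 1.0], [1.0, 0.0]]
    pred = [[1.0, 2.0], [2.0, 1.0]]  # target plus a constant offset of 1
    print(frequency_l1_loss(target, target))
    print(frequency_l1_loss(pred, target))
```

In practice such a term would be added to a pixel-space loss with a weighting factor, so that errors concentrated in particular frequency bands (for example high-frequency artifacts) are penalized explicitly.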

LG | Sep 22, 2020
PS8-Net: A Deep Convolutional Neural Network to Predict the Eight-State Protein Secondary Structure

Md Aminur Rab Ratul, Maryam Tavakol Elahi, M. Hamed Mozaffari et al.

Protein secondary structure is crucial to creating an information bridge between the primary and tertiary (3D) structures. Precise prediction of eight-state protein secondary structure (PSS) has been widely used in the structural and functional analysis of proteins in bioinformatics. Deep learning techniques have recently been applied in this research area and have raised eight-state (Q8) protein secondary structure prediction accuracy remarkably. Nevertheless, from a theoretical standpoint, there is still much room for improvement, specifically in eight-state PSS prediction. In this study, we present a new deep convolutional neural network (DCNN), namely PS8-Net, to enhance the accuracy of eight-class PSS prediction. The input of this architecture is a carefully constructed feature matrix built from the protein sequence features and profile features. We introduce a new PS8 module in the network, which is applied with skip connections to extract long-term inter-dependencies from higher layers, obtain local contexts in earlier layers, and achieve global information during secondary structure prediction. Our proposed PS8-Net achieves 76.89%, 71.94%, 76.86%, and 75.26% Q8 accuracy on the benchmark CullPdb6133, CB513, CASP10, and CASP11 datasets, respectively. This architecture enables the efficient processing of local and global interdependencies between amino acids to make an accurate prediction of each class. To the best of our knowledge, the experimental results show that PS8-Net outperforms all state-of-the-art methods on the aforementioned benchmark datasets.
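For readers unfamiliar with the metric, Q8 accuracy is the fraction of residues whose predicted eight-state label matches the reference label. The sketch below is a minimal illustration that assumes both sequences are given as equal-length strings of one-letter state codes; the function name `q8_accuracy` and the example sequences are made up for illustration and are not taken from the paper.

```python
def q8_accuracy(predicted: str, actual: str) -> float:
    """Return the fraction of positions where the predicted state matches."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


if __name__ == "__main__":
    # Hypothetical 8-residue example: one mismatch (T predicted, S expected).
    print(q8_accuracy("HHHEETTC", "HHHEESTC"))
```

Benchmark figures such as those quoted above are typically this per-residue ratio computed over all residues in a dataset and expressed as a percentage.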

CV | Nov 29, 2016
3D Ultrasound image segmentation: A Survey

Mohammad Hamed Mozaffari, WonSook Lee

Three-dimensional ultrasound image segmentation methods are surveyed in this paper. The focus of this report is to investigate applications of these techniques and to review the original ideas and concepts. Although many two-dimensional image segmentation methods in the literature have mistakenly been regarded as three-dimensional approaches, we review them as three-dimensional techniques. We select studies that address the problem of medical three-dimensional ultrasound image segmentation using their proposed techniques. The evaluation methods and a comparison between them are presented and tabulated in terms of evaluation techniques, interactivity, and robustness.