Hyoseok Hwang

h-index10

8papers

317citations

Novelty48%

AI Score42

Ranked #62,382 of 194,257 authors (top 32%)#21,450 in CV (top 36%)

8 Papers

1.5CVApr 7, 2023

Automated Solubility Analysis System and Method Using Computer Vision and Machine Learning

Gahee Kim, Minwoo Jeon, Hyun Do Choi et al.

In this study, a novel active solubility sensing device using computer vision is proposed to improve separation purification performance and prevent malfunctions of separation equipment such as preparative liquid chromatographers and evaporators. The proposed device actively measures the solubility by transmitting a solution using a background image. The proposed system is a combination of a device that uses a background image and a method for estimating the dissolution and particle presence by changing the background image. The proposed device consists of four parts: camera, display, adjustment, and server units. The camera unit is made up of a rear image sensor on a mobile phone. The display unit is comprised of a tablet screen. The adjustment unit is composed of rotating and height-adjustment jigs. Finally, the server unit consists of a socket server for communication between the units and a PC, including an automated solubility analysis system implemented in Python. The dissolution status of the solution was divided into four categories and a case study was conducted. The algorithms were trained based on these results. Six organic materials and four organic solvents were combined with 202 tests to train the developed algorithm. As a result, the evaluation rate for the dissolution state exhibited an accuracy of 95 %. In addition, the device and method must develop a feedback function that can add a solvent or solute after dissolution detection using solubility results for use in autonomous systems, such as a synthetic automation system. Finally, the diversification of the sensing method is expected to extend not only to the solution but also to the solubility and homogeneity analysis of the film.

5.9CVNov 17, 2023Code

A2XP: Towards Private Domain Generalization

Geunhyeok Yu, Hyoseok Hwang

Deep Neural Networks (DNNs) have become pivotal in various fields, especially in computer vision, outperforming previous methodologies. A critical challenge in their deployment is the bias inherent in data across different domains, such as image style and environmental conditions, leading to domain gaps. This necessitates techniques for learning general representations from biased training data, known as domain generalization. This paper presents Attend to eXpert Prompts (A2XP), a novel approach for domain generalization that preserves the privacy and integrity of the network architecture. A2XP consists of two phases: Expert Adaptation and Domain Generalization. In the first phase, prompts for each source domain are optimized to guide the model towards the optimal direction. In the second phase, two embedder networks are trained to effectively amalgamate these expert prompts, aiming for an optimal output. Our extensive experiments demonstrate that A2XP achieves state-of-the-art results over existing non-private domain generalization methods. The experimental results validate that the proposed approach not only tackles the domain generalization challenge in DNNs but also offers a privacy-preserving, efficient solution to the broader field of computer vision.

1.5CVMar 9, 2023Code

Decision-BADGE: Decision-based Adversarial Batch Attack with Directional Gradient Estimation

Geunhyeok Yu, Minwoo Jeon, Hyoseok Hwang

The susceptibility of deep neural networks (DNNs) to adversarial examples has prompted an increase in the deployment of adversarial attacks. Image-agnostic universal adversarial perturbations (UAPs) are much more threatening, but many limitations exist to implementing UAPs in real-world scenarios where only binary decisions are returned. In this research, we propose Decision-BADGE, a novel method to craft universal adversarial perturbations for executing decision-based black-box attacks. To optimize perturbation with decisions, we addressed two challenges, namely the magnitude and the direction of the gradient. First, we use batch loss, differences from distributions of ground truth, and accumulating decisions in batches to determine the magnitude of the gradient. This magnitude is applied in the direction of the revised simultaneous perturbation stochastic approximation (SPSA) to update the perturbation. This simple yet efficient method can be easily extended to score-based attacks as well as targeted attacks. Experimental validation across multiple victim models demonstrates that the Decision-BADGE outperforms existing attack methods, even image-specific and score-based attacks. In particular, our proposed method shows a superior success rate with less training time. The research also shows that Decision-BADGE can successfully deceive unseen victim models and accurately target specific classes.

5.8CVJun 18

Geometry-Preserving in 3D Gaussian Splatting for LiDAR-Camera Extrinsic Calibration

Kyoleen Kwak, Daeho Kim, Jeong Woon Lee et al.

Accurate LiDAR-camera calibration is essential for robust multi-modal perception. Targetless approaches avoid manual setup but remain limited by the scarcity of discriminative cross-modal features. Recent methods address this by reconstructing the scene within a differentiable model, enabling extrinsic optimization through dense photometric supervision. Among these, 3D Gaussian Splatting (3DGS) has been widely adopted as a geometric proxy that bridges LiDAR and camera within a single differentiable framework. However, since 3DGS was originally designed for novel view synthesis, existing methods tend to prioritize rendering quality, causing the proxy geometry to drift from the true LiDAR structure. We propose a framework that preserves the metric geometry of the Gaussian proxy by aggregating multi-view LiDAR observations for dense depth supervision and blocking photometric gradients from updating the Gaussian spatial parameters. We validate our method on public driving datasets, where it consistently outperforms existing targetless methods in calibration accuracy.

1.5CVFeb 22

Keep it SymPL: Symbolic Projective Layout for Allocentric Spatial Reasoning in Vision-Language Models

Jaeyun Jang, Seunghui Shin, Taeho Park et al.

Perspective-aware spatial reasoning involves understanding spatial relationships from specific viewpoints-either egocentric (observer-centered) or allocentric (object-centered). While vision-language models (VLMs) perform well in egocentric settings, their performance deteriorates when reasoning from allocentric viewpoints, where spatial relations must be inferred from the perspective of objects within the scene. In this study, we address this underexplored challenge by introducing Symbolic Projective Layout (SymPL), a framework that reformulates allocentric reasoning into symbolic-layout forms that VLMs inherently handle well. By leveraging four key factors-projection, abstraction, bipartition, and localization-SymPL converts allocentric questions into structured symbolic-layout representations. Extensive experiments demonstrate that this reformulation substantially improves performance in both allocentric and egocentric tasks, enhances robustness under visual illusions and multi-view scenarios, and that each component contributes critically to these gains. These results show that SymPL provides an effective and principled approach for addressing complex perspective-aware spatial reasoning.

3.6CVSep 25, 2025

SiNGER: A Clearer Voice Distills Vision Transformers Further

Geunhyeok Yu, Sunjae Jeong, Yoonyoung Choi et al.

Vision Transformers are widely adopted as the backbone of vision foundation models, but they are known to produce high-norm artifacts that degrade representation quality. When knowledge distillation transfers these features to students, high-norm artifacts dominate the objective, so students overfit to artifacts and underweight informative signals, diminishing the gains from larger models. Prior work attempted to remove artifacts but encountered an inherent trade-off between artifact suppression and preserving informative signals from teachers. To address this, we introduce Singular Nullspace-Guided Energy Reallocation (SiNGER), a novel distillation framework that suppresses artifacts while preserving informative signals. The key idea is principled teacher feature refinement: during refinement, we leverage the nullspace-guided perturbation to preserve information while suppressing artifacts. Then, the refined teacher's features are distilled to a student. We implement this perturbation efficiently with a LoRA-based adapter that requires minimal structural modification. Extensive experiments show that \oursname consistently improves student models, achieving state-of-the-art performance in multiple downstream tasks and producing clearer and more interpretable representations.

3.2ROMar 11, 2025

ForceGrip: Reference-Free Curriculum Learning for Realistic Grip Force Control in VR Hand Manipulation

DongHeun Han, Byungmin Kim, RoUn Lee et al.

Realistic Hand manipulation is a key component of immersive virtual reality (VR), yet existing methods often rely on kinematic approach or motion-capture datasets that omit crucial physical attributes such as contact forces and finger torques. Consequently, these approaches prioritize tight, one-size-fits-all grips rather than reflecting users' intended force levels. We present ForceGrip, a deep learning agent that synthesizes realistic hand manipulation motions, faithfully reflecting the user's grip force intention. Instead of mimicking predefined motion datasets, ForceGrip uses generated training scenarios-randomizing object shapes, wrist movements, and trigger input flows-to challenge the agent with a broad spectrum of physical interactions. To effectively learn from these complex tasks, we employ a three-phase curriculum learning framework comprising Finger Positioning, Intention Adaptation, and Dynamic Stabilization. This progressive strategy ensures stable hand-object contact, adaptive force control based on user inputs, and robust handling under dynamic conditions. Additionally, a proximity reward function enhances natural finger motions and accelerates training convergence. Quantitative and qualitative evaluations reveal ForceGrip's superior force controllability and plausibility compared to state-of-the-art methods. Demo videos are available as supplementary material and the code is provided at https://han-dongheun.github.io/ForceGrip.

1.1CVJun 23, 2016

3D Display Calibration by Visual Pattern Analysis

Hyoseok Hwang, Hyun Sung Chang, Dongkyung Nam et al.

Nearly all 3D displays need calibration for correct rendering. More often than not, the optical elements in a 3D display are misaligned from the designed parameter setting. As a result, 3D magic does not perform well as intended. The observed images tend to get distorted. In this paper, we propose a novel display calibration method to fix the situation. In our method, a pattern image is displayed on the panel and a camera takes its pictures twice at different positions. Then, based on a quantitative model, we extract all display parameters (i.e., pitch, slanted angle, gap or thickness, offset) from the observed patterns in the captured images. For high accuracy and robustness, our method analyzes the patterns mostly in frequency domain. We conduct two types of experiments for validation; one with optical simulation for quantitative results and the other with real-life displays for qualitative assessment. Experimental results demonstrate that our method is quite accurate, about a half order of magnitude higher than prior work; is efficient, spending less than 2 s for computation; and is robust to noise, working well in the SNR regime as low as 6 dB.