Hyoseok Hwang

CV
h-index13
11papers
22citations
Novelty50%
AI Score51

11 Papers

CVApr 7, 2023
Automated Solubility Analysis System and Method Using Computer Vision and Machine Learning

Gahee Kim, Minwoo Jeon, Hyun Do Choi et al.

In this study, a novel active solubility sensing device using computer vision is proposed to improve separation purification performance and prevent malfunctions of separation equipment such as preparative liquid chromatographers and evaporators. The proposed device actively measures the solubility by transmitting a solution using a background image. The proposed system is a combination of a device that uses a background image and a method for estimating the dissolution and particle presence by changing the background image. The proposed device consists of four parts: camera, display, adjustment, and server units. The camera unit is made up of a rear image sensor on a mobile phone. The display unit is comprised of a tablet screen. The adjustment unit is composed of rotating and height-adjustment jigs. Finally, the server unit consists of a socket server for communication between the units and a PC, including an automated solubility analysis system implemented in Python. The dissolution status of the solution was divided into four categories and a case study was conducted. The algorithms were trained based on these results. Six organic materials and four organic solvents were combined with 202 tests to train the developed algorithm. As a result, the evaluation rate for the dissolution state exhibited an accuracy of 95 %. In addition, the device and method must develop a feedback function that can add a solvent or solute after dissolution detection using solubility results for use in autonomous systems, such as a synthetic automation system. Finally, the diversification of the sensing method is expected to extend not only to the solution but also to the solubility and homogeneity analysis of the film.

CVNov 17, 2023
A2XP: Towards Private Domain Generalization

Geunhyeok Yu, Hyoseok Hwang

Deep Neural Networks (DNNs) have become pivotal in various fields, especially in computer vision, outperforming previous methodologies. A critical challenge in their deployment is the bias inherent in data across different domains, such as image style and environmental conditions, leading to domain gaps. This necessitates techniques for learning general representations from biased training data, known as domain generalization. This paper presents Attend to eXpert Prompts (A2XP), a novel approach for domain generalization that preserves the privacy and integrity of the network architecture. A2XP consists of two phases: Expert Adaptation and Domain Generalization. In the first phase, prompts for each source domain are optimized to guide the model towards the optimal direction. In the second phase, two embedder networks are trained to effectively amalgamate these expert prompts, aiming for an optimal output. Our extensive experiments demonstrate that A2XP achieves state-of-the-art results over existing non-private domain generalization methods. The experimental results validate that the proposed approach not only tackles the domain generalization challenge in DNNs but also offers a privacy-preserving, efficient solution to the broader field of computer vision.

CVMar 9, 2023
Decision-BADGE: Decision-based Adversarial Batch Attack with Directional Gradient Estimation

Geunhyeok Yu, Minwoo Jeon, Hyoseok Hwang

The susceptibility of deep neural networks (DNNs) to adversarial examples has prompted an increase in the deployment of adversarial attacks. Image-agnostic universal adversarial perturbations (UAPs) are much more threatening, but many limitations exist to implementing UAPs in real-world scenarios where only binary decisions are returned. In this research, we propose Decision-BADGE, a novel method to craft universal adversarial perturbations for executing decision-based black-box attacks. To optimize perturbation with decisions, we addressed two challenges, namely the magnitude and the direction of the gradient. First, we use batch loss, differences from distributions of ground truth, and accumulating decisions in batches to determine the magnitude of the gradient. This magnitude is applied in the direction of the revised simultaneous perturbation stochastic approximation (SPSA) to update the perturbation. This simple yet efficient method can be easily extended to score-based attacks as well as targeted attacks. Experimental validation across multiple victim models demonstrates that the Decision-BADGE outperforms existing attack methods, even image-specific and score-based attacks. In particular, our proposed method shows a superior success rate with less training time. The research also shows that Decision-BADGE can successfully deceive unseen victim models and accurately target specific classes.

CVApr 3, 2025Code
ESC: Erasing Space Concept for Knowledge Deletion

Tae-Young Lee, Sundong Park, Minwoo Jeon et al.

As concerns regarding privacy in deep learning continue to grow, individuals are increasingly apprehensive about the potential exploitation of their personal knowledge in trained models. Despite several research efforts to address this, they often fail to consider the real-world demand from users for complete knowledge erasure. Furthermore, our investigation reveals that existing methods have a risk of leaking personal knowledge through embedding features. To address these issues, we introduce a novel concept of Knowledge Deletion (KD), an advanced task that considers both concerns, and provides an appropriate metric, named Knowledge Retention score (KR), for assessing knowledge retention in feature space. To achieve this, we propose a novel training-free erasing approach named Erasing Space Concept (ESC), which restricts the important subspace for the forgetting knowledge by eliminating the relevant activations in the feature. In addition, we suggest ESC with Training (ESC-T), which uses a learnable mask to better balance the trade-off between forgetting and preserving knowledge in KD. Our extensive experiments on various datasets and models demonstrate that our proposed methods achieve the fastest and state-of-the-art performance. Notably, our methods are applicable to diverse forgetting scenarios, such as facial domain setting, demonstrating the generalizability of our methods. The code is available at http://github.com/KU-VGI/ESC .

LGJan 30
Stabilizing the Q-Gradient Field for Policy Smoothness in Actor-Critic

Jeong Woon Lee, Kyoleen Kwak, Daeho Kim et al.

Policies learned via continuous actor-critic methods often exhibit erratic, high-frequency oscillations, making them unsuitable for physical deployment. Current approaches attempt to enforce smoothness by directly regularizing the policy's output. We argue that this approach treats the symptom rather than the cause. In this work, we theoretically establish that policy non-smoothness is fundamentally governed by the differential geometry of the critic. By applying implicit differentiation to the actor-critic objective, we prove that the sensitivity of the optimal policy is bounded by the ratio of the Q-function's mixed-partial derivative (noise sensitivity) to its action-space curvature (signal distinctness). To empirically validate this theoretical insight, we introduce PAVE (Policy-Aware Value-field Equalization), a critic-centric regularization framework that treats the critic as a scalar field and stabilizes its induced action-gradient field. PAVE rectifies the learning signal by minimizing the Q-gradient volatility while preserving local curvature. Experimental results demonstrate that PAVE achieves smoothness and robustness comparable to policy-side smoothness regularization methods, while maintaining competitive task performance, without modifying the actor.

LGJan 27
Scale-Consistent State-Space Dynamics via Fractal of Stationary Transformations

Geunhyeok Yu, Hyoseok Hwang

Recent deep learning models increasingly rely on depth without structural guarantees on the validity of intermediate representations, rendering early stopping and adaptive computation ill-posed. We address this limitation by formulating a structural requirement for state-space model's scale-consistent latent dynamics across iterative refinement, and derive Fractal of Stationary Transformations (FROST), which enforces a self-similar representation manifold through a fractal inductive bias. Under this geometry, intermediate states correspond to different resolutions of a shared representation, and we provide a geometric analysis establishing contraction and stable convergence across iterations. As a consequence of this scale-consistent structure, halting naturally admits a ranking-based formulation driven by intrinsic feature quality rather than extrinsic objectives. Controlled experiments on ImageNet-100 empirically verify the predicted scale-consistent behavior, showing that adaptive efficiency emerges from the aligned latent geometry.

LGJan 26
Enhancing Control Policy Smoothness by Aligning Actions with Predictions from Preceding States

Kyoleen Kwak, Hyoseok Hwang

Deep reinforcement learning has proven to be a powerful approach to solving control tasks, but its characteristic high-frequency oscillations make it difficult to apply in real-world environments. While prior methods have addressed action oscillations via architectural or loss-based methods, the latter typically depend on heuristic or synthetic definitions of state similarity to promote action consistency, which often fail to accurately reflect the underlying system dynamics. In this paper, we propose a novel loss-based method by introducing a transition-induced similar state. The transition-induced similar state is defined as the distribution of next states transitioned from the previous state. Since it utilizes only environmental feedback and actually collected data, it better captures system dynamics. Building upon this foundation, we introduce Action Smoothing by Aligning Actions with Predictions from Preceding States (ASAP), an action smoothing method that effectively mitigates action oscillations. ASAP enforces action smoothness by aligning the actions with those taken in transition-induced similar states and by penalizing second-order differences to suppress high-frequency oscillations. Experiments in Gymnasium and Isaac-Lab environments demonstrate that ASAP yields smoother control and improved policy performance over existing methods.

CVFeb 22
Keep it SymPL: Symbolic Projective Layout for Allocentric Spatial Reasoning in Vision-Language Models

Jaeyun Jang, Seunghui Shin, Taeho Park et al.

Perspective-aware spatial reasoning involves understanding spatial relationships from specific viewpoints-either egocentric (observer-centered) or allocentric (object-centered). While vision-language models (VLMs) perform well in egocentric settings, their performance deteriorates when reasoning from allocentric viewpoints, where spatial relations must be inferred from the perspective of objects within the scene. In this study, we address this underexplored challenge by introducing Symbolic Projective Layout (SymPL), a framework that reformulates allocentric reasoning into symbolic-layout forms that VLMs inherently handle well. By leveraging four key factors-projection, abstraction, bipartition, and localization-SymPL converts allocentric questions into structured symbolic-layout representations. Extensive experiments demonstrate that this reformulation substantially improves performance in both allocentric and egocentric tasks, enhances robustness under visual illusions and multi-view scenarios, and that each component contributes critically to these gains. These results show that SymPL provides an effective and principled approach for addressing complex perspective-aware spatial reasoning.

CVSep 25, 2025
SiNGER: A Clearer Voice Distills Vision Transformers Further

Geunhyeok Yu, Sunjae Jeong, Yoonyoung Choi et al.

Vision Transformers are widely adopted as the backbone of vision foundation models, but they are known to produce high-norm artifacts that degrade representation quality. When knowledge distillation transfers these features to students, high-norm artifacts dominate the objective, so students overfit to artifacts and underweight informative signals, diminishing the gains from larger models. Prior work attempted to remove artifacts but encountered an inherent trade-off between artifact suppression and preserving informative signals from teachers. To address this, we introduce Singular Nullspace-Guided Energy Reallocation (SiNGER), a novel distillation framework that suppresses artifacts while preserving informative signals. The key idea is principled teacher feature refinement: during refinement, we leverage the nullspace-guided perturbation to preserve information while suppressing artifacts. Then, the refined teacher's features are distilled to a student. We implement this perturbation efficiently with a LoRA-based adapter that requires minimal structural modification. Extensive experiments show that \oursname consistently improves student models, achieving state-of-the-art performance in multiple downstream tasks and producing clearer and more interpretable representations.

ROMar 11, 2025
ForceGrip: Reference-Free Curriculum Learning for Realistic Grip Force Control in VR Hand Manipulation

DongHeun Han, Byungmin Kim, RoUn Lee et al.

Realistic Hand manipulation is a key component of immersive virtual reality (VR), yet existing methods often rely on kinematic approach or motion-capture datasets that omit crucial physical attributes such as contact forces and finger torques. Consequently, these approaches prioritize tight, one-size-fits-all grips rather than reflecting users' intended force levels. We present ForceGrip, a deep learning agent that synthesizes realistic hand manipulation motions, faithfully reflecting the user's grip force intention. Instead of mimicking predefined motion datasets, ForceGrip uses generated training scenarios-randomizing object shapes, wrist movements, and trigger input flows-to challenge the agent with a broad spectrum of physical interactions. To effectively learn from these complex tasks, we employ a three-phase curriculum learning framework comprising Finger Positioning, Intention Adaptation, and Dynamic Stabilization. This progressive strategy ensures stable hand-object contact, adaptive force control based on user inputs, and robust handling under dynamic conditions. Additionally, a proximity reward function enhances natural finger motions and accelerates training convergence. Quantitative and qualitative evaluations reveal ForceGrip's superior force controllability and plausibility compared to state-of-the-art methods. Demo videos are available as supplementary material and the code is provided at https://han-dongheun.github.io/ForceGrip.

CVJun 23, 2016
3D Display Calibration by Visual Pattern Analysis

Hyoseok Hwang, Hyun Sung Chang, Dongkyung Nam et al.

Nearly all 3D displays need calibration for correct rendering. More often than not, the optical elements in a 3D display are misaligned from the designed parameter setting. As a result, 3D magic does not perform well as intended. The observed images tend to get distorted. In this paper, we propose a novel display calibration method to fix the situation. In our method, a pattern image is displayed on the panel and a camera takes its pictures twice at different positions. Then, based on a quantitative model, we extract all display parameters (i.e., pitch, slanted angle, gap or thickness, offset) from the observed patterns in the captured images. For high accuracy and robustness, our method analyzes the patterns mostly in frequency domain. We conduct two types of experiments for validation; one with optical simulation for quantitative results and the other with real-life displays for qualitative assessment. Experimental results demonstrate that our method is quite accurate, about a half order of magnitude higher than prior work; is efficient, spending less than 2 s for computation; and is robust to noise, working well in the SNR regime as low as 6 dB.