Woo-han Yun

RO, Feb 11, 2025

Space-Aware Instruction Tuning: Dataset and Benchmark for Guide Dog Robots Assisting the Visually Impaired

ByungOk Han, Woo-han Yun, Beom-Su Seo et al.

Guide dog robots offer promising solutions to enhance mobility and safety for visually impaired individuals, addressing the limitations of traditional guide dogs, particularly in perceptual intelligence and communication. With the emergence of Vision-Language Models (VLMs), robots are now capable of generating natural language descriptions of their surroundings, aiding in safer decision-making. However, existing VLMs often struggle to accurately interpret and convey spatial relationships, which is crucial for navigation in complex environments such as street crossings. We introduce the Space-Aware Instruction Tuning (SAIT) dataset and the Space-Aware Benchmark (SA-Bench) to address the limitations of current VLMs in understanding physical environments. Our automated data generation pipeline focuses on the virtual path to the destination in 3D space and the surroundings, enhancing environmental comprehension and enabling VLMs to provide more accurate guidance to visually impaired individuals. We also propose an evaluation protocol to assess VLM effectiveness in delivering walking guidance. Comparative experiments demonstrate that our space-aware instruction-tuned model outperforms state-of-the-art algorithms. We have fully open-sourced the SAIT dataset and SA-Bench, along with the related code, at https://github.com/byungokhan/Space-awareVLM

RO, Sep 26, 2019

Cut-and-Paste Dataset Generation for Balancing Domain Gaps in Object Instance Detection

Woo-han Yun, Taewoo Kim, Jaeyeon Lee et al.

Training an object instance detector when only a few training object images are available is a challenging task. One solution is a cut-and-paste method that generates a training dataset by cutting object areas out of training images and pasting them onto other background images. A detector trained on a dataset generated with a cut-and-paste method suffers from the conventional domain shift problem, which stems from a discrepancy between the source domain (generated training dataset) and the target domain (real test dataset). Though state-of-the-art domain adaptation methods are able to reduce this gap, the reduction is limited because they do not consider the difference between the domain gaps of the foreground and the background. In this study, we show that the conventional domain gap can be divided into two sub-domain gaps, one for the foreground and one for the background. Then, we show that the original cut-and-paste approach suffers from a new domain gap problem, unbalanced domain gaps, because it has two separate source domains for foreground and background, unlike the conventional domain shift problem. Next, we introduce an advanced cut-and-paste method that balances the unbalanced domain gaps by diversifying the foreground with GAN (generative adversarial network)-generated seed images and simplifying the background using image processing techniques. Experimental results show that our method is effective for balancing domain gaps and improving the accuracy of object instance detection in a cluttered indoor environment using only a few seed images. Furthermore, we show that balancing domain gaps can improve the detection accuracy of state-of-the-art domain adaptation methods.
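
The basic cut-and-paste step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the plain compositing step (no GAN-generated foregrounds, no background simplification), images are assumed to be 2D lists of pixel values, and the names `paste_object` and `generate_sample` are hypothetical.

```python
import random


def paste_object(background, obj, mask, top, left):
    """Paste `obj` onto a copy of `background` where `mask` is truthy.

    Returns the composited image and the bounding box of the pasted
    patch as (x_min, y_min, x_max, y_max), with exclusive max bounds.
    """
    h, w = len(obj), len(obj[0])
    if top < 0 or left < 0 or top + h > len(background) or left + w > len(background[0]):
        raise ValueError("object does not fit at the requested position")
    out = [row[:] for row in background]  # leave the input background untouched
    for r in range(h):
        for c in range(w):
            if mask[r][c]:  # copy only foreground pixels
                out[top + r][left + c] = obj[r][c]
    return out, (left, top, left + w, top + h)


def generate_sample(background, obj, mask, rng):
    """Place the object at a random valid position (assumed placement policy)."""
    h, w = len(obj), len(obj[0])
    top = rng.randrange(len(background) - h + 1)
    left = rng.randrange(len(background[0]) - w + 1)
    return paste_object(background, obj, mask, top, left)


if __name__ == "__main__":
    background = [[0] * 5 for _ in range(4)]
    obj = [[9, 9], [9, 9]]
    mask = [[1, 0], [1, 1]]  # top-right pixel of the patch is background
    image, bbox = generate_sample(background, obj, mask, random.Random(0))
    print(bbox)
    for row in image:
        print(row)
```

Each call yields one synthetic training image together with a bounding-box label, which is what a detector would be trained on in this kind of pipeline.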
