11 Papers

RO | Nov 18, 2022 | Code
Rationale-aware Autonomous Driving Policy utilizing Safety Force Field implemented on CARLA Simulator

Ho Suk, Taewoo Kim, Hyungbin Park et al.

Despite the rapid improvement of autonomous driving technology in recent years, automotive manufacturers must resolve liability issues to commercialize autonomous passenger cars of SAE J3016 Level 3 or higher. To cope with product liability law, manufacturers develop autonomous driving systems in compliance with international safety standards such as ISO 26262 and ISO 21448. Concerning the safety of the intended functionality (SOTIF) requirement in ISO 21448, it is recommended that the driving policy provide an explicit rational basis for maneuver decisions. In this case, mathematical models such as Safety Force Field (SFF) and Responsibility-Sensitive Safety (RSS), which make decisions interpretable, may be suitable. In this work, we implement SFF from scratch as a substitute for NVIDIA's undisclosed source code and integrate it with the CARLA open-source simulator. Using SFF and CARLA, we present a predictor for the claimed sets of vehicles and, based on the predictor, propose an integrated driving policy that operates consistently regardless of the safety conditions it encounters while passing through dynamic traffic. The policy does not have a separate plan for each condition; instead, using safety potential, it aims for human-like driving that blends in with traffic flow.

26.5 | CV | May 4 | Code
SpectraDINO: Bridging the Spectral Gap in Vision Foundation Models via Lightweight Adapters

Yagiz Nalcakan, Hyeongjin Ju, Incheol Park et al.

Vision Foundation Models (VFMs) pretrained on large-scale RGB data have demonstrated remarkable representation quality, yet their applicability to multispectral imaging spanning Near-Infrared (NIR), Short-Wave Infrared (SWIR), and Long-Wave Infrared (LWIR) remains largely unexplored. These spectral modalities offer complementary sensing capabilities critical for robust perception in adverse conditions, but present a fundamental domain gap relative to RGB-centric pretrained models. We present SpectraDINO, a multispectral VFM that bridges this spectral gap by extending DINOv2 ViT backbones to beyond-visible modalities through lightweight, per-modality bottleneck adapters, while preserving the rich representations of the frozen RGB backbone. We introduce a multi-stage teacher-student training protocol in which a frozen DINOv2 teacher guides a spectral student via cosine distillation, symmetric contrastive loss, patch-level alignment, and a novel neighborhood-structure-preservation loss. This staged curriculum enables strong cross-modal alignment without catastrophic forgetting of RGB priors. We evaluate SpectraDINO on multispectral object detection and semantic segmentation across challenging NIR, SWIR, and LWIR benchmarks using widely adopted fusion strategies. SpectraDINO achieves state-of-the-art performance across most benchmarks, validating its effectiveness as a general-purpose backbone for spectral generalization. The code and weights for model variants are available at https://github.com/Yonsei-STL/SpectraDINO.
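Two of the loss terms named in the abstract, cosine distillation and a symmetric contrastive loss, can be illustrated with a minimal pure-Python sketch. This is not the SpectraDINO implementation: the function names, the temperature value of 0.07, and the InfoNCE-style form of the contrastive term are all assumptions made for illustration, and the patch-level and neighborhood-structure terms are omitted.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_distillation_loss(student, teacher):
    """Mean of (1 - cosine similarity) over paired student/teacher embeddings."""
    pairs = list(zip(student, teacher))
    return sum(1.0 - cosine_similarity(s, t) for s, t in pairs) / len(pairs)

def _diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(student, teacher, temperature=0.07):
    """Average of student-to-teacher and teacher-to-student contrastive terms.

    The temperature is an assumed value, not one taken from the paper.
    """
    logits = [[cosine_similarity(s, t) / temperature for t in teacher]
              for s in student]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (_diagonal_cross_entropy(logits)
                  + _diagonal_cross_entropy(transposed))
```

Both terms are small when each spectral (student) embedding points in the same direction as the RGB (teacher) embedding of the same scene, and the contrastive term additionally penalizes similarity to embeddings of other scenes in the batch.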

CV | Sep 25, 2024
Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation

Youngwan Jin, Incheol Park, Hanbin Song et al.

This paper proposes Pix2Next, a novel image-to-image translation framework designed to address the challenge of generating high-quality Near-Infrared (NIR) images from RGB inputs. Our approach leverages a state-of-the-art Vision Foundation Model (VFM) within an encoder-decoder architecture, incorporating cross-attention mechanisms to enhance feature integration. This design captures detailed global representations and preserves essential spectral characteristics, treating RGB-to-NIR translation as more than a simple domain transfer problem. A multi-scale PatchGAN discriminator ensures realistic image generation at various detail levels, while carefully designed loss functions couple global context understanding with local feature preservation. We performed experiments on the RANUS dataset to demonstrate Pix2Next's advantages in quantitative metrics and visual quality, improving the FID score by 34.81% compared to existing methods. Furthermore, we demonstrate the practical utility of Pix2Next by showing improved performance on a downstream object detection task using generated NIR data to augment limited real NIR datasets. The proposed approach enables the scaling up of NIR datasets without additional data acquisition or annotation efforts, potentially accelerating advancements in NIR-based computer vision applications.

CV | Feb 17
RPT-SR: Regional Prior attention Transformer for infrared image Super-Resolution

Youngwan Jin, Incheol Park, Yagiz Nalcakan et al.

General-purpose super-resolution models, particularly Vision Transformers, have achieved remarkable success but exhibit fundamental inefficiencies in common infrared imaging scenarios like surveillance and autonomous driving, which operate from fixed or nearly static viewpoints. These models fail to exploit the strong, persistent spatial priors inherent in such scenes, leading to redundant learning and suboptimal performance. To address this, we propose the Regional Prior attention Transformer for infrared image Super-Resolution (RPT-SR), a novel architecture that explicitly encodes scene layout information into the attention mechanism. Our core contribution is a dual-token framework that fuses (1) learnable, regional prior tokens, which act as a persistent memory for the scene's global structure, with (2) local tokens that capture the frame-specific content of the current input. By integrating these tokens into the attention mechanism, our model allows the priors to dynamically modulate the local reconstruction process. Extensive experiments validate our approach. While most prior works focus on a single infrared band, we demonstrate the broad applicability and versatility of RPT-SR by establishing new state-of-the-art performance across diverse datasets covering both Long-Wave (LWIR) and Short-Wave (SWIR) spectra.

QUANT-PH | Jan 10, 2025
Q-MAML: Quantum Model-Agnostic Meta-Learning for Variational Quantum Algorithms

Junyong Lee, JeiHee Cho, Shiho Kim

In the Noisy Intermediate-Scale Quantum (NISQ) era, using variational quantum algorithms (VQAs) to solve optimization problems has become a key application. However, these algorithms face significant challenges, such as choosing an effective initial set of parameters and the limited quantum processing time that restricts the number of optimization iterations. In this study, we introduce a new framework for optimizing parameterized quantum circuits (PQCs) that employs a classical optimizer, inspired by the Model-Agnostic Meta-Learning (MAML) technique. This approach aims to achieve better parameter initialization that ensures fast convergence. Our framework features a classical neural network, called Learner, which interacts with a PQC by supplying its output as the PQC's initial parameters. During the pre-training phase, Learner is trained with a meta-objective based on the quantum circuit cost function. In the adaptation phase, the framework requires only a few PQC updates to converge to a more accurate value, while Learner remains unchanged. This method is highly adaptable and extends effectively to various Hamiltonian optimization problems. We validate our approach through experiments, including distribution function mapping and optimization of the Heisenberg XYZ Hamiltonian. The results suggest that Learner successfully estimates initial parameters that generalize across the problem space, enabling fast adaptation.
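The adaptation phase described above (a few PQC updates from a supplied initial parameter) can be illustrated on a toy problem. The sketch below is not the Q-MAML code: it assumes a one-parameter circuit, RY(theta) applied to |0>, whose Z expectation is cos(theta), and it replaces the trained Learner with a hard-coded stand-in value. The step count and learning rate are likewise illustrative choices.

```python
import math

def cost(theta):
    """Expectation <Z> after RY(theta) on |0>, which equals cos(theta)."""
    return math.cos(theta)

def parameter_shift_gradient(f, theta):
    """Exact gradient for a single rotation gate via the parameter-shift rule."""
    return 0.5 * (f(theta + math.pi / 2) - f(theta - math.pi / 2))

def adapt(theta_init, steps=5, learning_rate=0.5):
    """Run a few gradient-descent updates starting from theta_init."""
    theta = theta_init
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(cost, theta)
    return theta

# Hypothetical stand-in for a trained Learner's output, close to the optimum at pi.
learner_init = 2.8
# A poor initialization far from the optimum, for comparison.
naive_init = 0.1

cost_from_learner = cost(adapt(learner_init))
cost_from_naive = cost(adapt(naive_init))
```

With the same budget of five updates, the run that starts near the optimum ends much closer to the minimum cost of -1 than the run that starts far away, which is the effect a good initialization is meant to provide.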

QUANT-PH | Mar 17, 2025
Enhancing Circuit Trainability with Selective Gate Activation Strategy

Jeihee Cho, Junyong Lee, Daniel Justice et al.

Hybrid quantum-classical computing relies heavily on Variational Quantum Algorithms (VQAs) to tackle challenges in diverse fields like quantum chemistry and machine learning. However, VQAs face a critical limitation: the balance between circuit trainability and expressibility. Trainability, the ease of optimizing circuit parameters for problem-solving, is often hampered by the barren plateau phenomenon, where gradients vanish and hinder optimization. On the other hand, increasing expressibility, the ability to represent a wide range of quantum states, often necessitates deeper circuits with more parameters, which in turn exacerbates trainability issues. In this work, we investigate selective gate activation strategies as a potential solution to these challenges within the context of Variational Quantum Eigensolvers (VQEs). We evaluate three different approaches: activating gates randomly without considering their type or parameter magnitude, activating gates randomly but limited to a single gate type, and activating gates based on the magnitude of their parameter values. Experimental results reveal that the magnitude-based strategy surpasses the other methods, achieving improved convergence.
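A magnitude-based selection rule can be sketched as a mask over the circuit's parameters. The abstract does not say whether gates with large or small parameter magnitudes are activated, nor what happens to inactive gates, so the sketch below makes two labeled assumptions: the gates with the largest absolute parameter values are activated, and inactive gates simply keep their current parameters. All function names are hypothetical.

```python
def magnitude_based_mask(params, num_active):
    """Mark the num_active parameters with the largest |value| as active.

    Assumption: larger magnitude means "activate"; the source does not specify.
    """
    ranked = sorted(range(len(params)), key=lambda i: abs(params[i]), reverse=True)
    active = set(ranked[:num_active])
    return [i in active for i in range(len(params))]

def masked_update(params, grads, mask, learning_rate=0.1):
    """Apply a gradient step only to active parameters; others stay frozen."""
    return [p - learning_rate * g if m else p
            for p, g, m in zip(params, grads, mask)]
```

For example, with parameters `[0.1, -2.0, 0.5, 1.5]` and two active gates, the mask selects the second and fourth entries, and only those two change during the update.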

CV | Apr 10, 2025
RASMD: RGB And SWIR Multispectral Driving Dataset for Robust Perception in Adverse Conditions

Youngwan Jin, Michal Kovac, Yagiz Nalcakan et al.

Current autonomous driving algorithms heavily rely on the visible spectrum, which is prone to performance degradation in adverse conditions like fog, rain, snow, glare, and high contrast. Although other spectral bands like near-infrared (NIR) and long-wave infrared (LWIR) can enhance vision perception in such situations, they have limitations and lack large-scale datasets and benchmarks. Short-wave infrared (SWIR) imaging offers several advantages over NIR and LWIR. However, no publicly available large-scale datasets currently incorporate SWIR data for autonomous driving. To address this gap, we introduce the RGB and SWIR Multispectral Driving (RASMD) dataset, which comprises 100,000 synchronized and spatially aligned RGB-SWIR image pairs collected across diverse locations, lighting, and weather conditions. In addition, we provide a subset for RGB-to-SWIR translation, along with object detection annotations for a subset of challenging traffic scenarios, to demonstrate the utility of SWIR imaging through experiments on both object detection and RGB-to-SWIR image translation. Our experiments show that combining RGB and SWIR data in an ensemble framework significantly improves detection accuracy compared to RGB-only approaches, particularly in conditions where visible-spectrum sensors struggle. We anticipate that the RASMD dataset will advance research in multispectral imaging for autonomous driving and robust perception systems.

LG | Feb 17, 2022
A Survey on Deep Reinforcement Learning-based Approaches for Adaptation and Generalization

Pamul Yadav, Ashutosh Mishra, Junyong Lee et al.

Deep Reinforcement Learning (DRL) aims to create intelligent agents that can learn to solve complex problems efficiently in a real-world environment. Typically, two learning goals, adaptation and generalization, are used to baseline a DRL algorithm's performance on different tasks and domains. This paper presents a survey of recent developments in DRL-based approaches for adaptation and generalization. We begin by formulating these goals in the context of task and domain. Then we review recent works under those approaches and discuss future research directions through which DRL algorithms' adaptability and generalizability can be enhanced, potentially making them applicable to a broad range of real-world problems.

SY | Jan 12, 2020
Self-Driving like a Human driver instead of a Robocar: Personalized comfortable driving experience for autonomous vehicles

Il Bae, Jaeyoung Moon, Junekyo Jhung et al.

This paper presents an integrated control system for self-driving vehicles that is based on personal driving preferences and provides a personalized, comfortable driving experience to autonomous vehicle users. We propose an Occupant's Preference Metric (OPM), which defines a preferred lateral and longitudinal acceleration region with a maximum allowable jerk for users. Moreover, we propose a vehicle controller based on control parameters that enables integrated lateral and longitudinal control via preference-aware maneuvering of autonomous vehicles. The proposed system not only provides criteria for the occupant's driving preference, but also provides a personalized self-driving style like a human driver instead of a Robocar. The simulation and experimental results demonstrate that the proposed system can maneuver the self-driving vehicle like a human driver by tracking the specified criterion of admissible acceleration and jerk.
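The idea of an admissible acceleration region with a jerk limit can be made concrete with a small check over a sampled trajectory. The abstract does not give the shape of the region or any numeric limits, so this sketch assumes an elliptical region in the longitudinal/lateral acceleration plane and estimates jerk by finite differences. The class name, field names, and function name are hypothetical and are not the paper's OPM definition.

```python
from dataclasses import dataclass

@dataclass
class OccupantPreference:
    """Hypothetical preference limits; values are supplied by the caller."""
    max_longitudinal_accel: float  # m/s^2
    max_lateral_accel: float       # m/s^2
    max_jerk: float                # m/s^3

def within_preference(pref, accel_samples, dt):
    """Check (longitudinal, lateral) acceleration samples taken every dt seconds.

    Assumption: the preferred region is an ellipse; the source does not specify.
    """
    for a_lon, a_lat in accel_samples:
        ratio = ((a_lon / pref.max_longitudinal_accel) ** 2
                 + (a_lat / pref.max_lateral_accel) ** 2)
        if ratio > 1.0:
            return False
    # Jerk is approximated as the change in acceleration between samples.
    for (lon0, lat0), (lon1, lat1) in zip(accel_samples, accel_samples[1:]):
        if abs(lon1 - lon0) / dt > pref.max_jerk:
            return False
        if abs(lat1 - lat0) / dt > pref.max_jerk:
            return False
    return True
```

A controller could use a check like this to reject candidate maneuvers that leave the occupant's preferred region before they are executed.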

CR | Jul 24, 2017
Intelligent Vehicle-Trust Point: Reward based Intelligent Vehicle Communication using Blockchain

Madhusudan Singh, Shiho Kim

The intelligent vehicle (IV) is experiencing revolutionary growth in research and industry, but it still suffers from many security vulnerabilities. Traditional security methods are incapable of providing secure IV communication. The major issues in IV communication are trust, data accuracy, and reliability of the data in the communication channel. Blockchain technology, which underpins the cryptocurrency Bitcoin, has recently been used to build trust and reliability in peer-to-peer networks with topologies similar to IV communication. In this paper, we propose the Intelligent Vehicle-Trust Point (IV-TP) mechanism for communication among IVs using blockchain technology. Our proposed IV-TP provides security and reliability for the data that IVs communicate. The IV-TP mechanism establishes trustworthiness with respect to vehicles' behavior and their legal and illegal actions. Our proposal presents a reward-based system in which IVs exchange some IV-TP during successful communication. For the data management of the IV-TP, we use blockchain technology in the intelligent transportation system (ITS), which stores all IV-TP details of every vehicle and is accessed ubiquitously by IVs. We evaluate our proposal with an intersection use-case scenario for intelligent vehicle communication.
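The combination of a reward exchange and a tamper-evident record can be illustrated with a minimal hash-chained ledger. This is a toy sketch, not the IV-TP protocol: the block fields, the vehicle identifiers, and the function names are hypothetical, and there is no consensus, signing, or networking.

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_transfer(chain, sender, receiver, points):
    """Record a trust-point transfer, linked to the previous block by its hash."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "sender": sender,
        "receiver": receiver,
        "points": points,
        "prev_hash": previous,
    })

def is_valid(chain):
    """Every block must reference the hash of the block before it."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

def balance(chain, vehicle):
    """Net trust points received minus points sent by a vehicle."""
    received = sum(b["points"] for b in chain if b["receiver"] == vehicle)
    sent = sum(b["points"] for b in chain if b["sender"] == vehicle)
    return received - sent
```

Because each block stores the hash of its predecessor, changing the points in an earlier transfer invalidates every later link, which is the property that makes the stored trust record hard to alter unnoticed.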

CR | Jul 20, 2017
Blockchain Based Intelligent Vehicle Data sharing Framework

Madhusudan Singh, Shiho Kim

The intelligent vehicle (IV) is experiencing revolutionary growth in research and industry, but it still suffers from many security vulnerabilities. Traditional security methods are incapable of providing secure IV data sharing. The major issues in IV data sharing are trust, data accuracy, and reliability of the shared data in the communication channel. Blockchain technology, which underpins the cryptocurrency Bitcoin, has recently been used to build trust and reliability in peer-to-peer networks with topologies similar to IV data sharing. In this paper, we propose a trust-environment-based Intelligent Vehicle data sharing framework. In the proposed framework, we use blockchain technology as the backbone of the IV data-sharing environment. Blockchain technology provides the trust environment between vehicles, based on proof of driving.