CV Oct 19, 2022
Segmentation-free Direct Iris Localization Networks
Takahiro Toizumi, Koichi Takahashi, Masato Tsukada
This paper proposes an efficient iris localization method that uses neither iris segmentation nor circle fitting. Conventional iris localization methods first extract iris regions with semantic segmentation methods such as U-Net, and then localize the inner and outer iris circles with a traditional circle fitting algorithm. However, this approach requires high-resolution encoder-decoder networks for iris segmentation, which makes it computationally expensive. In addition, traditional circle fitting tends to be sensitive to noise in input images and to its fitting parameters, which degrades iris recognition performance. To solve these problems, we propose an iris localization network (ILN) that directly localizes pupil and iris circles, together with eyelid points, from a low-resolution iris image. We also introduce a pupil refinement network (PRN) to improve the accuracy of pupil localization. Experimental results show that the combination of ILN and PRN runs in 34.5 ms per iris image on a CPU and that its localization performance exceeds that of conventional iris segmentation methods. In addition, generalization evaluation results show that the proposed method is more robust to datasets from different domains than segmentation-based methods. Furthermore, we confirm that the proposed ILN and PRN improve iris recognition accuracy.
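For context, the conventional circle-fitting step that the abstract contrasts against can be sketched as an algebraic least-squares fit. This is a minimal illustration of one common fitting technique (the Kasa fit), not the baseline used in the paper; the function name `fit_circle` and the synthetic boundary points are assumptions made for the example.

```python
import numpy as np

def fit_circle(points):
    """Algebraic least-squares (Kasa) circle fit; returns (cx, cy, r)."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # Rewrite (x - cx)^2 + (y - cy)^2 = r^2 as a linear system:
    # x^2 + y^2 = 2*cx*x + 2*cy*y + c, with c = r^2 - cx^2 - cy^2
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + cx**2 + cy**2)
    return cx, cy, r

# Hypothetical boundary points sampled from a circle centred at (120, 90), radius 40
theta = np.linspace(0, 2 * np.pi, 50, endpoint=False)
boundary = np.column_stack([120 + 40 * np.cos(theta), 90 + 40 * np.sin(theta)])
cx, cy, r = fit_circle(boundary)
```

Because the fit minimizes a squared error over all boundary points, points that come from a poor segmentation mask (eyelashes, reflections) pull the estimate, which is the kind of noise sensitivity the abstract refers to.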
16.5 CV Apr 17
IA-CLAHE: Image-Adaptive Clip Limit Estimation for CLAHE
Rikuto Otsuka, Yuho Shoji, Yuka Ogino et al.
This paper proposes image-adaptive contrast limited adaptive histogram equalization (IA-CLAHE). Conventional CLAHE is widely used to boost the performance of various computer vision tasks and to improve visual quality for human perception in practical industrial applications. CLAHE applies contrast-limited histogram equalization to each local region to enhance local contrast. However, CLAHE often leads to over-enhancement, because its contrast-limiting parameter, the clip limit, is fixed regardless of the histogram distribution of each local region. Our IA-CLAHE addresses this limitation by adaptively estimating tile-wise clip limits from the input image. To achieve this, we train a lightweight clip-limit estimator with a differentiable extension of CLAHE, enabling end-to-end optimization. Unlike prior learning-based CLAHE methods, IA-CLAHE does not require pre-searched ground-truth clip limits or task-specific datasets, because it learns to map input image histograms toward a domain-invariant uniform distribution, enabling zero-shot generalization across diverse conditions. Experimental results show that IA-CLAHE consistently improves recognition performance while simultaneously enhancing visual quality for human perception, without requiring any task-specific training data.
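To make the role of the clip limit concrete, the following is a minimal sketch of contrast-limited equalization for a single tile. It is the classical, non-differentiable operation, not the IA-CLAHE estimator. The function name `clipped_equalize`, the convention that the clip limit is a multiple of the mean bin count, and the synthetic tile are all assumptions for illustration. Full CLAHE also interpolates between neighbouring tiles, which is omitted here.

```python
import numpy as np

def clipped_equalize(tile, clip_limit, n_bins=256):
    """Contrast-limited histogram equalization of one uint8 tile."""
    hist = np.bincount(tile.ravel(), minlength=n_bins).astype(float)
    # Assumed convention: clip_limit is a multiple of the mean bin count
    ceiling = clip_limit * tile.size / n_bins
    excess = np.sum(np.maximum(hist - ceiling, 0.0))
    # Clip tall bins and redistribute the clipped mass uniformly
    hist = np.minimum(hist, ceiling) + excess / n_bins
    cdf = np.cumsum(hist) / hist.sum()
    lut = np.round(cdf * (n_bins - 1)).astype(np.uint8)
    return lut[tile]

# Hypothetical low-contrast tile with values confined to [100, 120)
rng = np.random.default_rng(0)
tile = rng.integers(100, 120, size=(64, 64), dtype=np.uint8)
mild = clipped_equalize(tile, clip_limit=1.0)     # strong limiting, gentle stretch
strong = clipped_equalize(tile, clip_limit=40.0)  # effectively unclipped equalization
```

With a fixed clip limit, every tile gets the same ceiling whatever its histogram looks like. A per-tile estimate, as the abstract describes, would instead choose `clip_limit` for each tile.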
CV Jul 11, 2024
Adaptive Deep Iris Feature Extractor at Arbitrary Resolutions
Yuho Shoji, Yuka Ogino, Takahiro Toizumi et al.
This paper proposes a deep feature extractor for iris recognition at arbitrary resolutions. Resolution degradation reduces the recognition performance of deep learning models trained on high-resolution images. Training on images of various resolutions can improve a model's robustness, but at the cost of recognition performance on high-resolution images. To achieve higher recognition performance at various resolutions, we propose a method of resolution-adaptive feature extraction with automatically switching networks. Our framework includes resolution expert modules specialized for different resolution degradations, including down-sampling and out-of-focus blurring. The framework automatically switches between them depending on the degradation condition of an input image. Lower-resolution experts are trained by knowledge distillation from the high-resolution expert so that both experts extract common identity features. We applied our framework to three conventional neural network models. The experimental results show that our method enhances the recognition performance of the conventional methods at low resolution while maintaining their performance at high resolution.
CV Nov 5, 2024
ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing
Yuka Ogino, Yuho Shoji, Takahiro Toizumi et al.
We propose an image-adaptive object detection method for adverse weather conditions such as fog and low light. Our framework employs differentiable preprocessing filters to perform image enhancement suited to downstream object detection. It introduces two differentiable filters: a Bézier curve-based pixel-wise (BPW) filter and a kernel-based local (KBL) filter. These filters unify the functions of classical image processing filters and improve object detection performance. We also propose a domain-agnostic data augmentation strategy using the BPW filter. Our method does not require data-specific customization of the filter combinations, parameter ranges, or data augmentation. We evaluate our proposed approach, called Enhanced Robustness by Unified Image Processing (ERUP)-YOLO, by applying it to the YOLOv3 detector. Experiments on adverse weather datasets demonstrate that our proposed filters match or exceed the expressiveness of conventional methods and that ERUP-YOLO achieves superior performance across a wide range of adverse weather conditions, including fog and low light.
CV Jan 12, 2024
Improving Low-Light Image Recognition Performance Based on Image-adaptive Learnable Module
Seitaro Ono, Yuka Ogino, Takahiro Toizumi et al.
In recent years, significant progress has been made in image recognition technology based on deep neural networks. However, improving recognition performance under low-light conditions remains a significant challenge. This study addresses the enhancement of recognition model performance in low-light conditions. We propose an image-adaptive learnable module that applies appropriate image processing to input images, together with a hyperparameter predictor that estimates the optimal parameters used in the module. Our approach improves recognition performance under low-light conditions and can be easily integrated as a front-end filter, without the need to retrain existing recognition models designed for low-light conditions. Through experiments, we demonstrate that the proposed method contributes to enhancing image recognition performance under low-light conditions.
CV Jan 8, 2025
Recognition-Oriented Low-Light Image Enhancement based on Global and Pixelwise Optimization
Seitaro Ono, Yuka Ogino, Takahiro Toizumi et al.
In this paper, we propose a novel low-light image enhancement method aimed at improving the performance of recognition models. Despite recent advances in deep learning, the recognition of images under low-light conditions remains a challenge. Although existing low-light image enhancement methods have been developed to improve image visibility for human vision, they do not specifically focus on enhancing recognition model performance. Our proposed method consists of two key modules: the Global Enhance Module, which adjusts the overall brightness and color balance of the input image, and the Pixelwise Adjustment Module, which refines image features at the pixel level. These modules are trained to enhance input images so as to improve downstream recognition model performance. Notably, the proposed method can be applied as a front-end filter to improve low-light recognition performance without retraining the downstream recognition models. Experimental results demonstrate the effectiveness of our method, which improves the performance of pretrained recognition models under low-light conditions.
CV Jun 2, 2025
Rethinking Image Histogram Matching for Image Classification
Rikuto Otsuka, Yuho Shoji, Yuka Ogino et al.
This paper rethinks image histogram matching (HM) and proposes a differentiable and parametric HM preprocessing step for a downstream classifier. Convolutional neural networks have demonstrated remarkable achievements in classification tasks. However, they often exhibit degraded performance on low-contrast images captured under adverse weather conditions. To maintain classifier performance on low-contrast images, histogram equalization (HE) is commonly used. HE is a special case of HM that uses a uniform distribution as the target pixel value distribution. In this paper, we focus on the shape of the target pixel value distribution. Compared to a uniform distribution, a single, well-designed distribution could potentially improve the performance of the downstream classifier across various adverse weather conditions. Based on this hypothesis, we propose a differentiable and parametric HM that optimizes the target distribution using the loss function of the downstream classifier. This method addresses pixel value imbalances by transforming input images with arbitrary distributions into a target distribution optimized for the classifier. Our HM is trained with the classifier on normal weather images only. Experimental results show that a classifier trained with our proposed HM outperforms conventional preprocessing methods under adverse weather conditions.
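The relationship between HM and HE can be sketched with classical CDF-based histogram matching. This is the standard non-differentiable lookup-table version, shown only to illustrate how the target distribution enters; it is not the differentiable, parametric HM proposed in the paper. The function name `match_histogram` and the synthetic image are assumptions for the example.

```python
import numpy as np

def match_histogram(image, target_pdf):
    """Map uint8 pixel values so their distribution approximates target_pdf."""
    n_bins = len(target_pdf)
    src_hist = np.bincount(image.ravel(), minlength=n_bins).astype(float)
    src_cdf = np.cumsum(src_hist) / src_hist.sum()
    tgt_cdf = np.cumsum(target_pdf) / np.sum(target_pdf)
    # For each source level, take the first target level whose CDF reaches the source CDF
    lut = np.searchsorted(tgt_cdf, src_cdf, side="left").clip(0, n_bins - 1)
    return lut.astype(np.uint8)[image]

# Hypothetical low-contrast image with values confined to [100, 120)
rng = np.random.default_rng(0)
img = rng.integers(100, 120, size=(64, 64), dtype=np.uint8)

# HE is HM with a uniform target distribution
uniform = np.full(256, 1.0 / 256)
equalized = match_histogram(img, uniform)

# Matching an image to its own histogram leaves it unchanged
unchanged = match_histogram(img, np.bincount(img.ravel(), minlength=256))
```

Replacing `uniform` with another 256-bin distribution changes the output tone distribution. The abstract's hypothesis is that some non-uniform target, learned through the classifier loss, serves the classifier better than the uniform one.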
CV Jun 2, 2025
Target Driven Adaptive Loss For Infrared Small Target Detection
Yuho Shoji, Takahiro Toizumi, Atsushi Ito
We propose a target driven adaptive (TDA) loss to enhance the performance of infrared small target detection (IRSTD). Prior works have used loss functions such as binary cross-entropy loss and IoU loss to train segmentation models for IRSTD. Minimizing these loss functions guides models to extract pixel-level features or global image context. However, these losses leave two issues open: detection performance in local regions around the targets, and robustness to small target scale and low local contrast. To address these issues, the proposed TDA loss introduces a patch-based mechanism and an adjustment strategy that adapts to target scale and local contrast. The TDA loss leads the model to focus on local regions around the targets and to pay particular attention to targets with smaller scales and lower local contrast. We evaluate the proposed method on three IRSTD datasets. The results demonstrate that the TDA loss achieves better detection performance than existing losses on these datasets.
CV May 29, 2025
CURVE: CLIP-Utilized Reinforcement Learning for Visual Image Enhancement via Simple Image Processing
Yuka Ogino, Takahiro Toizumi, Atsushi Ito
Low-Light Image Enhancement (LLIE) is crucial for improving both human perception and computer vision tasks. This paper addresses two challenges in zero-reference LLIE: obtaining perceptually 'good' images using the Contrastive Language-Image Pre-Training (CLIP) model and maintaining computational efficiency for high-resolution images. We propose CLIP-Utilized Reinforcement learning-based Visual image Enhancement (CURVE). CURVE employs a simple image processing module that adjusts the global image tone based on a Bézier curve and estimates its processing parameters iteratively. The estimator is trained by reinforcement learning with rewards designed using CLIP text embeddings. Experiments on low-light and multi-exposure datasets demonstrate the performance of CURVE in terms of enhancement quality and processing speed compared to conventional methods.
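A global tone adjustment based on a Bézier curve can be sketched as below. This is a simplified illustration rather than CURVE's actual module. It assumes the control points are evenly spaced along the input axis, so the curve reduces to a Bernstein polynomial in the input intensity, and it assumes intensities are normalized to [0, 1]. The function name `bezier_tone_curve` and the control values are hypothetical.

```python
import numpy as np
from math import comb

def bezier_tone_curve(image, control_values):
    """Apply y = sum_i B_{i,n}(x) * p_i to intensities x in [0, 1]."""
    p = np.asarray(control_values, dtype=float)
    n = len(p) - 1
    x = np.clip(np.asarray(image, dtype=float), 0.0, 1.0)
    out = np.zeros_like(x)
    for i, p_i in enumerate(p):
        # Bernstein basis polynomial B_{i,n}(x), weighted by control value p_i
        out += comb(n, i) * x**i * (1 - x) ** (n - i) * p_i
    return out

x = np.linspace(0.0, 1.0, 11)
identity = bezier_tone_curve(x, [0.0, 1 / 3, 2 / 3, 1.0])  # evenly spaced values give y = x
brighter = bezier_tone_curve(x, [0.0, 0.6, 0.9, 1.0])      # raised control values lift mid-tones
```

Because the whole adjustment is governed by a handful of control values and applied per pixel, the cost scales linearly with image size, which fits the abstract's emphasis on efficiency for high-resolution images.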
CV Feb 22, 2022
Fast Eye Detector Using Siamese Network for NIR Partial Face Images
Yuka Ogino, Yuho Shoji, Takahiro Toizumi et al.
This paper proposes a fast eye detection method based on a Siamese network for near infrared (NIR) partial face images. NIR partial face images do not include the whole face of a subject, since they are captured by iris recognition systems under constraints on frame rate and resolution. Iris recognition systems such as the iris on the move (IOTM) system require fast and accurate eye detection as a preprocessing step. Our goal is to design eye detection with high speed, high discrimination performance between left and right eyes, and high positional accuracy of the eye center. Our method adopts a Siamese network and coarse-to-fine position estimation with a fast, lightweight CNN backbone. The network outputs image features and a similarity map indicating the coarse position of an eye. A regression on the portion of the features with high similarity refines the coarse position to obtain a fine, highly accurate position. We demonstrate the effectiveness of the proposed method by comparing it with conventional methods, including SOTA, in terms of positional accuracy, discrimination performance, and processing speed. Our method achieves superior performance in speed.
CV Feb 6, 2018
Rollable Latent Space for Azimuth Invariant SAR Target Recognition
Kazutoshi Sagi, Takahiro Toizumi, Yuzo Senda
This paper proposes a rollable latent space (RLS) for azimuth-invariant synthetic aperture radar (SAR) target recognition. Scarce labeled data and limited viewing directions are critical issues in SAR target recognition. The RLS is a designed space in which rolling the latent features corresponds to a 3D rotation of an object. Thus, the latent features of an arbitrary view can be inferred from those of different views. This characteristic further enables us to augment data from limited views in the RLS. RLS-based classifiers with and without data augmentation and a conventional classifier, all trained with target front shots, are evaluated on untrained target back shots. Results show that the RLS-based classifier with augmentation improves accuracy by 30% compared to the conventional classifier.
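The idea that rolling latent features corresponds to rotating the object can be illustrated with a toy example. Everything here is an assumption for illustration: the stand-in `toy_encode` function replaces a learned encoder, the latent vector has one slot per 10 degrees of azimuth, and `roll_latent` is a hypothetical helper. None of this reflects the paper's actual network.

```python
import numpy as np

N_BINS = 36      # hypothetical: one latent slot per 10 degrees of azimuth
BIN_DEG = 360.0 / N_BINS

def toy_encode(azimuth_deg):
    """Stand-in encoder: slot j holds the object's response at azimuth + j * BIN_DEG."""
    angles = np.deg2rad(azimuth_deg + BIN_DEG * np.arange(N_BINS))
    return np.cos(angles) + 0.5 * np.cos(3 * angles)

def roll_latent(z, delta_deg):
    """Circularly shift the latent vector to emulate rotating the view by delta_deg."""
    return np.roll(z, -int(round(delta_deg / BIN_DEG)))

z_front = toy_encode(0.0)
z_inferred = roll_latent(z_front, 40.0)  # latent for a 40-degree view, inferred by rolling
z_actual = toy_encode(40.0)              # latent obtained by encoding the rotated view
```

In this toy setting the rolled latent matches the encoded rotated view, so latents for unseen azimuths can be generated from a single view. That is the kind of augmentation from limited viewing directions the abstract describes.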