CVApr 24, 2023Code
A Forward and Backward Compatible Framework for Few-shot Class-incremental Pill RecognitionJinghua Zhang, Li Liu, Kai Gao et al.
Automatic Pill Recognition (APR) systems are crucial for enhancing hospital efficiency, assisting visually impaired individuals, and preventing cross-infection. However, most existing deep learning-based pill recognition systems can only perform classification on classes with sufficient training data. In practice, the high cost of data annotation and the continuous increase in new pill classes necessitate the development of a few-shot class-incremental pill recognition system. This paper introduces the first few-shot class-incremental pill recognition framework, named Discriminative and Bidirectional Compatible Few-Shot Class-Incremental Learning (DBC-FSCIL). It encompasses forward-compatible and backward-compatible learning components. In forward-compatible learning, we propose an innovative virtual class synthesis strategy and a Center-Triplet (CT) loss to enhance discriminative feature learning. These virtual classes serve as placeholders in the feature space for future class updates, providing diverse semantic knowledge for model training. For backward-compatible learning, we develop a strategy to synthesize reliable pseudo-features of old classes using uncertainty quantification, facilitating Data Replay (DR) and Knowledge Distillation (KD). This approach allows for the flexible synthesis of features and effectively reduces additional storage requirements for samples and models. Additionally, we construct a new pill image dataset for FSCIL and assess various mainstream FSCIL methods, establishing new benchmarks. Our experimental results demonstrate that our framework surpasses existing State-of-the-art (SOTA) methods. The code is available at https://github.com/zhang-jinghua/DBC-FSCIL.
CVApr 4, 2022
An application of Pixel Interval Down-sampling (PID) for dense tiny microorganism counting on environmental microorganism imagesJiawei Zhang, Xin Zhao, Tao Jiang et al.
This paper proposes a novel pixel interval down-sampling network (PID-Net) for dense tiny object (yeast cells) counting tasks with higher accuracy. The PID-Net is an end-to-end convolutional neural network (CNN) model with an encoder--decoder architecture. The pixel interval down-sampling operations are concatenated with max-pooling operations to combine the sparse and dense features. This addresses the limitation of contour conglutination of dense objects while counting. The evaluation was conducted using classical segmentation metrics (the Dice, Jaccard and Hausdorff distance) as well as counting metrics. The experimental results show that the proposed PID-Net had the best performance and potential for dense tiny object counting tasks, which achieved 96.97\% counting accuracy on the dataset with 2448 yeast cell images. By comparing with the state-of-the-art approaches, such as Attention U-Net, Swin U-Net and Trans U-Net, the proposed PID-Net can segment dense tiny objects with clearer boundaries and fewer incorrect debris, which shows the great potential of PID-Net in the task of accurate counting.
LGAug 13, 2023
Few-shot Class-incremental Learning for Classification and Object Detection: A SurveyJinghua Zhang, Li Liu, Olli Silvén et al.
Few-shot Class-Incremental Learning (FSCIL) presents a unique challenge in Machine Learning (ML), as it necessitates the Incremental Learning (IL) of new classes from sparsely labeled training samples without forgetting previous knowledge. While this field has seen recent progress, it remains an active exploration area. This paper aims to provide a comprehensive and systematic review of FSCIL. In our in-depth examination, we delve into various facets of FSCIL, encompassing the problem definition, the discussion of the primary challenges of unreliable empirical risk minimization and the stability-plasticity dilemma, general schemes, and relevant problems of IL and Few-shot Learning (FSL). Besides, we offer an overview of benchmark datasets and evaluation metrics. Furthermore, we introduce the Few-shot Class-incremental Classification (FSCIC) methods from data-based, structure-based, and optimization-based approaches and the Few-shot Class-incremental Object Detection (FSCIOD) methods from anchor-free and anchor-based approaches. Beyond these, we present several promising research directions within FSCIL that merit further investigation.
CVAug 29, 2022
Artificial Neural Networks for Finger Vein Recognition: A SurveyYimin Yin, Renye Zhang, Pengfei Liu et al.
Finger vein recognition is an emerging biometric recognition technology. Different from the other biometric features on the body surface, the venous vascular tissue of the fingers is buried deep inside the skin. Due to this advantage, finger vein recognition is highly stable and private. They are almost impossible to be stolen and difficult to interfere with by external conditions. Unlike the finger vein recognition methods based on traditional machine learning, the artificial neural network technique, especially deep learning, it without relying on feature engineering and have superior performance. To summarize the development of finger vein recognition based on artificial neural networks, this paper collects 149 related papers. First, we introduce the background of finger vein recognition and the motivation of this survey. Then, the development history of artificial neural networks and the representative networks on finger vein recognition tasks are introduced. The public datasets that are widely used in finger vein recognition are then described. After that, we summarize the related finger vein recognition tasks based on classical neural networks and deep neural networks, respectively. Finally, the challenges and potential development directions in finger vein recognition are discussed. To our best knowledge, this paper is the first comprehensive survey focusing on finger vein recognition based on artificial neural networks.
CVJan 15, 2023
ACTIVE: A Deep Model for Sperm and Impurity Detection in Microscopic VideosAo Chen, Jinghua Zhang, Md Mamunur Rahaman et al.
The accurate detection of sperms and impurities is a very challenging task, facing problems such as the small size of targets, indefinite target morphologies, low contrast and resolution of the video, and similarity of sperms and impurities. So far, the detection of sperms and impurities still largely relies on the traditional image processing and detection techniques which only yield limited performance and often require manual intervention in the detection process, therefore unfavorably escalating the time cost and injecting the subjective bias into the analysis. Encouraged by the successes of deep learning methods in numerous object detection tasks, here we report a deep learning model based on Double Branch Feature Extraction Network (DBFEN) and Cross-conjugate Feature Pyramid Networks (CCFPN).DBFEN is designed to extract visual features from tiny objects with a double branch structure, and CCFPN is further introduced to fuse the features extracted by DBFEN to enhance the description of position and high-level semantic information. Our work is the pioneer of introducing deep learning approaches to the detection of sperms and impurities. Experiments show that the highest AP50 of the sperm and impurity detection is 91.13% and 59.64%, which lead its competitors by a substantial margin and establish new state-of-the-art results in this problem.
CVJul 5, 2022
Deep Learning for Finger Vein Recognition: A Brief Survey of Recent TrendRenye Zhang, Yimin Yin, Wanxia Deng et al.
Finger vein image recognition technology plays an important role in biometric recognition and has been successfully applied in many fields. Because veins are buried beneath the skin tissue, finger vein image recognition has an unparalleled advantage, which is not easily disturbed by external factors. This review summarizes 46 papers about deep learning for finger vein image recognition from 2017 to 2021. These papers are summarized according to the tasks of deep neural networks. Besides, we present the challenges and potential development directions of finger vein image recognition.
CVMar 15, 2023
Deep Learning for Iris Recognition: A ReviewYimin Yin, Siliang He, Renye Zhang et al.
Iris recognition is a secure biometric technology known for its stability and privacy. With no two irises being identical and little change throughout a person's lifetime, iris recognition is considered more reliable and less susceptible to external factors than other biometric recognition methods. Unlike traditional machine learning-based iris recognition methods, deep learning technology does not rely on feature engineering and boasts excellent performance. This paper collects 120 relevant papers to summarize the development of iris recognition based on deep learning. We first introduce the background of iris recognition and the motivation and contribution of this survey. Then, we present the common datasets widely used in iris recognition. After that, we summarize the key tasks involved in the process of iris recognition based on deep learning technology, including identification, segmentation, presentation attack detection, and localization. Finally, we discuss the challenges and potential development of iris recognition. This review provides a comprehensive sight of the research of iris recognition based on deep learning.
AIApr 29, 2025
Partitioned Memory Storage Inspired Few-Shot Class-Incremental learningRenye Zhang, Yimin Yin, Jinghua Zhang
Current mainstream deep learning techniques exhibit an over-reliance on extensive training data and a lack of adaptability to the dynamic world, marking a considerable disparity from human intelligence. To bridge this gap, Few-Shot Class-Incremental Learning (FSCIL) has emerged, focusing on continuous learning of new categories with limited samples without forgetting old knowledge. Existing FSCIL studies typically use a single model to learn knowledge across all sessions, inevitably leading to the stability-plasticity dilemma. Unlike machines, humans store varied knowledge in different cerebral cortices. Inspired by this characteristic, our paper aims to develop a method that learns independent models for each session. It can inherently prevent catastrophic forgetting. During the testing stage, our method integrates Uncertainty Quantification (UQ) for model deployment. Our method provides a fresh viewpoint for FSCIL and demonstrates the state-of-the-art performance on CIFAR-100 and mini-ImageNet datasets.
CVFeb 18, 2022
A Comprehensive Survey with Quantitative Comparison of Image Analysis Methods for Microorganism Biovolume MeasurementsJiawei Zhang, Chen Li, Md Mamunur Rahaman et al.
With the acceleration of urbanization and living standards, microorganisms play increasingly important roles in industrial production, bio-technique, and food safety testing. Microorganism biovolume measurements are one of the essential parts of microbial analysis. However, traditional manual measurement methods are time-consuming and challenging to measure the characteristics precisely. With the development of digital image processing techniques, the characteristics of the microbial population can be detected and quantified. The changing trend can be adjusted in time and provided a basis for the improvement. The applications of the microorganism biovolume measurement method have developed since the 1980s. More than 62 articles are reviewed in this study, and the articles are grouped by digital image segmentation methods with periods. This study has high research significance and application value, which can be referred to microbial researchers to have a comprehensive understanding of microorganism biovolume measurements using digital image analysis methods and potential applications.
CVAug 1, 2021
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to Popular Convolutional Neural Network and Potential Visual TransformerJinghua Zhang, Chen Li, Yimin Yin et al.
Microorganisms are widely distributed in the human daily living environment. They play an essential role in environmental pollution control, disease prevention and treatment, and food and drug production. The analysis of microorganisms is essential for making full use of different microorganisms. The conventional analysis methods are laborious and time-consuming. Therefore, the automatic image analysis based on artificial neural networks is introduced to optimize it. However, the automatic microorganism image analysis faces many challenges, such as the requirement of a robust algorithm caused by various application occasions, insignificant features and easy under-segmentation caused by the image characteristic, and various analysis tasks. Therefore, we conduct this review to comprehensively discuss the characteristics of microorganism image analysis based on artificial neural networks. In this review, the background and motivation are introduced first. Then, the development of artificial neural networks and representative networks are presented. After that, the papers related to microorganism image analysis based on classical and deep neural networks are reviewed from the perspectives of different tasks. In the end, the methodology analysis and potential direction are discussed.
CVJun 22, 2021
A Comparison for Patch-level Classification of Deep Learning Methods on Transparent Environmental Microorganism Images: from Convolutional Neural Networks to Visual TransformersHechen Yang, Chen Li, Jinghua Zhang et al.
Nowadays, analysis of Transparent Environmental Microorganism Images (T-EM images) in the field of computer vision has gradually become a new and interesting spot. This paper compares different deep learning classification performance for the problem that T-EM images are challenging to analyze. We crop the T-EM images into 8 * 8 and 224 * 224 pixel patches in the same proportion and then divide the two different pixel patches into foreground and background according to ground truth. We also use four convolutional neural networks and a novel ViT network model to compare the foreground and background classification experiments. We conclude that ViT performs the worst in classifying 8 * 8 pixel patches, but it outperforms most convolutional neural networks in classifying 224 * 224 pixel patches.
CVFeb 24, 2021
A New Pairwise Deep Learning Feature For Environmental Microorganism Image AnalysisFrank Kulwa, Chen Li, Jinghua Zhang et al.
Environmental microorganism (EM) offers a high-efficient, harmless, and low-cost solution to environmental pollution. They are used in sanitation, monitoring, and decomposition of environmental pollutants. However, this depends on the proper identification of suitable microorganisms. In order to fasten, low the cost, increase consistency and accuracy of identification, we propose the novel pairwise deep learning features to analyze microorganisms. The pairwise deep learning features technique combines the capability of handcrafted and deep learning features. In this technique we, leverage the Shi and Tomasi interest points by extracting deep learning features from patches which are centered at interest points locations. Then, to increase the number of potential features that have intermediate spatial characteristics between nearby interest points, we use Delaunay triangulation theorem and straight-line geometric theorem to pair the nearby deep learning features. The potential of pairwise features is justified on the classification of EMs using SVMs, k-NN, and Random Forest classifier. The pairwise features obtain outstanding results of 99.17%, 91.34%, 91.32%, 91.48%, and 99.56%, which are the increase of about 5.95%, 62.40%, 62.37%, 61.84%, and 3.23% in accuracy, F1-score, recall, precision, and specificity respectively, compared to non-paired deep learning features.
CVFeb 20, 2021
EMDS-5: Environmental Microorganism Image Dataset Fifth Version for Multiple Image Analysis TasksZihan Li, Chen Li, Yudong Yao et al.
Environmental Microorganism Data Set Fifth Version (EMDS-5) is a microscopic image dataset including original Environmental Microorganism (EM) images and two sets of Ground Truth (GT) images. The GT image sets include a single-object GT image set and a multi-object GT image set. The EMDS-5 dataset has 21 types of EMs, each of which contains 20 original EM images, 20 single-object GT images and 20 multi-object GT images. EMDS-5 can realize to evaluate image preprocessing, image segmentation, feature extraction, image classification and image retrieval functions. In order to prove the effectiveness of EMDS-5, for each function, we select the most representative algorithms and price indicators for testing and evaluation. The image preprocessing functions contain two parts: image denoising and image edge detection. Image denoising uses nine kinds of filters to denoise 13 kinds of noises, respectively. In the aspect of edge detection, six edge detection operators are used to detect the edges of the images, and two evaluation indicators, peak-signal to noise ratio and mean structural similarity, are used for evaluation. Image segmentation includes single-object image segmentation and multi-object image segmentation. Six methods are used for single-object image segmentation, while k-means and U-net are used for multi-object segmentation.We extract nine features from the images in EMDS-5 and use the Support Vector Machine classifier for testing. In terms of image classification, we select the VGG16 feature to test different classifiers. We test two types of retrieval approaches: texture feature retrieval and deep learning feature retrieval. We select the last layer of features of these two deep learning networks as feature vectors. We use mean average precision as the evaluation index for retrieval.
CVMar 8, 2020
A Multi-scale CNN-CRF Framework for Environmental Microorganism Image SegmentationJinghua Zhang, Chen Li, Frank Kulwa et al.
To assist researchers to identify Environmental Microorganisms (EMs) effectively, a Multiscale CNN-CRF (MSCC) framework for the EM image segmentation is proposed in this paper. There are two parts in this framework: The first is a novel pixel-level segmentation approach, using a newly introduced Convolutional Neural Network (CNN), namely, "mU-Net-B3", with a dense Conditional Random Field (CRF) postprocessing. The second is a VGG-16 based patch-level segmentation method with a novel "buffer" strategy, which further improves the segmentation quality of the details of the EMs. In the experiment, compared with the state-of-the-art methods on 420 EM images, the proposed MSCC method reduces the memory requirement from 355 MB to 103 MB, improves the overall evaluation indexes (Dice, Jaccard, Recall, Accuracy) from 85.24%, 77.42%, 82.27%, and 96.76% to 87.13%, 79.74%, 87.12%, and 96.91%, respectively, and reduces the volume overlap error from 22.58% to 20.26%. Therefore, the MSCC method shows great potential in the EM segmentation field.
CVMar 3, 2020
Gastric histopathology image segmentation using a hierarchical conditional random fieldChanghao Sun, Chen Li, Jinghua Zhang et al.
For the Convolutional Neural Networks (CNNs) applied in the intelligent diagnosis of gastric cancer, existing methods mostly focus on individual characteristics or network frameworks without a policy to depict the integral information. Mainly, Conditional Random Field (CRF), an efficient and stable algorithm for analyzing images containing complicated contents, can characterize spatial relation in images. In this paper, a novel Hierarchical Conditional Random Field (HCRF) based Gastric Histopathology Image Segmentation (GHIS) method is proposed, which can automatically localize abnormal (cancer) regions in gastric histopathology images obtained by an optical microscope to assist histopathologists in medical work. This HCRF model is built up with higher order potentials, including pixel-level and patch-level potentials, and graph-based post-processing is applied to further improve its segmentation performance. Especially, a CNN is trained to build up the pixel-level potentials and another three CNNs are fine-tuned to build up the patch-level potentials for sufficient spatial segmentation information. In the experiment, a hematoxylin and eosin (H&E) stained gastric histopathological dataset with 560 abnormal images are divided into training, validation and test sets with a ratio of 1 : 1 : 2. Finally, segmentation accuracy, recall and specificity of 78.91%, 65.59%, and 81.33% are achieved on the test set. Our HCRF model demonstrates high segmentation performance and shows its effectiveness and future potential in the GHIS field.