CV Nov 21, 2022
RIC-CNN: Rotation-Invariant Coordinate Convolutional Neural Network
Hanlin Mo, Guoying Zhao
In recent years, convolutional neural networks have shown good performance in many image processing and computer vision tasks. However, a standard CNN model is not invariant to image rotations. In fact, even slight rotation of an input image will seriously degrade its performance. This shortcoming precludes the use of CNN in some practical scenarios. Thus, in this paper, we focus on designing a convolutional layer with good rotation invariance. Specifically, based on a simple rotation-invariant coordinate system, we propose a new convolutional operation, called Rotation-Invariant Coordinate Convolution (RIC-C). Without additional trainable parameters and data augmentation, RIC-C is naturally invariant to arbitrary rotations around the input center. Furthermore, we find the connection between RIC-C and deformable convolution, and propose a simple but efficient approach to implement RIC-C using PyTorch. By replacing all standard convolutional layers in a CNN with the corresponding RIC-C, a RIC-CNN can be derived. Using the MNIST dataset, we first evaluate the rotation invariance of RIC-CNN and compare its performance with most existing rotation-invariant CNN models. It can be observed that RIC-CNN achieves state-of-the-art classification results on the rotated test dataset of MNIST. Then, we deploy RIC-C to VGG, ResNet and DenseNet, and conduct the classification experiments on two real image datasets. Also, a shallow CNN and the corresponding RIC-CNN are trained to extract image patch descriptors, and we compare their performance in patch verification. These experimental results again show that RIC-C can be easily used as a drop-in replacement for standard convolutions, and that it greatly enhances the rotation invariance of CNN models designed for different applications.
CV Mar 25, 2023
Image Moment Invariants to Rotational Motion Blur
Hanlin Mo, Hongxiang Hao, Guoying Zhao
Rotational motion blur caused by the circular motion of the camera or/and object is common in life. Identifying objects from images affected by rotational motion blur is challenging because this image degradation severely impacts image quality. Therefore, it is meaningful to develop image invariant features under rotational motion blur and then use them in practical tasks, such as object classification and template matching. This paper proposes a novel method to generate image moment invariants under general rotational motion blur and provides some instances. Further, we achieve their invariance to similarity transform. To the best of our knowledge, this is the first time that moment invariants for rotational motion blur have been proposed in the literature. We conduct extensive experiments on various image datasets disturbed by similarity transform and rotational motion blur to test these invariants' numerical stability and robustness to image noise. We also demonstrate their performance in image classification and handwritten digit recognition. Current state-of-the-art blur moment invariants and deep neural networks are chosen for comparison. Our results show that the moment invariants proposed in this paper significantly outperform other features in various tasks.
SP May 12
Modulation Consistency-based Contrastive Learning for Self-Supervised Automatic Modulation Classification
Chenxu Wang, Shuang Wang, Lirong Han et al.
Deep learning-based AMC methods have achieved remarkable performance, but their practical deployment remains constrained by the high cost of labeled data. Although self-supervised learning (SSL) reduces the reliance on labels, existing SSL-based AMC methods often rely on task-agnostic pretext objectives misaligned with modulation classification, leading to representations entangled with nuisance factors such as symbol, channel, and noise. In this paper, we identify intra-instance modulation consistency as a task-aware structural prior, whereby different temporal segments of the same signal may differ in waveform while preserving the same modulation type, thus providing a principled cue for task-aligned self-supervision. Based on this prior, we propose Mod-CL, a Modulation consistency-based Contrastive Learning framework that constructs positive pairs from different temporal segments of the same signal instance, to encourage the model to learn shared modulation information while suppressing nuisance variations. We further develop a contrastive objective tailored to Mod-CL, which jointly exploits temporal segmentation and data augmentation to pull together views sharing the same modulation semantics while avoiding supervisory conflicts within each signal instance. Extensive experiments on RadioML datasets show that Mod-CL consistently outperforms strong baselines, especially in low-label regimes, achieving substantial improvements in linear probing accuracy.
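The positive-pair construction described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the Mod-CL objective: the helper names `two_segments` and `info_nce`, the segment length, the temperature, and the flattening "encoder" are all assumptions, and the loss is a generic InfoNCE rather than the tailored objective the abstract mentions.

```python
import numpy as np

def two_segments(signal, seg_len, rng):
    """Cut two random temporal segments from one signal (shape: channels x time)."""
    starts = rng.integers(0, signal.shape[-1] - seg_len + 1, size=2)
    return tuple(signal[..., s:s + seg_len] for s in starts)

def info_nce(z_a, z_b, temperature=0.1):
    """Generic InfoNCE: row i of z_a is the positive for row i of z_b."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
batch = rng.standard_normal((4, 2, 128))  # 4 toy signals, I/Q channels, 128 samples
pairs = [two_segments(sig, 64, rng) for sig in batch]

# Stand-in encoder (assumption): flatten each segment. A real model would be a network.
z_a = np.stack([a.ravel() for a, _ in pairs])
z_b = np.stack([b.ravel() for _, b in pairs])
loss = info_nce(z_a, z_b)
```

Both segments in a pair come from the same signal instance, so they share a modulation type even though their waveforms differ; segments from other signals in the batch act as negatives in this generic formulation.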
CV Apr 17, 2024
Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured
Hanlin Mo, Guoying Zhao
Achieving rotation invariance in deep neural networks without relying on data has always been a hot research topic. Intrinsic rotation invariance can enhance the model's feature representation capability, enabling better performance in tasks such as multi-orientation object recognition and detection. Based on various types of non-learnable operators, including gradient, sort, local binary pattern, maximum, etc., this paper designs a set of new convolution operations that are naturally invariant to arbitrary rotations. Unlike most previous studies, these rotation-invariant convolutions (RIConvs) have the same number of learnable parameters and a similar computational process as conventional convolution operations, allowing them to be interchangeable. Using the MNIST-Rot dataset, we first verify the invariance of these RIConvs under various rotation angles and compare their performance with previous rotation-invariant convolutional neural networks (RI-CNNs). Two types of RIConvs based on gradient operators achieve state-of-the-art results. Subsequently, we combine RIConvs with different types and depths of classic CNN backbones. Using the OuTex_00012, MTARSI, and NWPU-RESISC-45 datasets, we test their performance on texture recognition, aircraft type recognition, and remote sensing image classification tasks. The results show that RIConvs significantly improve the accuracy of these CNN backbones, especially when the training data is limited. Furthermore, we find that even with data augmentation, RIConvs can further enhance model performance.
CV May 23, 2023
Sorted Convolutional Network for Achieving Continuous Rotational Invariance
Hanlin Mo, Guoying Zhao
The topic of achieving rotational invariance in convolutional neural networks (CNNs) has gained considerable attention recently, as this invariance is crucial for many computer vision tasks such as image classification and matching. In this letter, we propose a Sorting Convolution (SC) inspired by some hand-crafted features of texture images, which achieves continuous rotational invariance without requiring additional learnable parameters or data augmentation. Further, SC can directly replace the conventional convolution operations in a classic CNN model to achieve its rotational invariance. Based on MNIST-rot dataset, we first analyze the impact of convolutional kernel sizes, different sampling and sorting strategies on SC's rotational invariance, and compare our method with previous rotation-invariant CNN models. Then, we combine SC with VGG, ResNet and DenseNet, and conduct classification experiments on popular texture and remote sensing image datasets. Our results demonstrate that SC achieves the best performance in the aforementioned tasks.
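The sorting idea can be illustrated on a single 3x3 patch. The sketch below is not the authors' implementation: the function name `sorted_response`, the fixed 3x3 ring, and the separate center weight are assumptions, and it only demonstrates invariance to 90-degree rotations, whereas the paper's continuous invariance depends on sampling strategies not reproduced here.

```python
import numpy as np

def sorted_response(patch, ring_weights, center_weight):
    """Response for one 3x3 patch: sort the 8 ring values, then take a weighted sum."""
    patch = np.asarray(patch, dtype=float)
    ring = np.delete(patch.ravel(), 4)  # the 8 neighbors around the center pixel
    ring_sorted = np.sort(ring)         # ordering no longer depends on orientation
    return float(ring_sorted @ ring_weights + center_weight * patch[1, 1])

rng = np.random.default_rng(0)
patch = rng.random((3, 3))
ring_weights = rng.standard_normal(8)
center_weight = 0.5

r0 = sorted_response(patch, ring_weights, center_weight)
r90 = sorted_response(np.rot90(patch), ring_weights, center_weight)
```

A 90-degree rotation only permutes the ring values and leaves the center in place, so the sorted vector, and therefore the response, is unchanged. The weights still number nine, which matches the abstract's point that no additional learnable parameters are needed.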
CV Jan 3, 2022
Gaussian-Hermite Moment Invariants of General Multi-Channel Functions
Hanlin Mo, Hua Li, Guoying Zhao
With the development of data acquisition technology, large amounts of multi-channel data are collected and widely used in many fields. Most of them, such as RGB images and vector fields, can be expressed as different types of multi-channel functions. Feature extraction of multi-channel data for identifying interest patterns is a critical but challenging task. This paper focuses on constructing moment-based features of general multi-channel functions. Specifically, we define two transform models, rotation-affine transform and total rotation transform, to describe real deformations of multi-channel data. Then, we design a structural framework to generate Gaussian-Hermite moment invariants for these two transform models systematically. It is the first time that a unified framework has been proposed in the literature to construct orthogonal moment invariants of general multi-channel functions. Given a specific type of multi-channel data, we demonstrate how to utilize the new method to derive all possible invariants and eliminate dependences among them. We obtain independent sets of invariants with low orders and low degrees for RGB images, 2D vector fields and color volume data. Based on synthetic and real multi-channel data, we conduct extensive experiments to evaluate the stability and discriminability of these invariants and their robustness to noise. The results show that new moment invariants significantly outperform previous moment invariants of multi-channel data in RGB image classification and vortex detection in 2D vector fields.
CV Apr 21, 2020
Spatio-Temporal Dual Affine Differential Invariant for Skeleton-based Action Recognition
Qi Li, Hanlin Mo, Jinghan Zhao et al.
The dynamics of human skeletons carry significant information for the task of action recognition. The similarity between trajectories of corresponding joints is an indicating feature of the same action, while this similarity may be subject to some distortions that can be modeled as the combination of spatial and temporal affine transformations. In this work, we propose a novel feature called spatio-temporal dual affine differential invariant (STDADI). Furthermore, in order to improve the generalization ability of neural networks, a channel augmentation method is proposed. On the large scale action recognition dataset NTU-RGB+D, and its extended version NTU-RGB+D 120, it achieves remarkable improvements over previous state-of-the-art methods.
CV Nov 19, 2019
Dual affine moment invariants
You Hao, Hanlin Mo, Qi Li et al.
Affine transformation is one of the most common transformations in nature, which is an important issue in the field of computer vision and shape analysis. And affine transformations often occur in both shape and color space simultaneously, which can be termed as Dual-Affine Transformation (DAT). In general, we should derive invariants of different data formats separately, such as 2D color images, 3D color objects, or even higher-dimensional data. To the best of our knowledge, there is no general framework to derive invariants for all of these data formats. In this paper, we propose a general framework to derive moment invariants under DAT for objects in M-dimensional space with N channels, which can be called dual-affine moment invariants (DAMI). Following this framework, we present the generating formula of DAMI under DAT for 3D color objects. Then, we instantiated a complete set of DAMI for 3D color objects with orders and degrees no greater than 4. Finally, we analyze the characteristic of these DAMI and conduct classification experiments to evaluate the stability and discriminability of them. The results prove that DAMI is robust for DAT. Our derivation framework can be applied to data in any dimension with any number of channels.
CV Nov 13, 2019
Rotation Differential Invariants of Images Generated by Two Fundamental Differential Operators
Hanlin Mo, Hua Li
In this paper, we design two fundamental differential operators for the derivation of rotation differential invariants of images. Each differential invariant obtained by using the new method can be expressed as a homogeneous polynomial of image partial derivatives, which preserves its value when the image is rotated by arbitrary angles. We produce all possible instances of homogeneous invariants up to the given order and degree, and discuss their independence in detail. As far as we know, no previous papers have published so many explicit forms of high-order rotation differential invariants of images. In the experimental part, texture classification and image patch verification are carried out on popular real databases. These rotation differential invariants are used as image feature vectors. We mainly evaluate the effects of various factors on their performance. The experimental results also validate that they have better performance than some commonly used image features in some cases.
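Two classical examples of homogeneous polynomials in image partial derivatives that are rotation invariant are the squared gradient magnitude, f_x^2 + f_y^2, and the Laplacian, f_xx + f_yy. The sketch below checks both numerically under a 90-degree rotation using finite differences. It illustrates the kind of quantity the abstract describes and does not reproduce the paper's two operators; the function names are assumptions.

```python
import numpy as np

def gradient_energy(img):
    """f_x^2 + f_y^2 from finite differences (first-order invariant)."""
    gy, gx = np.gradient(img.astype(float))
    return gx**2 + gy**2

def laplacian(img):
    """f_xx + f_yy from the 5-point stencil, interior pixels only."""
    f = img.astype(float)
    return (f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
            - 4.0 * f[1:-1, 1:-1])

rng = np.random.default_rng(0)
img = rng.random((16, 16))
rot = np.rot90(img)

# Invariance means: computing on the rotated image equals rotating the computed map.
energy_matches = np.allclose(gradient_energy(rot), np.rot90(gradient_energy(img)))
laplacian_matches = np.allclose(laplacian(rot), np.rot90(laplacian(img)))
```

Only 90-degree rotations are tested because they map the pixel grid onto itself; for arbitrary angles the invariance holds for the continuous image, while discrete results also depend on interpolation and on how derivatives are estimated.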
GR Aug 30, 2018
Differential and integral invariants under Mobius transformation
He Zhang, Hanlin Mo, You Hao et al.
One of the most challenging problems in the domain of 2-D image or 3-D shape is to handle the non-rigid deformation. From the perspective of transformation groups, the conformal transformation is a key part of the diffeomorphism. According to the Liouville Theorem, an important part of the conformal transformation is the Mobius transformation, so we focus on Mobius transformation and propose two differential expressions that are invariable under 2-D and 3-D Mobius transformation respectively. Next, we analyze the absoluteness and relativity of invariance on them and their components. After that, we propose integral invariants under Mobius transformation based on the two differential expressions. Finally, we propose a conjecture about the structure of differential invariants under conformal transformation according to our observation on the composition of the above two differential invariants.
SE Nov 21, 2017
Grouping Environmental Factors Influencing Individual Decision-Making Behavior in Software Projects: A Cluster Analysis
Jingdong Jia, Hanlin Mo, Luiz Fernando Capretz et al.
An individual decision-making behavior is heavily influenced by and adapted to external environmental factors. Given that software development is a human-centered activity, individual decision-making behavior may affect the software project quality. Although environmental factors affecting decision-making behavior in software projects have been identified in prior literature, there is not yet an objective and a full taxonomy of these factors. Thus, it is not trivial to manage these complex and diverse factors. To address this deficiency, we first design a semantic similarity algorithm between words by utilizing the synonymy and hypernymy relationships in WordNet. Further, we propose a method to measure semantic similarity between phrases and apply it into k-means clustering algorithm to group these factors. Subsequently, we obtain a taxonomy of the environmental factors affecting individual decision-making behavior in software projects, which includes eleven broad categories, each containing two to five sub-categories. The taxonomy presented herein is obtained by an objective method, and quite comprehensive, with appropriate references provided. The taxonomy holds significant value for researchers and practitioners; it can help them to better understand the major aspects of environmental factors, also to predict and guide the behavior of individuals during decision making towards a successful completion of software projects.
COMP-PH Oct 20, 2017
Fast and Efficient Calculations of Structural Invariants of Chirality
He Zhang, Hanlin Mo, You Hao et al.
Chirality plays an important role in physics, chemistry, biology, and other fields. It describes an essential symmetry in structure. However, chirality invariants are usually complicated in expression or difficult to evaluate. In this paper, we present five general three-dimensional chirality invariants based on the generating functions. These five chiral invariants have four characteristics: (1) They play an important role in the detection of symmetry, especially in the treatment of the 'false zero' problem. (2) Three of the five chiral invariants decode a universal chirality index. (3) Three of them are proposed for the first time. (4) The five chiral invariants have low orders no greater than 4, brief expressions, low time complexity O(n), and can act as descriptors of three-dimensional objects in shape analysis. The five chiral invariants give a geometric view to study the chiral invariants. The experiments show that the five chirality invariants are effective and efficient; they can be used as a tool for symmetry detection or as features in shape analysis.
CV Jul 19, 2017
Image Projective Invariants
Erbo Li, Hanlin Mo, Dong Xu et al.
In this paper, we propose relative projective differential invariants (RPDIs) which are invariant to general projective transformations. By using RPDIs and the structural frame of integral invariants, projective weighted moment invariants (PIs) can be constructed very easily. It is first proved that a kind of projective invariants exists in terms of weighted integration of images, with relative differential invariants as the weight functions. Then, some simple instances of PIs are given. In order to ensure the stability and discriminability of PIs, we discuss how to calculate partial derivatives of discrete images more accurately. Since the number of pixels in discrete images before and after the geometric transformation may be different, we design a method to normalize the number of pixels. These measures enhance the performance of PIs. Finally, we carry out some experiments based on synthetic and real image datasets. We choose commonly used moment invariants for comparison. The results indicate that PIs have better performance than other moment invariants in image retrieval and classification. With PIs, one can compare the similarity between images under the projective transformation without knowing the parameters of the transformation, which provides a good tool for shape analysis in image processing, computer vision and pattern recognition.
CV Jun 14, 2017
Shape-Color Differential Moment Invariants under Affine Transformations
Hanlin Mo, Shirui Li, You Hao et al.
We propose the general construction formula of shape-color primitives by using partial differentials of each color channel in this paper. By using all kinds of shape-color primitives, shape-color differential moment invariants can be constructed very easily, which are invariant to the shape affine and color affine transforms. 50 instances of SCDMIs are obtained finally. In experiments, several commonly used color descriptors and SCDMIs are used in image classification and retrieval of color images, respectively. By comparing the experimental results, we find that SCDMIs get better results.
CV Jun 5, 2017
A Kind of Affine Weighted Moment Invariants
Hanlin Mo, You Hao, Shirui Li et al.
A new kind of geometric invariants is proposed in this paper, which is called affine weighted moment invariant (AWMI). By combination of local affine differential invariants and a framework of global integral, they can more effectively extract features of images and help to increase the number of low-order invariants and to decrease the calculating cost. The experimental results show that AWMIs have good stability and distinguishability and achieve better results in image retrieval than traditional moment invariants. An extension to 3D is straightforward.
CV May 31, 2017
Naturally Combined Shape-Color Moment Invariants under Affine Transformations
Ming Gong, You Hao, Hanlin Mo et al.
We propose a kind of naturally combined shape-color affine moment invariants (SCAMI), which consider both shape and color affine transformations simultaneously in one single system. In real scenes, color and shape deformations always exist in images simultaneously. Simple shape invariants or color invariants cannot handle this situation. The conventional method is just to make a simple linear combination of the two factors. Meanwhile, the manual selection of weights is a complex issue. Our construction method is based on the multiple integration framework. The integral kernel is assigned as the continued product of the shape and color invariant cores. This is the first time that an invariant to dual affine transformations of shape and color has been derived directly. The manual selection of weights is no longer necessary, and both the shape and color transformations are extended to the affine transformation group. With various invariant cores, a set of lower-order invariants is constructed, and their completeness and independence are discussed in detail. A set of SCAMIs, called SCAMI24, is recommended, and its effectiveness and robustness have been evaluated on both synthetic and real datasets.
CV May 19, 2017
Affine-Gradient Based Local Binary Pattern Descriptor for Texture Classiffication
You Hao, Shirui Li, Hanlin Mo et al.
We present a novel Affine-Gradient based Local Binary Pattern (AGLBP) descriptor for texture classification. It is very hard to describe complicated texture using a single type of information, such as Local Binary Pattern (LBP), which just utilizes the sign information of the difference between the pixel and its local neighbors. Our descriptor has three characteristics: 1) In order to make full use of the information contained in the texture, the Affine-Gradient, which is different from the Euclidean-Gradient and invariant to affine transformation, is incorporated into AGLBP. 2) An improved method is proposed for rotation invariance, which depends on the reference direction calculated with respect to local neighbors. 3) A feature selection method, considering both the statistical frequency and the intraclass variance of the training dataset, is also applied to reduce the dimensionality of descriptors. Experiments on three standard texture datasets, Outex12, Outex10 and KTH-TIPS2, are conducted to evaluate the performance of AGLBP. The results show that our proposed descriptor gets better performance compared to some state-of-the-art rotation texture descriptors in texture classification.
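For reference, the baseline LBP that the abstract contrasts with can be sketched as follows: each of the eight neighbors contributes one sign bit, and a classical way to obtain rotation invariance is to take the minimum code over all circular shifts of the bit pattern. This is the standard LBP construction, not AGLBP's affine-gradient or reference-direction method; the names `RING`, `lbp_bits`, and `rotation_invariant_lbp` are assumptions.

```python
import numpy as np

# Clockwise ring of the 8 neighbors of the center pixel in a 3x3 patch.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def lbp_bits(patch):
    """Sign bits: 1 where the neighbor is >= the center pixel, else 0."""
    center = patch[1, 1]
    return [int(patch[r, c] >= center) for r, c in RING]

def rotation_invariant_lbp(patch):
    """Smallest integer code over all circular shifts of the bit pattern."""
    bits = lbp_bits(patch)
    codes = []
    for shift in range(8):
        rotated = bits[shift:] + bits[:shift]
        codes.append(sum(b << i for i, b in enumerate(rotated)))
    return min(codes)

patch = np.array([[9, 1, 2],
                  [8, 5, 3],
                  [7, 6, 4]])
code = rotation_invariant_lbp(patch)
code_rot = rotation_invariant_lbp(np.rot90(patch))
```

Rotating the patch by 90 degrees circularly shifts the ring by two positions, so the minimum over shifts is unchanged. Only the signs of the differences are used, which is the single type of information the abstract refers to.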