Xin Luo

h-index: 78

27 papers

110 citations

Novelty: 52%

AI Score: 40

Ranked #73,496 of 194,257 authors (top 38%) | #16,433 in LG (top 41%)

27 Papers

2.6 | CV | Mar 24, 2022

ViT-FOD: A Vision Transformer based Fine-grained Object Discriminator

Zi-Chao Zhang, Zhen-Duo Chen, Yongxin Wang et al.

Recently, several Vision Transformer (ViT) based methods have been proposed for Fine-Grained Visual Classification (FGVC). These methods significantly surpass existing CNN-based ones, demonstrating the effectiveness of ViT in FGVC tasks. However, there are some limitations when applying ViT directly to FGVC. First, ViT needs to split images into patches and calculate the attention of every pair, which may result in heavy redundant calculation and unsatisfying performance when handling fine-grained images with complex backgrounds and small objects. Second, a standard ViT only utilizes the class token in the final layer for classification, which is not enough to extract comprehensive fine-grained information. To address these issues, we propose a novel ViT based fine-grained object discriminator for FGVC tasks, ViT-FOD for short. Specifically, besides a ViT backbone, it further introduces three novel components, i.e., Attention Patch Combination (APC), Critical Regions Filter (CRF), and Complementary Tokens Integration (CTI). Among them, APC pieces together informative patches from two images to generate a new image so that the redundant calculation can be reduced. CRF emphasizes tokens corresponding to discriminative regions to generate a new class token for subtle feature learning. To extract comprehensive information, CTI integrates complementary information captured by class tokens in different ViT layers. We conduct comprehensive experiments on widely used datasets and the results demonstrate that ViT-FOD is able to achieve state-of-the-art performance.

1.2 | SI | Feb 23, 2023

A Constraints Fusion-induced Symmetric Nonnegative Matrix Factorization Approach for Community Detection

Zhigang Liu, Xin Luo

Community is a fundamental and critical characteristic of an undirected social network, making community detection a vital yet thorny issue in network representation learning. A symmetric and non-negative matrix factorization (SNMF) model is frequently adopted to address this issue owing to its great interpretability and scalability. However, it adopts a single latent factor matrix to represent an undirected network for precisely representing its symmetry, which leads to a loss of representation learning ability due to the reduced latent space. Motivated by this discovery, this paper proposes a novel Constraints Fusion-induced Symmetric Nonnegative Matrix Factorization (CFS) model that adopts three-fold ideas: a) Representing a target undirected network with multiple latent factor matrices, thus preserving its representation learning capacity; b) Incorporating a symmetry-regularizer that preserves the symmetry of the learnt low-rank approximation to the adjacency matrix into the loss function, thus making the resultant detector well-aware of the target network's symmetry; and c) Introducing a graph-regularizer that preserves local invariance of the network's intrinsic geometry, thus making the achieved detector well-aware of community structure within the target network. Extensive empirical studies on eight real-world social networks from industrial applications demonstrate that the proposed CFS model significantly outperforms state-of-the-art models in achieving highly accurate community detection results.

3.3 | LG | Apr 16, 2022

A Multi-Metric Latent Factor Model for Analyzing High-Dimensional and Sparse data

Di Wu, Peng Zhang, Yi He et al.

High-dimensional and sparse (HiDS) matrices are omnipresent in a variety of big data-related applications. Latent factor analysis (LFA) is a typical representation learning method that extracts useful yet latent knowledge from HiDS matrices via low-rank approximation. Current LFA-based models mainly focus on a single-metric representation, where the representation strategy designed for the approximation loss function is fixed and exclusive. However, real-world HiDS matrices are commonly heterogeneous and inclusive and have diverse underlying patterns, such that a single-metric representation is most likely to yield inferior performance. Motivated by this, in this paper we propose a multi-metric latent factor (MMLF) model. Its main idea is two-fold: 1) two vector spaces and three Lp-norms are simultaneously employed to develop six variants of the LFA model, each of which resides in a unique metric representation space, and 2) all the variants are ensembled with a tailored, self-adaptive weighting strategy. As such, our proposed MMLF enjoys the merits originating from a set of disparate metric spaces all at once, achieving a comprehensive and unbiased representation of HiDS matrices. Theoretical study guarantees that MMLF attains a performance gain. Extensive experiments on eight real-world HiDS datasets, spanning a wide range of industrial and science domains, verify that our MMLF significantly outperforms ten state-of-the-art, shallow and deep counterparts.
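
As a rough illustration of what a metric-specific loss over observed entries might look like, the sketch below evaluates an Lp-norm loss for different values of p. The function name `lp_loss`, the dictionary layout, and the chosen p values are assumptions made for this example; the abstract does not specify the three norms or the weighting strategy, so neither is reproduced here.

```python
def lp_loss(observed, predict, p):
    """Sum of |r - r_hat|^p over observed entries only.

    observed: dict mapping (row, col) -> known value (hypothetical layout).
    predict:  callable (row, col) -> estimated value.
    p:        exponent of the Lp-norm-style loss (illustrative choice).
    """
    return sum(abs(r - predict(i, j)) ** p for (i, j), r in observed.items())


# Toy sparse matrix with two known entries and a constant predictor.
toy_observed = {(0, 0): 3.0, (1, 2): 1.0}
constant_predictor = lambda i, j: 1.0

l1 = lp_loss(toy_observed, constant_predictor, p=1)  # errors 2 and 0
l2 = lp_loss(toy_observed, constant_predictor, p=2)  # squares the errors
```

Larger p penalizes large residuals more heavily, which is one reason variants built on different norms can behave differently on heterogeneous data.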

3.3 | LG | Oct 27, 2022

Prototype-Based Layered Federated Cross-Modal Hashing

Jiale Liu, Yu-Wei Zhan, Xin Luo et al.

Recently, deep cross-modal hashing has gained increasing attention. However, in many practical cases, data are distributed and cannot be collected due to privacy concerns, which greatly reduces the cross-modal hashing performance on each client. Moreover, due to the problems of statistical heterogeneity, model heterogeneity, and forcing each client to accept the same parameters, applying federated learning to cross-modal hash learning becomes very tricky. In this paper, we propose a novel method called prototype-based layered federated cross-modal hashing. Specifically, the prototype is introduced to learn the similarity between instances and classes on the server, reducing the impact of statistical heterogeneity (non-IID) on different clients. We also monitor the distance between local and global prototypes to further improve the performance. To realize personalized federated learning, a hypernetwork is deployed on the server to dynamically update the weights of different layers of the local model. Experimental results on benchmark datasets show that our method outperforms state-of-the-art methods.
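
To make the idea of local and global prototypes more concrete, here is a minimal sketch. It assumes a prototype is the mean feature vector of a class and that the server combines client prototypes by a count-weighted average; both are assumptions for illustration, and all function names are hypothetical rather than taken from the paper.

```python
def class_prototypes(features, labels):
    """Return ({class: mean feature vector}, {class: sample count})."""
    sums, counts = {}, {}
    for vec, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(vec)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    protos = {y: [s / counts[y] for s in sums[y]] for y in sums}
    return protos, counts


def aggregate_prototypes(client_results):
    """Count-weighted average of per-client prototypes (assumed rule)."""
    sums, totals = {}, {}
    for protos, counts in client_results:
        for y, proto in protos.items():
            n = counts[y]
            if y not in sums:
                sums[y] = [0.0] * len(proto)
                totals[y] = 0
            sums[y] = [s + n * v for s, v in zip(sums[y], proto)]
            totals[y] += n
    return {y: [s / totals[y] for s in sums[y]] for y in sums}


def prototype_distance(local, global_):
    """Squared Euclidean distance summed over classes present in both."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(local[y], global_[y]))
        for y in local if y in global_
    )


client_a = class_prototypes([[0.0, 0.0], [2.0, 2.0]], [0, 0])
client_b = class_prototypes([[4.0, 4.0]], [0])
global_protos = aggregate_prototypes([client_a, client_b])
drift_a = prototype_distance(client_a[0], global_protos)
```

A distance such as `drift_a` could serve as a regularization term that keeps a client from drifting away from the shared class representation.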

2.0 | LG | Feb 25, 2023

Online Sparse Streaming Feature Selection Using Adapted Classification

RuiYang Xu, Di Wu, Xin Luo

Traditional feature selection methods need to know the feature space before learning, and online streaming feature selection (OSFS) is proposed to process streaming features on the fly. Existing methods classify features as relevant or irrelevant under the assumption that no data are missing, and deleting irrelevant features may lead to information loss. Motivated by this, we focus on completing the streaming feature matrix and on the division of feature correlation, and propose online sparse streaming feature selection based on adapted classification (OS2FS-AC). This study uses Latent Factor Analysis (LFA) to pre-estimate missing data. Besides, we use an adaptive method to obtain the threshold, divide the features into strongly relevant, weakly relevant, and irrelevant features, and then further divide the weakly relevant features using more information. Experimental results on ten real-world data sets demonstrate that OS2FS-AC performs better than state-of-the-art algorithms.

2.3 | SI | Mar 8, 2022

High-order Order Proximity-Incorporated, Symmetry and Graph-Regularized Nonnegative Matrix Factorization for Community Detection

Zhigang Liu, Xin Luo

Community describes the functional mechanism of a network, making community detection a fundamental graph tool for various real applications like the discovery of social circles. To date, a Symmetric and Non-negative Matrix Factorization (SNMF) model has been frequently adopted to address this issue owing to its high interpretability and scalability. However, most existing SNMF-based community detection methods neglect the high-order connection patterns in a network. Motivated by this discovery, in this paper, we propose a High-Order Proximity (HOP)-incorporated, Symmetry and Graph-regularized NMF (HSGN) model that adopts the following three-fold ideas: a) adopting a weighted pointwise mutual information (PMI)-based approach to measure the HOP indices among nodes in a network; b) leveraging an iterative reconstruction scheme to encode the captured HOP into the network; and c) introducing a symmetry and graph-regularized NMF algorithm to detect communities accurately. Extensive empirical studies on eight real-world networks demonstrate that an HSGN-based community detector significantly outperforms both benchmark and state-of-the-art community detectors in providing highly accurate community detection results.

2.6 | CV | Oct 28, 2022

FedVMR: A New Federated Learning method for Video Moment Retrieval

Yan Wang, Xin Luo, Zhen-Duo Chen et al.

Despite the great success achieved, existing video moment retrieval (VMR) methods are developed under the assumption that data are stored centrally. However, in real-world applications, due to the inherent nature of data generation and privacy concerns, data are often distributed across different silos, bringing huge challenges to effective large-scale training. In this work, we try to overcome the above limitation by leveraging the recent success of federated learning. As the first exploration of this kind in the VMR field, the new task is defined as video moment retrieval with distributed data. Then, a novel federated learning method named FedVMR is proposed to facilitate large-scale and secure training of VMR models in a decentralized environment. Experiments on benchmark datasets demonstrate its effectiveness. This work is the very first attempt to enable safe and efficient VMR training in a decentralized setting, and we hope it paves the way for further study in the related research field.

2.0 | LG | Jun 6, 2023

Multi-constrained Symmetric Nonnegative Latent Factor Analysis for Accurately Representing Large-scale Undirected Weighted Networks

Yurong Zhong, Zhe Xie, Weiling Li et al.

An Undirected Weighted Network (UWN) is frequently encountered in a big-data-related application concerning the complex interactions among numerous nodes, e.g., a protein interaction network from a bioinformatics application. A Symmetric High-Dimensional and Incomplete (SHDI) matrix can smoothly illustrate such a UWN, which contains rich knowledge like node interaction behaviors and local complexes. To extract desired knowledge from an SHDI matrix, an analysis model should carefully consider its symmetric topology for describing a UWN's intrinsic symmetry. Representation learning on a UWN builds on the success of a family of symmetry-aware models such as a Symmetric Nonnegative Matrix Factorization (SNMF) model, whose objective function utilizes a sole Latent Factor (LF) matrix for representing SHDI's symmetry rigorously. However, they suffer from the following drawbacks: 1) their computational complexity is high; and 2) their modeling strategy narrows their representation features, making them suffer from low learning ability. To address the above critical issues, this paper proposes a Multi-constrained Symmetric Nonnegative Latent-factor-analysis (MSNL) model with two-fold ideas: 1) introducing multi-constraints composed of multiple LF matrices, i.e., inequality and equality ones, into a data-density-oriented objective function for precisely representing the intrinsic symmetry of an SHDI matrix with broadened feature space; and 2) implementing an Alternating Direction Method of Multipliers (ADMM)-incorporated learning scheme for precisely solving such a multi-constrained model. Empirical studies on three SHDI matrices from a real bioinformatics or industrial application demonstrate that the proposed MSNL model achieves stronger representation learning ability on an SHDI matrix than state-of-the-art models do.

2.0 | LG | Jun 6, 2023

Proximal Symmetric Non-negative Latent Factor Analysis: A Novel Approach to Highly-Accurate Representation of Undirected Weighted Networks

Yurong Zhong, Zhe Xie, Weiling Li et al.

An Undirected Weighted Network (UWN) is commonly found in big data-related applications. The information associated with such a network's nodes and edges can be expressed as a Symmetric, High-Dimensional and Incomplete (SHDI) matrix. However, existing models fail to model either its intrinsic symmetry or its low data density, resulting in low model scalability or representation learning ability. To address this issue, a Proximal Symmetric Nonnegative Latent-factor-analysis (PSNL) model is proposed. It incorporates a proximal term into a symmetry-aware and data density-oriented objective function for high representation accuracy. Then an adaptive Alternating Direction Method of Multipliers (ADMM)-based learning scheme is implemented through a Tree-structured Parzen Estimators (TPE) method for high computational efficiency. Empirical studies on four UWNs demonstrate that PSNL achieves a higher accuracy gain than state-of-the-art models, as well as highly competitive computational efficiency.

2.0 | IR | Dec 20, 2022

Multi-Metric AutoRec for High Dimensional and Sparse User Behavior Data Prediction

Cheng Liang, Teng Huang, Yi He et al.

User behavior data produced during interaction with massive items in the big data era are generally heterogeneous and sparse, leaving the recommender system (RS) a large diversity of underlying patterns to excavate. Deep neural network-based models have reached the state-of-the-art benchmark of the RS owing to their fitting capabilities. However, prior works mainly focus on designing an intricate architecture with a fixed loss function and regularization. These single-metric models provide limited performance when facing heterogeneous and sparse user behavior data. Motivated by this finding, we propose a multi-metric AutoRec (MMA) based on the representative AutoRec. The idea of the proposed MMA is mainly two-fold: 1) apply different $L_p$-norms to the loss function and regularization to form different variant models in different metric spaces, and 2) aggregate these variant models. Thus, the proposed MMA enjoys the multi-metric orientation from a set of dispersed metric spaces, achieving a comprehensive representation of user data. Theoretical studies prove that the proposed MMA can attain a performance improvement. Extensive experiments on five real-world datasets show that MMA can outperform seven other state-of-the-art models in predicting unobserved user behavior data.

1.8 | LG | May 5, 2022

PI-NLF: A Proportional-Integral Approach for Non-negative Latent Factor Analysis

Ye Yuan, Xin Luo

A high-dimensional and incomplete (HDI) matrix frequently appears in various big-data-related applications, which demonstrates the inherently non-negative interactions among numerous nodes. A non-negative latent factor (NLF) model performs efficient representation learning on an HDI matrix, and its learning process mostly relies on a single latent factor-dependent, non-negative and multiplicative update (SLF-NMU) algorithm. However, an SLF-NMU algorithm updates a latent factor based on the current update increment only, without appropriate consideration of past learning information, resulting in slow convergence. Inspired by the prominent success of a proportional-integral (PI) controller in various applications, this paper proposes a Proportional-Integral-incorporated Non-negative Latent Factor (PI-NLF) model with two-fold ideas: a) establishing an Increment Refinement (IR) mechanism via considering the past update increments following the principle of a PI controller; and b) designing an IR-based SLF-NMU (ISN) algorithm to accelerate the convergence rate of a resultant model. Empirical studies on four HDI datasets demonstrate that a PI-NLF model outperforms the state-of-the-art models in both computational efficiency and estimation accuracy for missing data of an HDI matrix. Hence, this study unveils the feasibility of boosting the performance of a non-negative learning algorithm through an error feedback controller.
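
The principle of refining an update increment with its own history can be sketched on a toy scalar problem. The example below is not the ISN algorithm: it uses plain additive gradient steps instead of the non-negative multiplicative update, and the gains `kp` and `ki`, the learning rate, and the function name are all assumptions chosen for illustration.

```python
def pi_refined_descent(grad, x0, lr=0.1, kp=1.0, ki=0.1, steps=200):
    """Gradient descent whose increments are refined in a PI-like way.

    Each step combines the current increment (proportional part) with the
    running sum of all past increments (integral part).
    """
    x = x0
    integral = 0.0
    for _ in range(steps):
        increment = -lr * grad(x)          # current update increment
        integral += increment              # accumulated past increments
        x += kp * increment + ki * integral
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x_star = pi_refined_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

The integral term lets earlier increments keep influencing later steps, which is the intuition behind using past learning information to speed up convergence; with poorly chosen gains it can also cause oscillation.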

1.8 | LG | Apr 16, 2022

Graph-incorporated Latent Factor Analysis for High-dimensional and Sparse Matrices

Di Wu, Yi He, Xin Luo

A High-dimensional and sparse (HiDS) matrix is frequently encountered in a big data-related application like an e-commerce system or a social network services system. Performing highly accurate representation learning on it is of great significance for extracting its latent knowledge and patterns. Latent factor analysis (LFA), which represents an HiDS matrix by learning the low-rank embeddings based on its observed entries only, is one of the most effective and efficient approaches to this issue. However, most existing LFA-based models perform such embeddings on an HiDS matrix directly without exploiting its hidden graph structures, thereby resulting in accuracy loss. To address this issue, this paper proposes a graph-incorporated latent factor analysis (GLFA) model. It adopts two-fold ideas: 1) a graph is constructed for identifying the hidden high-order interaction (HOI) among nodes described by an HiDS matrix, and 2) a recurrent LFA structure is carefully designed with the incorporation of HOI, thereby improving the representation learning ability of a resultant model. Experimental results on three real-world datasets demonstrate that GLFA outperforms six state-of-the-art models in predicting the missing data of an HiDS matrix, which evidently supports its strong representation learning ability on HiDS data.
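
For readers unfamiliar with latent factor analysis, the sketch below shows a plain baseline that learns low-rank embeddings from observed entries only, using stochastic gradient descent with L2 regularization. It is a generic textbook-style LFA, not the GLFA model, and the hyper-parameter values and toy data are assumptions.

```python
import random


def train_lfa(entries, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=200, seed=0):
    """Fit row factors P and column factors Q on observed entries only."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (i, j), r in entries.items():
            pred = sum(P[i][k] * Q[j][k] for k in range(rank))
            err = r - pred
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q


def rmse(entries, P, Q):
    """Root mean squared error over the given entries."""
    total = 0.0
    for (i, j), r in entries.items():
        pred = sum(a * b for a, b in zip(P[i], Q[j]))
        total += (r - pred) ** 2
    return (total / len(entries)) ** 0.5


# Toy 4x4 matrix with 8 of 16 entries observed.
observed = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (1, 2): 1.0,
            (2, 1): 2.0, (2, 3): 5.0, (3, 2): 4.0, (3, 3): 3.0}
P0, Q0 = train_lfa(observed, 4, 4, epochs=0)   # untrained factors
P, Q = train_lfa(observed, 4, 4)               # trained factors
```

Missing entries are then estimated as the inner product of the corresponding row and column factors; models like the one described above add further structure on top of this baseline.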

2.6 | CV | Nov 27, 2022 | Code

A Knowledge-based Learning Framework for Self-supervised Pre-training Towards Enhanced Recognition of Biomedical Microscopy Images

Wei Chen, Chen Li, Dan Chen et al.

Self-supervised pre-training has become the preferred choice to establish reliable neural networks for automated recognition of massive biomedical microscopy images, which are routinely annotation-free, without semantics, and without guarantee of quality. Note that this paradigm is still in its infancy and limited by closely related open issues: 1) how to learn robust representations in an unsupervised manner from unlabelled biomedical microscopy images of low diversity in samples? and 2) how to obtain the most significant representations demanded by a high-quality segmentation? Aiming at these issues, this study proposes a knowledge-based learning framework (TOWER) towards enhanced recognition of biomedical microscopy images, which works in three phases by synergizing contrastive learning and generative learning methods: 1) Sample Space Diversification: Reconstructive proxy tasks have been enabled to embed a priori knowledge with context highlighted to diversify the expanded sample space; 2) Enhanced Representation Learning: Informative noise-contrastive estimation loss regularizes the encoder to enhance representation learning of annotation-free images; 3) Correlated Optimization: Optimization operations in pre-training the encoder and the decoder have been correlated via image restoration from proxy tasks, targeting the need for semantic segmentation. Experiments have been conducted on public datasets of biomedical microscopy images against the state-of-the-art counterparts (e.g., SimCLR and BYOL), and results demonstrate that TOWER statistically outperforms all compared self-supervised methods, achieving a Dice improvement of 1.38 percentage points over SimCLR. TOWER also has potential in multi-modality medical image analysis and enables label-efficient semi-supervised learning, e.g., reducing the annotation cost by up to 99% in pathological classification.

3.7 | CV | Apr 12, 2022

Three-Stream Joint Network for Zero-Shot Sketch-Based Image Retrieval

Yu-Wei Zhan, Xin Luo, Yongxin Wang et al.

Zero-Shot Sketch-based Image Retrieval (ZS-SBIR) is a challenging task because of the large domain gap between sketches and natural images as well as the semantic inconsistency between seen and unseen categories. Previous literature bridges seen and unseen categories by semantic embedding, which requires prior knowledge of the exact class names and additional extraction efforts. Moreover, most works reduce the domain gap by mapping sketches and natural images into a common high-level space using constructed sketch-image pairs, which ignores the unpaired information between images and sketches. To address these issues, in this paper, we propose a novel Three-Stream Joint Training Network (3JOIN) for the ZS-SBIR task. To narrow the domain differences between sketches and images, we extract edge maps for natural images and treat them as a bridge between images and sketches, since they have similar content to images and similar style to sketches. To exploit a sufficient combination of sketches, natural images, and edge maps, a novel three-stream joint training network is proposed. In addition, we use a teacher network to extract the implicit semantics of the samples without the aid of other semantics and transfer the learned knowledge to unseen classes. Extensive experiments conducted on two real-world datasets demonstrate the superiority of our proposed method.

3.3 | LG | Nov 30, 2022

A Node-collaboration-informed Graph Convolutional Network for Precise Representation to Undirected Weighted Graphs

Ying Wang, Ye Yuan, Xin Luo

An undirected weighted graph (UWG) is frequently adopted to describe the interactions among a single set of nodes from real applications, such as the user contact frequency from a social network services system. A graph convolutional network (GCN) is widely adopted to perform representation learning on a UWG for subsequent pattern analysis tasks such as clustering or missing data estimation. However, existing GCNs mostly neglect the latent collaborative information hidden in its connected node pairs. To address this issue, this study proposes to model the node collaborations via a symmetric latent factor analysis model, and then regards it as a node-collaboration module for supplementing the collaboration loss in a GCN. Based on this idea, a Node-collaboration-informed Graph Convolutional Network (NGCN) is proposed with three-fold ideas: a) Learning latent collaborative information from the interaction of node pairs via a node-collaboration module; b) Building the residual connection and weighted representation propagation to obtain high representation capacity; and c) Implementing the model optimization in an end-to-end fashion to achieve precise representation of the target UWG. Empirical studies on UWGs emerging from real applications demonstrate that owing to its efficient incorporation of node-collaborations, the proposed NGCN significantly outperforms state-of-the-art GCNs in addressing the task of missing weight estimation. Meanwhile, its good scalability ensures its compatibility with more advanced GCN extensions, which will be further investigated in our future studies.

5.4 | AI | Sep 19, 2023

A Dynamic Linear Bias Incorporation Scheme for Nonnegative Latent Factor Analysis

Yurong Zhong, Zhe Xie, Weiling Li et al.

High-Dimensional and Incomplete (HDI) data are commonly encountered in big data-related applications like social network services systems, which concern the limited interactions among numerous nodes. Knowledge acquisition from HDI data is a vital issue in the domain of data science due to their embedded rich patterns like node behaviors, where the fundamental task is to perform HDI data representation learning. Nonnegative Latent Factor Analysis (NLFA) models have proven superior in addressing this issue, where a linear bias incorporation (LBI) scheme is important in preventing training overshooting and fluctuation, as well as preventing the model from premature convergence. However, existing LBI schemes are all static ones where the linear biases are fixed, which significantly restricts the scalability of the resultant NLFA model and results in a loss of representation learning ability on HDI data. Motivated by the above discoveries, this paper presents the dynamic linear bias incorporation (DLBI) scheme. It first extends the linear bias vectors into matrices, and then builds a binary weight matrix to switch the active/inactive states of the linear biases. Each entry of the weight matrix switches between the binary states dynamically according to the variation of the linear bias value, thereby establishing the dynamic linear biases for an NLFA model. Empirical studies on three HDI datasets from real applications demonstrate that the proposed DLBI-based NLFA model obtains higher representation accuracy than several state-of-the-art models do, as well as highly competitive computational efficiency.

18.2 | CV | Mar 19, 2025 | Code

Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training

Yunwei Lan, Zhigao Cui, Chang Liu et al.

Unpaired training has been verified as one of the most effective paradigms for real scene dehazing by learning from unpaired real-world hazy and clear images. Although numerous studies have been proposed, current methods demonstrate limited generalization for various real scenes due to limited feature representation and insufficient use of real-world prior. Inspired by the strong generative capabilities of diffusion models in producing both hazy and clear images, we exploit diffusion prior for real-world image dehazing, and propose an unpaired framework named Diff-Dehazer. Specifically, we leverage diffusion prior as bijective mapping learners within the CycleGAN, a classic unpaired learning framework. Considering that physical priors contain pivotal statistical information of real-world data, we further excavate real-world knowledge by integrating physical priors into our framework. Furthermore, we introduce a new perspective for adequately leveraging the representation ability of diffusion models by removing degradation in image and text modalities, so as to improve the dehazing effect. Extensive experiments on multiple real-world datasets demonstrate the superior performance of our method. Our code is available at https://github.com/ywxjm/Diff-Dehazer.

4.6 | LG | Apr 11, 2022

An Adaptive Alternating-direction-method-based Nonnegative Latent Factor Model

Yurong Zhong, Xin Luo

An alternating-direction-method-based nonnegative latent factor model can perform efficient representation learning on a high-dimensional and incomplete (HDI) matrix. However, it introduces multiple hyper-parameters into the learning process, which should be chosen with care to enable its superior performance. Its hyper-parameter adaptation is desired for further enhancing its scalability. Targeting this issue, this paper proposes an Adaptive Alternating-direction-method-based Nonnegative Latent Factor (A2NLF) model, whose hyper-parameter adaptation is implemented following the principle of particle swarm optimization. Empirical studies on nonnegative HDI matrices generated by industrial applications indicate that A2NLF outperforms several state-of-the-art models in terms of computational and storage efficiency, while maintaining highly competitive estimation accuracy for an HDI matrix's missing data.
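
The principle of particle swarm optimization for tuning a hyper-parameter can be sketched as follows. This is a minimal, generic PSO over a single scalar, assuming a toy objective that stands in for validation error; the inertia and acceleration constants are common textbook values, not settings from the paper, and the function name is hypothetical.

```python
import random


def pso_minimize(objective, lo, hi, n_particles=8, iters=30,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a scalar objective on [lo, hi] with a basic particle swarm.

    Returns (best position, best value, best value of the initial swarm).
    """
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]                               # personal best positions
    pbest_f = [objective(x) for x in pos]        # personal best values
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g], pbest_f[g]        # global best so far
    initial_best_f = gbest_f
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # keep inside bounds
            f = objective(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i], f
    return gbest, gbest_f, initial_best_f


# Hypothetical stand-in for "validation error as a function of a
# hyper-parameter", with its minimum at 0.3.
toy_error = lambda h: (h - 0.3) ** 2
best_h, best_err, start_err = pso_minimize(toy_error, 0.0, 1.0)
```

In a real setting each objective evaluation would involve training or partially training the model, so the swarm size and iteration count trade tuning quality against compute.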

1.8 | LG | Apr 2, 2022

A Differential Evolution-Enhanced Latent Factor Analysis Model for High-dimensional and Sparse Data

Jia Chen, Di Wu, Xin Luo

High-dimensional and sparse (HiDS) matrices are frequently adopted to describe the complex relationships in various big data-related systems and applications. A Position-transitional Latent Factor Analysis (PLFA) model can accurately and efficiently represent an HiDS matrix. However, its involved latent factors are optimized by stochastic gradient descent with the specific gradient direction step-by-step, which may cause a suboptimal solution. To address this issue, this paper proposes a Sequential-Group-Differential-Evolution (SGDE) algorithm to refine the latent factors optimized by a PLFA model, thereby achieving a highly accurate SGDE-PLFA model for HiDS matrices. As demonstrated by the experiments on four HiDS matrices, an SGDE-PLFA model outperforms the state-of-the-art models.

9.4 | LG | Nov 28, 2025 | Code

Quantized-Tinyllava: a new multimodal foundation model enables efficient split learning

Jiajun Guo, Xin Luo, Jiayin Zheng et al.

Multimodal foundation models are increasingly trained on sensitive data across domains such as finance, biomedicine, and personal identifiers. However, this distributed setup raises serious privacy concerns due to the need for cross-partition data sharing. Split learning addresses these concerns by enabling collaborative model training without raw data exchange between partitions, yet it introduces a significant challenge: transmitting high-dimensional intermediate feature representations between partitions leads to substantial communication costs. To address this challenge, we propose Quantized-TinyLLaVA, a multimodal foundation model with an integrated communication-efficient split learning framework. Our approach adopts a compression module that quantizes intermediate features into discrete representations before transmission, substantially reducing communication overhead. Besides, we derive a principled quantization strategy grounded in entropy coding theory to determine the optimal number of discrete representation levels. We deploy our framework in a two-partition setting, with one partition operating as the client and the other as the server, to realistically simulate distributed training. Under this setup, Quantized-TinyLLaVA achieves an approximate 87.5% reduction in communication overhead with 2-bit quantization, while maintaining performance of the original 16-bit model across five benchmark datasets. Furthermore, our compressed representations exhibit enhanced resilience against feature inversion attacks, validating the privacy of transmission. The code is available at https://github.com/anonymous-1742/Quantized-TinyLLaVA.
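
The reported reduction is consistent with simple bit-width arithmetic: if the payload size is proportional to bits per value, going from 16 bits to 2 bits saves 1 - 2/16 = 87.5%. The sketch below shows that arithmetic together with a uniform scalar quantizer. The uniform quantizer is an assumption used as a stand-in; the abstract does not describe the actual compression module in enough detail to reproduce it, and all function names are hypothetical.

```python
def communication_reduction(bits, baseline_bits=16):
    """Fraction of payload saved, assuming size scales with bits per value."""
    return 1.0 - bits / baseline_bits


def uniform_quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1] (uniform levels)."""
    levels = 2 ** bits
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant input: a single level suffices
        return [0] * len(values), lo, hi
    step = (hi - lo) / (levels - 1)
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, hi


def dequantize(codes, lo, hi, bits):
    """Reconstruct approximate floats from integer codes."""
    levels = 2 ** bits
    if hi == lo:
        return [lo] * len(codes)
    step = (hi - lo) / (levels - 1)
    return [lo + c * step for c in codes]


saving = communication_reduction(2)                 # 2-bit vs 16-bit
codes, lo, hi = uniform_quantize([0.0, 1.0, 2.0, 3.0], bits=2)
restored = dequantize(codes, lo, hi, bits=2)
```

Only the integer codes and the two range values need to cross the partition boundary, which is where the saving in this simplified picture comes from.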

2.6 | LG | Feb 19, 2024 | Code

Mini-Hes: A Parallelizable Second-order Latent Factor Analysis Model

Jialiang Wang, Weiling Li, Yurong Zhong et al.

Interactions among a large number of entities are naturally high-dimensional and incomplete (HDI) in many big data related tasks. Behavioral characteristics of users are hidden in these interactions; hence, effective representation of the HDI data is a fundamental task for understanding user behaviors. The latent factor analysis (LFA) model has proven to be effective in representing HDI data. The performance of an LFA model relies heavily on its training process, which is a non-convex optimization. It has been proven that incorporating local curvature and preprocessing gradients during its training process can lead to superior performance compared to LFA models built with first-order family methods. However, with the escalation of data volume, the feasibility of second-order algorithms encounters challenges. To address this pivotal issue, this paper proposes a mini-block diagonal Hessian-free (Mini-Hes) optimization for building an LFA model. It leverages the dominant diagonal blocks in the generalized Gauss-Newton matrix based on the analysis of the Hessian matrix of the LFA model and serves as an intermediary strategy bridging the gap between first-order and second-order optimization methods. Experimental results indicate that, with Mini-Hes, the LFA model outperforms several state-of-the-art models in addressing the missing data estimation task on multiple real HDI datasets from recommender systems. (The source code of Mini-Hes is available at https://github.com/Goallow/Mini-Hes)

3.7 | CV | Feb 1, 2024

Bias Mitigating Few-Shot Class-Incremental Learning

Li-Jun Zhao, Zhen-Duo Chen, Zi-Chao Zhang et al.

Few-shot class-incremental learning (FSCIL) aims at recognizing novel classes continually with limited novel class samples. A mainstream baseline for FSCIL is first to train the whole model in the base session, then freeze the feature extractor in the incremental sessions. Despite achieving high overall accuracy, most methods exhibit notably low accuracy for incremental classes. Some recent methods somewhat alleviate the accuracy imbalance between base and incremental classes by fine-tuning the feature extractor in the incremental sessions, but they further cause the accuracy imbalance between past and current incremental classes. In this paper, we study the causes of such classification accuracy imbalance for FSCIL, and abstract them into a unified model bias problem. Based on the analyses, we propose a novel method to mitigate model bias of the FSCIL problem during training and inference processes, which includes mapping ability stimulation, separately dual-feature classification, and self-optimizing classifiers. Extensive experiments on three widely-used FSCIL benchmark datasets show that our method significantly mitigates the model bias problem and achieves state-of-the-art performance.

2.0CVDec 29, 2024

Progressively Exploring and Exploiting Cost-Free Data to Break Fine-Grained Classification Barriers

Li-Jun Zhao, Zhen-Duo Chen, Zhi-Yuan Xue et al.

Current fine-grained classification research primarily focuses on fine-grained feature learning. However, in real-world scenarios, fine-grained data annotation is challenging, and the features and semantics are highly diverse and frequently changing. These issues create inherent barriers between traditional experimental settings and real-world applications, limiting the effectiveness of conventional fine-grained classification methods. Although some recent studies have provided potential solutions to these issues, most of them still rely on limited supervised information and thus fail to offer effective solutions. In this paper, based on theoretical analysis, we propose a novel learning paradigm to break the barriers in fine-grained classification. This paradigm enables the model to progressively learn during inference, thereby leveraging cost-free data to more accurately represent fine-grained categories and adapt to dynamic semantic changes. On this basis, an efficient EXPloring and EXPloiting strategy and method (EXP2) is designed. Within it, useful inference data samples are explored according to class representations and exploited to optimize classifiers. Experimental results demonstrate the general effectiveness of our method, providing guidance for future in-depth understanding and exploration of real-world fine-grained classification.

3.3LGMar 30, 2022

Adaptive Divergence-based Non-negative Latent Factor Analysis

Ye Yuan, Guangxiao Yuan, Renfang Wang et al.

High-Dimensional and Incomplete (HDI) data are frequently found in various industrial applications with complex interactions among numerous nodes, and are commonly non-negative, reflecting the inherent non-negativity of node interactions. A Non-negative Latent Factor (NLF) model is able to extract intrinsic features from such data efficiently. However, existing NLF models all adopt a static divergence metric, such as Euclidean distance or α-β divergence, to build their learning objectives, which greatly restricts their scalability in accurately representing HDI data from different domains. To address this issue, this study presents an Adaptive Divergence-based Non-negative Latent Factor (ADNLF) model built on three ideas: a) generalizing the objective function with the α-β divergence to expand its potential for representing various HDI data; b) adopting a non-negative bridging function to connect the optimization variables with the output latent factors so that the non-negativity constraints are always fulfilled; and c) making the divergence parameters adaptive through particle swarm optimization, thereby facilitating adaptive divergence in the learning objective to achieve high scalability. Empirical studies are conducted on four HDI datasets from real applications, and the results demonstrate that, in comparison with state-of-the-art NLF models, an ADNLF model achieves significantly higher estimation accuracy for the missing data of an HDI dataset with high computational efficiency.
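For readers unfamiliar with the α-β divergence mentioned in this abstract, the sketch below evaluates its commonly cited general form (for α, β, and α+β all non-zero) over a set of observed entries and their estimates. It is an illustration under assumptions: the abstract does not state the formula, the function name `ab_divergence` is hypothetical, and α and β are treated as fixed inputs rather than being adapted by particle swarm optimization as in ADNLF.

```python
def ab_divergence(observed, estimated, alpha, beta):
    """Alpha-beta divergence between two sequences of positive numbers.

    Covers only the general case alpha != 0, beta != 0, alpha + beta != 0;
    the limiting cases need separate formulas and are not handled here.
    """
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        raise ValueError("sketch supports only alpha, beta, alpha + beta != 0")
    s = alpha + beta
    total = 0.0
    for p, q in zip(observed, estimated):
        if p <= 0 or q <= 0:
            raise ValueError("entries must be strictly positive")
        # Per-entry term: p^alpha * q^beta - alpha/s * p^s - beta/s * q^s
        total += (p ** alpha) * (q ** beta) - (alpha / s) * p ** s - (beta / s) * q ** s
    return -total / (alpha * beta)
```

With `alpha = beta = 1` the expression reduces to half the squared Euclidean distance, which illustrates how a single parameterized family can cover the static metrics that the abstract contrasts against.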

1.2MMSep 9, 2021Code

Online Enhanced Semantic Hashing: Towards Effective and Efficient Retrieval for Streaming Multi-Modal Data

Xiao-Ming Wu, Xin Luo, Yu-Wei Zhan et al.

With the vigorous development of multimedia equipment and applications, efficient retrieval of large-scale multi-modal data has become a trendy research topic. Among the available approaches, hashing has become a prevalent choice due to its retrieval efficiency and low storage cost. Although multi-modal hashing has drawn a lot of attention in recent years, some problems remain. First, existing methods are mainly designed in batch mode and are not able to efficiently handle streaming multi-modal data. Second, all existing online multi-modal hashing methods fail to effectively handle unseen new classes that arrive continuously with streaming data chunks. In this paper, we propose a new model, termed Online enhAnced SemantIc haShing (OASIS). We design a novel semantic-enhanced representation for data, which helps handle newly arriving classes, and thereby construct the enhanced semantic objective function. An efficient and effective discrete online optimization algorithm is further proposed for OASIS. Extensive experiments show that our method can exceed the state-of-the-art models. To support reproducibility and benefit the community, our code and data are already available in the supplementary material and will be made publicly available.

1.2LGFeb 25, 2020

Dual Graph Representation Learning

Huiling Zhu, Xin Luo, Hankz Hankui Zhuo

Graph representation learning embeds nodes in large graphs as low-dimensional vectors and is of great benefit to many downstream applications. Most embedding frameworks, however, are inherently transductive and unable to generalize to unseen nodes or learn representations across different graphs. Although inductive approaches can generalize to unseen nodes, they neglect different contexts of nodes and cannot learn node embeddings dually. In this paper, we present a context-aware unsupervised dual encoding framework, CADE, to generate representations of nodes by combining real-time neighborhoods with neighbor-attentioned representation, and preserving extra memory of known nodes. We demonstrate the effectiveness of our approach by comparing it with state-of-the-art methods.

1.2SINov 21, 2016

Rising Novelties on Evolving Networks: Recent Behavior Dominant and Non-Dominant Model

Khushnood Abbas

Novelty attracts attention, as popularity does. Hence, predicting novelty is as important as predicting popularity. Novelty is a side effect of competition and aging in evolving systems. Recent behavior, or recent link gain in networks, plays an important role in emergence or trend formation. We exploit this insight and propose two models for different scenarios and systems. In the first, recent behavior dominates over total behavior (total link gain); in the second, recent behavior is as important as total behavior for future link gain. The models suppose that a random walker walks on a network and can jump to any node; the probability of jumping or connecting to another node is based on which node has recently been more active or has received more links. In our assumption, the random walker can also jump to a node that is already popular but has not been popular recently. We are able to predict rising novelties, or popular nodes, which are generally suppressed under the preferential attachment effect. To show the performance of our models, we conducted experiments on four real data sets, namely MovieLens, Netflix, Facebook, and the arXiv High Energy Physics paper citation network. For testing, we used four information retrieval indices, namely Precision, Novelty, Area Under the Receiver Operating Characteristic curve (AUC), and Kendall's rank correlation coefficient. We used four benchmark models to validate our proposed models. Although our model does not perform better in all cases, it has theoretical significance in working better for recent-behavior-dominant systems.
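The idea of weighting recent link gain against total link gain can be illustrated with the small sketch below. It is not the paper's formulation: the time window, the mixing weight `recent_weight`, and the function names `link_gain_scores` and `rank_nodes` are hypothetical choices made for illustration only. Setting `recent_weight` close to 1 loosely mimics a recent-behavior-dominant setting, while 0.5 treats recent and total behavior as equally important.

```python
from collections import Counter


def link_gain_scores(edges, now, window, recent_weight):
    """Score nodes by a mix of recent and total link gain.

    edges: iterable of (target_node, timestamp) pairs, one per received link.
    now, window: links with now - timestamp <= window count as recent.
    recent_weight: hypothetical mixing weight in [0, 1].
    """
    total, recent = Counter(), Counter()
    for node, timestamp in edges:
        if timestamp > now:
            continue  # ignore links from the future
        total[node] += 1
        if now - timestamp <= window:
            recent[node] += 1
    n_total, n_recent = sum(total.values()), sum(recent.values())
    scores = {}
    for node in total:
        total_share = total[node] / n_total
        recent_share = recent[node] / n_recent if n_recent else 0.0
        scores[node] = recent_weight * recent_share + (1 - recent_weight) * total_share
    return scores


def rank_nodes(scores):
    """Return nodes sorted by descending score (ties broken by name)."""
    return sorted(scores, key=lambda node: (-scores[node], str(node)))
```

Under this sketch, a node with many old links and no recent ones ranks first when `recent_weight` is 0 (pure total link gain, similar in spirit to preferential attachment) but drops below a recently active node as `recent_weight` grows.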