Yongjie Li

CV
h-index33
29papers
279citations
Novelty44%
AI Score51

29 Papers

CVJul 23, 2022
Orientation and Context Entangled Network for Retinal Vessel Segmentation

Xinxu Wei, Kaifu Yang, Danilo Bzdok et al.

Most of the existing deep learning based methods for vessel segmentation neglect two important aspects of retinal vessels, one is the orientation information of vessels, and the other is the contextual information of the whole fundus region. In this paper, we propose a robust Orientation and Context Entangled Network (denoted as OCE-Net), which has the capability of extracting complex orientation and context information of the blood vessels. To achieve complex orientation aware, a Dynamic Complex Orientation Aware Convolution (DCOA Conv) is proposed to extract complex vessels with multiple orientations for improving the vessel continuity. To simultaneously capture the global context information and emphasize the important local information, a Global and Local Fusion Module (GLFM) is developed to simultaneously model the long-range dependency of vessels and focus sufficient attention on local thin vessels. A novel Orientation and Context Entangled Non-local (OCE-NL) module is proposed to entangle the orientation and context information together. In addition, an Unbalanced Attention Refining Module (UARM) is proposed to deal with the unbalanced pixel numbers of background, thick and thin vessels. Extensive experiments were performed on several commonly used datasets (DRIVE, STARE and CHASEDB1) and some more challenging datasets (AV-WIDE, UoA-DR, RFMiD and UK Biobank). The ablation study shows that the proposed method achieves promising performance on maintaining the continuity of thin vessels and the comparative experiments demonstrate that our OCE-Net can achieve state-of-the-art performance on retinal vessel segmentation.

CVOct 2, 2022
Unsupervised Visual Odometry and Action Integration for PointGoal Navigation in Indoor Environment

Yijun Cao, Xianshi Zhang, Fuya Luo et al.

PointGoal navigation in indoor environment is a fundamental task for personal robots to navigate to a specified point. Recent studies solved this PointGoal navigation task with near-perfect success rate in photo-realistically simulated environments, under the assumptions with noiseless actuation and most importantly, perfect localization with GPS and compass sensors. However, accurate GPS signalis difficult to be obtained in real indoor environment. To improve the PointGoal navigation accuracy without GPS signal, we use visual odometry (VO) and propose a novel action integration module (AIM) trained in unsupervised manner. Sepecifically, unsupervised VO computes the relative pose of the agent from the re-projection error of two adjacent frames, and then replaces the accurate GPS signal with the path integration. The pseudo position estimated by VO is used to train action integration which assists agent to update their internal perception of location and helps improve the success rate of navigation. The training and inference process only use RGB, depth, collision as well as self-action information. The experiments show that the proposed system achieves satisfactory results and outperforms the partially supervised learning algorithms on the popular Gibson dataset.

CVJul 23, 2022
Deep Pneumonia: Attention-Based Contrastive Learning for Class-Imbalanced Pneumonia Lesion Recognition in Chest X-rays

Xinxu Wei, Haohan Bai, Xianshi Zhang et al.

Computer-aided X-ray pneumonia lesion recognition is important for accurate diagnosis of pneumonia. With the emergence of deep learning, the identification accuracy of pneumonia has been greatly improved, but there are still some challenges due to the fuzzy appearance of chest X-rays. In this paper, we propose a deep learning framework named Attention-Based Contrastive Learning for Class-Imbalanced X-Ray Pneumonia Lesion Recognition (denoted as Deep Pneumonia). We adopt self-supervised contrastive learning strategy to pre-train the model without using extra pneumonia data for fully mining the limited available dataset. In order to leverage the location information of the lesion area that the doctor has painstakingly marked, we propose mask-guided hard attention strategy and feature learning with contrastive regulation strategy which are applied on the attention map and the extracted features respectively to guide the model to focus more attention on the lesion area where contains more discriminative features for improving the recognition performance. In addition, we adopt Class-Balanced Loss instead of traditional Cross-Entropy as the loss function of classification to tackle the problem of serious class imbalance between different classes of pneumonia in the dataset. The experimental results show that our proposed framework can be used as a reliable computer-aided pneumonia diagnosis system to assist doctors to better diagnose pneumonia cases accurately.

CVFeb 13
Beyond Benchmarks of IUGC: Rethinking Requirements of Deep Learning Methods for Intrapartum Ultrasound Biometry from Fetal Ultrasound Videos

Jieyun Bai, Zihao Zhou, Yitong Tang et al.

A substantial proportion (45\%) of maternal deaths, neonatal deaths, and stillbirths occur during the intrapartum phase, with a particularly high burden in low- and middle-income countries. Intrapartum biometry plays a critical role in monitoring labor progression; however, the routine use of ultrasound in resource-limited settings is hindered by a shortage of trained sonographers. To address this challenge, the Intrapartum Ultrasound Grand Challenge (IUGC), co-hosted with MICCAI 2024, was launched. The IUGC introduces a clinically oriented multi-task automatic measurement framework that integrates standard plane classification, fetal head-pubic symphysis segmentation, and biometry, enabling algorithms to exploit complementary task information for more accurate estimation. Furthermore, the challenge releases the largest multi-center intrapartum ultrasound video dataset to date, comprising 774 videos (68,106 frames) collected from three hospitals, providing a robust foundation for model training and evaluation. In this study, we present a comprehensive overview of the challenge design, review the submissions from eight participating teams, and analyze their methods from five perspectives: preprocessing, data augmentation, learning strategy, model architecture, and post-processing. In addition, we perform a systematic analysis of the benchmark results to identify key bottlenecks, explore potential solutions, and highlight open challenges for future research. Although encouraging performance has been achieved, our findings indicate that the field remains at an early stage, and further in-depth investigation is required before large-scale clinical deployment. All benchmark solutions and the complete dataset have been publicly released to facilitate reproducible research and promote continued advances in automatic intrapartum ultrasound biometry.

IVSep 17, 2024
Retinal Vessel Segmentation with Deep Graph and Capsule Reasoning

Xinxu Wei, Xi Lin, Haiyun Liu et al.

Effective retinal vessel segmentation requires a sophisticated integration of global contextual awareness and local vessel continuity. To address this challenge, we propose the Graph Capsule Convolution Network (GCC-UNet), which merges capsule convolutions with CNNs to capture both local and global features. The Graph Capsule Convolution operator is specifically designed to enhance the representation of global context, while the Selective Graph Attention Fusion module ensures seamless integration of local and global information. To further improve vessel continuity, we introduce the Bottleneck Graph Attention module, which incorporates Channel-wise and Spatial Graph Attention mechanisms. The Multi-Scale Graph Fusion module adeptly combines features from various scales. Our approach has been rigorously validated through experiments on widely used public datasets, with ablation studies confirming the efficacy of each component. Comparative results highlight GCC-UNet's superior performance over existing methods, setting a new benchmark in retinal vessel segmentation. Notably, this work represents the first integration of vanilla, graph, and capsule convolutional techniques in the domain of medical image segmentation.

IVJun 30, 2021Code
BLNet: A Fast Deep Learning Framework for Low-Light Image Enhancement with Noise Removal and Color Restoration

Xinxu Wei, Xianshi Zhang, Shisen Wang et al.

Images obtained in real-world low-light conditions are not only low in brightness, but they also suffer from many other types of degradation, such as color bias, unknown noise, detail loss and halo artifacts. In this paper, we propose a very fast deep learning framework called Bringing the Lightness (denoted as BLNet) that consists of two U-Nets with a series of well-designed loss functions to tackle all of the above degradations. Based on Retinex Theory, the decomposition net in our model can decompose low-light images into reflectance and illumination and remove noise in the reflectance during the decomposition phase. We propose a Noise and Color Bias Control module (NCBC Module) that contains a convolutional neural network and two loss functions (noise loss and color loss). This module is only used to calculate the loss functions during the training phase, so our method is very fast during the test phase. This module can smooth the reflectance to achieve the purpose of noise removal while preserving details and edge information and controlling color bias. We propose a network that can be trained to learn the mapping between low-light and normal-light illumination and enhance the brightness of images taken in low-light illumination. We train and evaluate the performance of our proposed model over the real-world Low-Light (LOL) dataset), and we also test our model over several other frequently used datasets (LIME, DICM and MEF datasets). We conduct extensive experiments to demonstrate that our approach achieves a promising effect with good rubustness and generalization and outperforms many other state-of-the-art methods qualitatively and quantitatively. Our method achieves high speed because we use loss functions instead of introducing additional denoisers for noise removal and color correction. The code and model are available at https://github.com/weixinxu666/BLNet.

CVApr 29, 2021Code
Thermal Infrared Image Colorization for Nighttime Driving Scenes with Top-Down Guided Attention

Fuya Luo, Yunhan Li, Guang Zeng et al.

Benefitting from insensitivity to light and high penetration of foggy environments, infrared cameras are widely used for sensing in nighttime traffic scenes. However, the low contrast and lack of chromaticity of thermal infrared (TIR) images hinder the human interpretation and portability of high-level computer vision algorithms. Colorization to translate a nighttime TIR image into a daytime color (NTIR2DC) image may be a promising way to facilitate nighttime scene perception. Despite recent impressive advances in image translation, semantic encoding entanglement and geometric distortion in the NTIR2DC task remain under-addressed. Hence, we propose a toP-down attEntion And gRadient aLignment based GAN, referred to as PearlGAN. A top-down guided attention module and an elaborate attentional loss are first designed to reduce the semantic encoding ambiguity during translation. Then, a structured gradient alignment loss is introduced to encourage edge consistency between the translated and input images. In addition, pixel-level annotation is carried out on a subset of FLIR and KAIST datasets to evaluate the semantic preservation performance of multiple translation methods. Furthermore, a new metric is devised to evaluate the geometric consistency in the translation process. Extensive experiments demonstrate the superiority of the proposed PearlGAN over other image translation methods for the NTIR2DC task. The source code and labeled segmentation masks will be available at \url{https://github.com/FuyaLuo/PearlGAN/}.

CVJun 5, 2025
Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation

Yijun Cao, Fuya Luo, Yongjie Li

Unsupervised monocular depth learning generally relies on the photometric relation among temporally adjacent images. Most of previous works use both mean absolute error (MAE) and structure similarity index measure (SSIM) with conventional form as training loss. However, they ignore the effect of different components in the SSIM function and the corresponding hyperparameters on the training. To address these issues, this work proposes a new form of SSIM. Compared with original SSIM function, the proposed new form uses addition rather than multiplication to combine the luminance, contrast, and structural similarity related components in SSIM. The loss function constructed with this scheme helps result in smoother gradients and achieve higher performance on unsupervised depth estimation. We conduct extensive experiments to determine the relatively optimal combination of parameters for our new SSIM. Based on the popular MonoDepth approach, the optimized SSIM loss function can remarkably outperform the baseline on the KITTI-2015 outdoor dataset.

CVMay 14, 2024
Vector-Symbolic Architecture for Event-Based Optical Flow

Hongzhi You, Yijun Cao, Wei Yuan et al.

From a perspective of feature matching, optical flow estimation for event cameras involves identifying event correspondences by comparing feature similarity across accompanying event frames. In this work, we introduces an effective and robust high-dimensional (HD) feature descriptor for event frames, utilizing Vector Symbolic Architectures (VSA). The topological similarity among neighboring variables within VSA contributes to the enhanced representation similarity of feature descriptors for flow-matching points, while its structured symbolic representation capacity facilitates feature fusion from both event polarities and multiple spatial scales. Based on this HD feature descriptor, we propose a novel feature matching framework for event-based optical flow, encompassing both model-based (VSA-Flow) and self-supervised learning (VSA-SM) methods. In VSA-Flow, accurate optical flow estimation validates the effectiveness of HD feature descriptors. In VSA-SM, a novel similarity maximization method based on the HD feature descriptor is proposed to learn optical flow in a self-supervised way from events alone, eliminating the need for auxiliary grayscale images. Evaluation results demonstrate that our VSA-based method achieves superior accuracy in comparison to both model-based and self-supervised learning methods on the DSEC benchmark, while remains competitive among both methods on the MVSEC benchmark. This contribution marks a significant advancement in event-based optical flow within the feature matching methodology.

CVJun 9, 2025
APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs

Bowen Liu, Weiyi Zhang, Peranut Chotcomwongse et al.

Optical Coherence Tomography (OCT) provides high-resolution, 3D, and non-invasive visualization of retinal layers in vivo, serving as a critical tool for lesion localization and disease diagnosis. However, its widespread adoption is limited by equipment costs and the need for specialized operators. In comparison, 2D color fundus photography offers faster acquisition and greater accessibility with less dependence on expensive devices. Although generative artificial intelligence has demonstrated promising results in medical image synthesis, translating 2D fundus images into 3D OCT images presents unique challenges due to inherent differences in data dimensionality and biological information between modalities. To advance generative models in the fundus-to-3D-OCT setting, the Asia Pacific Tele-Ophthalmology Society (APTOS-2024) organized a challenge titled Artificial Intelligence-based OCT Generation from Fundus Images. This paper details the challenge framework (referred to as APTOS-2024 Challenge), including: the benchmark dataset, evaluation methodology featuring two fidelity metrics-image-based distance (pixel-level OCT B-scan similarity) and video-based distance (semantic-level volumetric consistency), and analysis of top-performing solutions. The challenge attracted 342 participating teams, with 42 preliminary submissions and 9 finalists. Leading methodologies incorporated innovations in hybrid data preprocessing or augmentation (cross-modality collaborative paradigms), pre-training on external ophthalmic imaging datasets, integration of vision foundation models, and model architecture improvement. The APTOS-2024 Challenge is the first benchmark demonstrating the feasibility of fundus-to-3D-OCT synthesis as a potential solution for improving ophthalmic care accessibility in under-resourced healthcare settings, while helping to expedite medical research and clinical applications.

CLApr 5, 2025
Cross-Asset Risk Management: Integrating LLMs for Real-Time Monitoring of Equity, Fixed Income, and Currency Markets

Jie Yang, Yiqiu Tang, Yongjie Li et al.

Large language models (LLMs) have emerged as powerful tools in the field of finance, particularly for risk management across different asset classes. In this work, we introduce a Cross-Asset Risk Management framework that utilizes LLMs to facilitate real-time monitoring of equity, fixed income, and currency markets. This innovative approach enables dynamic risk assessment by aggregating diverse data sources, ultimately enhancing decision-making processes. Our model effectively synthesizes and analyzes market signals to identify potential risks and opportunities while providing a holistic view of asset classes. By employing advanced analytics, we leverage LLMs to interpret financial texts, news articles, and market reports, ensuring that risks are contextualized within broader market narratives. Extensive backtesting and real-time simulations validate the framework, showing increased accuracy in predicting market shifts compared to conventional methods. The focus on real-time data integration enhances responsiveness, allowing financial institutions to manage risks adeptly under varying market conditions and promoting financial stability through the advanced application of LLMs in risk analysis.

CLJul 25, 2025
Legal Document Summarization: Enhancing Judicial Efficiency through Automation Detection

Yongjie Li, Ruilin Nong, Jianan Liu et al.

Legal document summarization represents a significant advancement towards improving judicial efficiency through the automation of key information detection. Our approach leverages state-of-the-art natural language processing techniques to meticulously identify and extract essential data from extensive legal texts, which facilitates a more efficient review process. By employing advanced machine learning algorithms, the framework recognizes underlying patterns within judicial documents to create precise summaries that encapsulate the crucial elements. This automation alleviates the burden on legal professionals, concurrently reducing the likelihood of overlooking vital information that could lead to errors. Through comprehensive experiments conducted with actual legal datasets, we demonstrate the capability of our method to generate high-quality summaries while preserving the integrity of the original content and enhancing processing times considerably. The results reveal marked improvements in operational efficiency, allowing legal practitioners to direct their efforts toward critical analytical and decision-making activities instead of manual reviews. This research highlights promising technology-driven strategies that can significantly alter workflow dynamics within the legal sector, emphasizing the role of automation in refining judicial processes.

CYJul 25, 2025
Adaptive Learning Systems: Personalized Curriculum Design Using LLM-Powered Analytics

Yongjie Li, Ruilin Nong, Jianan Liu et al.

Large language models (LLMs) are revolutionizing the field of education by enabling personalized learning experiences tailored to individual student needs. In this paper, we introduce a framework for Adaptive Learning Systems that leverages LLM-powered analytics for personalized curriculum design. This innovative approach uses advanced machine learning to analyze real-time data, allowing the system to adapt learning pathways and recommend resources that align with each learner's progress. By continuously assessing students, our framework enhances instructional strategies, ensuring that the materials presented are relevant and engaging. Experimental results indicate a marked improvement in both learner engagement and knowledge retention when using a customized curriculum. Evaluations conducted across varied educational environments demonstrate the framework's flexibility and positive influence on learning outcomes, potentially reshaping conventional educational practices into a more adaptive and student-centered model.

CLApr 5, 2025
Dynamic Hedging Strategies in Derivatives Markets with LLM-Driven Sentiment and News Analytics

Jie Yang, Yiqiu Tang, Yongjie Li et al.

Dynamic hedging strategies are essential for effective risk management in derivatives markets, where volatility and market sentiment can greatly impact performance. This paper introduces a novel framework that leverages large language models (LLMs) for sentiment analysis and news analytics to inform hedging decisions. By analyzing textual data from diverse sources like news articles, social media, and financial reports, our approach captures critical sentiment indicators that reflect current market conditions. The framework allows for real-time adjustments to hedging strategies, adapting positions based on continuous sentiment signals. Backtesting results on historical derivatives data reveal that our dynamic hedging strategies achieve superior risk-adjusted returns compared to conventional static approaches. The incorporation of LLM-driven sentiment analysis into hedging practices presents a significant advancement in decision-making processes within derivatives trading. This research showcases how sentiment-informed dynamic hedging can enhance portfolio management and effectively mitigate associated risks.

NCDec 10, 2024
Primary visual cortex contributes to color constancy by predicting rather than discounting the illuminant: evidence from a computational study

Shaobing Gao, Yongjie Li

Color constancy (CC) is an important ability of the human visual system to stably perceive the colors of objects despite considerable changes in the color of the light illuminating them. While increasing evidence from the field of neuroscience supports that multiple levels of the visual system contribute to the realization of CC, how the primary visual cortex (V1) plays role in CC is not fully resolved. In specific, double-opponent (DO) neurons in V1 have been thought to contribute to realizing a degree of CC, but the computational mechanism is not clear. We build an electrophysiologically based V1 neural model to learn the color of the light source from a natural image dataset with the ground truth illuminants as the labels. Based on the qualitative and quantitative analysis of the responsive properties of the learned model neurons, we found that both the spatial structures and color weights of the receptive fields of the learned model neurons are quite similar to those of the simple and DO neurons recorded in V1. Computationally, DO cells perform more robustly than the simple cells in V1 for illuminant prediction. Therefore, this work provides computational evidence supporting that V1 DO neurons serve to realize color constancy by encoding the illuminant,which is contradictory to the common hypothesis that V1 contributes to CC by discounting the illuminant using its DO cells. This evidence is expected to not only help resolve the visual mechanisms of CC, but also provide inspiration to develop more effective computer vision models.

CVNov 14, 2024
VPBSD:Vessel-Pattern-Based Semi-Supervised Distillation for Efficient 3D Microscopic Cerebrovascular Segmentation

Xi Lin, Shixuan Zhao, Xinxu Wei et al.

3D microscopic cerebrovascular images are characterized by their high resolution, presenting significant annotation challenges, large data volumes, and intricate variations in detail. Together, these factors make achieving high-quality, efficient whole-brain segmentation particularly demanding. In this paper, we propose a novel Vessel-Pattern-Based Semi-Supervised Distillation pipeline (VpbSD) to address the challenges of 3D microscopic cerebrovascular segmentation. This pipeline initially constructs a vessel-pattern codebook that captures diverse vascular structures from unlabeled data during the teacher model's pretraining phase. In the knowledge distillation stage, the codebook facilitates the transfer of rich knowledge from a heterogeneous teacher model to a student model, while the semi-supervised approach further enhances the student model's exposure to diverse learning samples. Experimental results on real-world data, including comparisons with state-of-the-art methods and ablation studies, demonstrate that our pipeline and its individual components effectively address the challenges inherent in microscopic cerebrovascular segmentation.

CVNov 22, 2021
Learning Generalized Visual Odometry Using Position-Aware Optical Flow and Geometric Bundle Adjustment

Yijun Cao, Xianshi Zhang, Fuya Luo et al.

Recent visual odometry (VO) methods incorporating geometric algorithm into deep-learning architecture have shown outstanding performance on the challenging monocular VO task. Despite encouraging results are shown, previous methods ignore the requirement of generalization capability under noisy environment and various scenes. To address this challenging issue, this work first proposes a novel optical flow network (PANet). Compared with previous methods that predict optical flow as a direct regression task, our PANet computes optical flow by predicting it into the discrete position space with optical flow probability volume, and then converting it to optical flow. Next, we improve the bundle adjustment module to fit the self-supervised training pipeline by introducing multiple sampling, ego-motion initialization, dynamic damping factor adjustment, and Jacobi matrix weighting. In addition, a novel normalized photometric loss function is advanced to improve the depth estimation accuracy. The experiments show that the proposed system not only achieves comparable performance with other state-of-the-art self-supervised learning-based methods on the KITTI dataset, but also significantly improves the generalization capability compared with geometry-based, learning-based and hybrid VO systems on the noisy KITTI and the challenging outdoor (KAIST) scenes.

IVOct 6, 2021
TSN-CA: A Two-Stage Network with Channel Attention for Low-Light Image Enhancement

Xinxu Wei, Xianshi Zhang, Shisen Wang et al.

Low-light image enhancement is a challenging low-level computer vision task because after we enhance the brightness of the image, we have to deal with amplified noise, color distortion, detail loss, blurred edges, shadow blocks and halo artifacts. In this paper, we propose a Two-Stage Network with Channel Attention (denoted as TSN-CA) to enhance the brightness of the low-light image and restore the enhanced images from various kinds of degradation. In the first stage, we enhance the brightness of the low-light image in HSV space and use the information of H and S channels to help the recovery of details in V channel. In the second stage, we integrate Channel Attention (CA) mechanism into the skip connection of U-Net in order to restore the brightness-enhanced image from severe kinds of degradation in RGB space. We train and evaluate the performance of our proposed model on the LOL real-world and synthetic datasets. In addition, we test our model on several other commonly used datasets without Ground-Truth. We conduct extensive experiments to demonstrate that our method achieves excellent effect on brightness enhancement as well as denoising, details preservation and halo artifacts elimination. Our method outperforms many other state-of-the-art methods qualitatively and quantitatively.

IVOct 5, 2021
DA-DRN: Degradation-Aware Deep Retinex Network for Low-Light Image Enhancement

Xinxu Wei, Xianshi Zhang, Shisen Wang et al.

Images obtained in real-world low-light conditions are not only low in brightness, but they also suffer from many other types of degradation, such as color distortion, unknown noise, detail loss and halo artifacts. In this paper, we propose a Degradation-Aware Deep Retinex Network (denoted as DA-DRN) for low-light image enhancement and tackle the above degradation. Based on Retinex Theory, the decomposition net in our model can decompose low-light images into reflectance and illumination maps and deal with the degradation in the reflectance during the decomposition phase directly. We propose a Degradation-Aware Module (DA Module) which can guide the training process of the decomposer and enable the decomposer to be a restorer during the training phase without additional computational cost in the test phase. DA Module can achieve the purpose of noise removal while preserving detail information into the illumination map as well as tackle color distortion and halo artifacts. We introduce Perceptual Loss to train the enhancement network to generate the brightness-improved illumination maps which are more consistent with human visual perception. We train and evaluate the performance of our proposed model over the LOL real-world and LOL synthetic datasets, and we also test our model over several other frequently used datasets without Ground-Truth (LIME, DICM, MEF and NPE datasets). We conduct extensive experiments to demonstrate that our approach achieves a promising effect with good rubustness and generalization and outperforms many other state-of-the-art methods qualitatively and quantitatively. Our method only takes 7 ms to process an image with 600x400 resolution on a TITAN Xp GPU.

MLDec 7, 2015
Clustering by Deep Nearest Neighbor Descent (D-NND): A Density-based Parameter-Insensitive Clustering Method

Teng Qiu, Yongjie Li

Most density-based clustering methods largely rely on how well the underlying density is estimated. However, density estimation itself is also a challenging problem, especially the determination of the kernel bandwidth. A large bandwidth could lead to the over-smoothed density estimation in which the number of density peaks could be less than the true clusters, while a small bandwidth could lead to the under-smoothed density estimation in which spurious density peaks, or called the "ripple noise", would be generated in the estimated density. In this paper, we propose a density-based hierarchical clustering method, called the Deep Nearest Neighbor Descent (D-NND), which could learn the underlying density structure layer by layer and capture the cluster structure at the same time. The over-smoothed density estimation could be largely avoided and the negative effect of the under-estimated cases could be also largely reduced. Overall, D-NND presents not only the strong capability of discovering the underlying cluster structure but also the remarkable reliability due to its insensitivity to parameters.

MLSep 9, 2015
Clustering by Hierarchical Nearest Neighbor Descent (H-NND)

Teng Qiu, Yongjie Li

Previously in 2014, we proposed the Nearest Descent (ND) method, capable of generating an efficient Graph, called the in-tree (IT). Due to some beautiful and effective features, this IT structure proves well suited for data clustering. Although there exist some redundant edges in IT, they usually have salient features and thus it is not hard to remove them. Subsequently, in order to prevent the seemingly redundant edges from occurring, we proposed the Nearest Neighbor Descent (NND) by adding the "Neighborhood" constraint on ND. Consequently, clusters automatically emerged, without the additional requirement of removing the redundant edges. However, NND proved still not perfect, since it brought in a new yet worse problem, the "over-partitioning" problem. Now, in this paper, we propose a method, called the Hierarchical Nearest Neighbor Descent (H-NND), which overcomes the over-partitioning problem of NND via using the hierarchical strategy. Specifically, H-NND uses ND to effectively merge the over-segmented sub-graphs or clusters that NND produces. Like ND, H-NND also generates the IT structure, in which the redundant edges once again appear. This seemingly comes back to the situation that ND faces. However, compared with ND, the redundant edges in the IT structure generated by H-NND generally become more salient, thus being much easier and more reliable to be identified even by the simplest edge-removing method which takes the edge length as the only measure. In other words, the IT structure constructed by H-NND becomes more fitted for data clustering. We prove this on several clustering datasets of varying shapes, dimensions and attributes. Besides, compared with ND, H-NND generally takes less computation time to construct the IT data structure for the input data.

MLJul 29, 2015
IT-Dendrogram: A New Member of the In-Tree (IT) Clustering Family

Teng Qiu, Yongjie Li

Previously, we proposed a physically-inspired method to construct data points into an effective in-tree (IT) structure, in which the underlying cluster structure in the dataset is well revealed. Although there are some edges in the IT structure requiring to be removed, such undesired edges are generally distinguishable from other edges and thus are easy to be determined. For instance, when the IT structures for the 2-dimensional (2D) datasets are graphically presented, those undesired edges can be easily spotted and interactively determined. However, in practice, there are many datasets that do not lie in the 2D Euclidean space, thus their IT structures cannot be graphically presented. But if we can effectively map those IT structures into a visualized space in which the salient features of those undesired edges are preserved, then the undesired edges in the IT structures can still be visually determined in a visualization environment. Previously, this purpose was reached by our method called IT-map. The outstanding advantage of IT-map is that clusters can still be found even with the so-called crowding problem in the embedding. In this paper, we propose another method, called IT-Dendrogram, to achieve the same goal through an effective combination of the IT structure and the single link hierarchical clustering (SLHC) method. Like IT-map, IT-Dendrogram can also effectively represent the IT structures in a visualization environment, whereas using another form, called the Dendrogram. IT-Dendrogram can serve as another visualization method to determine the undesired edges in the IT structures and thus benefit the IT-based clustering analysis. This was demonstrated on several datasets with different shapes, dimensions, and attributes. Unlike IT-map, IT-Dendrogram can always avoid the crowding problem, which could help users make more reliable cluster analysis in certain problems.

CVJun 19, 2015
A general framework for the IT-based clustering methods

Teng Qiu, Yongjie Li

Previously, we proposed a physically inspired rule to organize the data points in a sparse yet effective structure, called the in-tree (IT) graph, which is able to capture a wide class of underlying cluster structures in the datasets, especially for the density-based datasets. Although there are some redundant edges or lines between clusters requiring to be removed by computer, this IT graph has a big advantage compared with the k-nearest-neighborhood (k-NN) or the minimal spanning tree (MST) graph, in that the redundant edges in the IT graph are much more distinguishable and thus can be easily determined by several methods previously proposed by us. In this paper, we propose a general framework to re-construct the IT graph, based on an initial neighborhood graph, such as the k-NN or MST, etc, and the corresponding graph distances. For this general framework, our previous way of constructing the IT graph turns out to be a special case of it. This general framework 1) can make the IT graph capture a wider class of underlying cluster structures in the datasets, especially for the manifolds, and 2) should be more effective to cluster the sparse or graph-based datasets.

MLFeb 17, 2015
Nonparametric Nearest Neighbor Descent Clustering based on Delaunay Triangulation

Teng Qiu, Yongjie Li

In our physically inspired in-tree (IT) based clustering algorithm and the series after it, there is only one free parameter involved in computing the potential value of each point. In this work, based on the Delaunay Triangulation or its dual Voronoi tessellation, we propose a nonparametric process to compute potential values by the local information. This computation, though nonparametric, is relatively very rough, and consequently, many local extreme points will be generated. However, unlike those gradient-based methods, our IT-based methods are generally insensitive to those local extremes. This positively demonstrates the superiority of these parametric (previous) and nonparametric (in this work) IT-based methods.

MLFeb 16, 2015
Clustering by Descending to the Nearest Neighbor in the Delaunay Graph Space

Teng Qiu, Yongjie Li

In our previous works, we proposed a physically-inspired rule to organize the data points into an in-tree (IT) structure, in which some undesired edges are allowed to occur. By removing those undesired or redundant edges, this IT structure is divided into several separate parts, each representing one cluster. In this work, we seek to prevent the undesired edges from arising at the source. Before using the physically-inspired rule, data points are at first organized into a proximity graph which restricts each point to select the optimal directed neighbor just among its neighbors. Consequently, separated in-trees or clusters automatically arise, without redundant edges requiring to be removed.

MLJan 26, 2015
IT-map: an Effective Nonlinear Dimensionality Reduction Method for Interactive Clustering

Teng Qiu, Yongjie Li

Scientists in many fields have the common and basic need of dimensionality reduction: visualizing the underlying structure of the massive multivariate data in a low-dimensional space. However, many dimensionality reduction methods confront the so-called "crowding problem" that clusters tend to overlap with each other in the embedding. Previously, researchers expect to avoid that problem and seek to make clusters maximally separated in the embedding. However, the proposed in-tree (IT) based method, called IT-map, allows clusters in the embedding to be locally overlapped, while seeking to make them distinguishable by some small yet key parts. IT-map provides a simple, effective and novel solution to cluster-preserving mapping, which makes it possible to cluster the original data points interactively and thus should be of general meaning in science and engineering.

LGJan 18, 2015
Clustering based on the In-tree Graph Structure and Affinity Propagation

Teng Qiu, Yongjie Li

A recently proposed clustering method, called the Nearest Descent (ND), can organize the whole dataset into a sparsely connected graph, called the In-tree. This ND-based Intree structure proves able to reveal the clustering structure underlying the dataset, except one imperfect place, that is, there are some undesired edges in this In-tree which require to be removed. Here, we propose an effective way to automatically remove the undesired edges in In-tree via an effective combination of the In-tree structure with affinity propagation (AP). The key for the combination is to add edges between the reachable nodes in In-tree before using AP to remove the undesired edges. The experiments on both synthetic and real datasets demonstrate the effectiveness of the proposed method.

LGDec 24, 2014
An Effective Semi-supervised Divisive Clustering Algorithm

Teng Qiu, Yongjie Li

Nowadays, data are generated massively and rapidly from scientific fields as bioinformatics, neuroscience and astronomy to business and engineering fields. Cluster analysis, as one of the major data analysis tools, is therefore more significant than ever. We propose in this work an effective Semi-supervised Divisive Clustering algorithm (SDC). Data points are first organized by a minimal spanning tree. Next, this tree structure is transitioned to the in-tree structure, and then divided into sub-trees under the supervision of the labeled data, and in the end, all points in the sub-trees are directly associated with specific cluster centers. SDC is fully automatic, non-iterative, involving no free parameter, insensitive to noise, able to detect irregularly shaped cluster structures, applicable to the data sets of high dimensionality and different attributes. The power of SDC is demonstrated on several datasets.

LGDec 7, 2014
Nearest Descent, In-Tree, and Clustering

Teng Qiu, Kaifu Yang, Chaoyi Li et al.

In this paper, we propose a physically inspired graph-theoretical clustering method, which first makes the data points organized into an attractive graph, called In-Tree, via a physically inspired rule, called Nearest Descent (ND). In particular, the rule of ND works to select the nearest node in the descending direction of potential as the parent node of each node, which is in essence different from the classical Gradient Descent or Steepest Descent. The constructed In-Tree proves a very good candidate for clustering due to its particular features and properties. In the In-Tree, the original clustering problem is reduced to a problem of removing a very few of undesired edges from this graph. Pleasingly, the undesired edges in In-Tree are so distinguishable that they can be easily determined in either automatic or interactive way, which is in stark contrast to the cases in the widely used Minimal Spanning Tree and k-nearest-neighbor graph. The cluster number in the proposed method can be easily determined based on some intermediate plots, and the cluster assignment for each node is easily made by quickly searching its root node in each sub-graph (also an In-Tree). The proposed method is extensively evaluated on both synthetic and real-world datasets. Overall, the proposed clustering method is a density-based one, but shows significant differences and advantages in comparison to the traditional ones. The proposed method is simple yet efficient and reliable, and is applicable to various datasets with diverse shapes, attributes and any high dimensionality