Wan-Lei Zhao

h-index: 4

19 papers

130 citations

Novelty: 54%

AI Score: 36

Ranked #120,520 of 201,018 authors (top 60%); #38,477 in CV (top 65%)

19 Papers

CV · Mar 10, 2022

Online Deep Metric Learning via Mutual Distillation

Gao-Dong Liu, Wan-Lei Zhao, Jie Zhao

Deep metric learning aims to transform input data into an embedding space where similar samples are close together and dissimilar samples are far apart. In practice, samples of new categories arrive incrementally, which requires the learned model to be augmented periodically. Fine-tuning on the new categories usually leads to poor performance on the old ones, which is known as "catastrophic forgetting". Existing solutions either retrain the model from scratch or require the replay of old samples during training. In this paper, a complete online deep metric learning framework based on mutual distillation is proposed for both one-task and multi-task scenarios. Unlike the teacher-student framework, the proposed approach treats the old and new learning tasks with equal importance, so neither old nor new knowledge is favored. In addition, a novel virtual feature estimation approach is proposed to recover the features assumed to be extracted by the old models. It allows distillation between the new and old models without replaying old training samples or holding old models during training. A comprehensive study shows the superior performance of our approach with different backbones.

CV · Apr 19, 2022

Shape-Aware Monocular 3D Object Detection

Wei Chen, Jie Zhao, Wan-Lei Zhao et al.

The detection of 3D objects through a single perspective camera is a challenging issue. Anchor-free and keypoint-based models have received increasing attention recently due to their effectiveness and simplicity. However, most of these methods are vulnerable to occluded and truncated objects. In this paper, a single-stage monocular 3D object detection model is proposed. An instance-segmentation head is integrated into the model training, which allows the model to be aware of the visible shape of a target object. The detection largely avoids interference from irrelevant regions surrounding the target objects. In addition, we reveal that the popular IoU-based evaluation metrics, which were originally designed for evaluating stereo or LiDAR-based detection methods, are insensitive to the improvement of monocular 3D object detection algorithms. A novel evaluation metric, namely average depth similarity (ADS), is proposed for monocular 3D object detection models. Our method outperforms the baseline on both the popular and the proposed evaluation metrics while maintaining real-time efficiency.

CV · Jun 20, 2025

Class Agnostic Instance-level Descriptor for Visual Instance Search

Qi-Ying Sun, Wan-Lei Zhao, Hui-Ying Xie et al.

Despite the great success of deep features in content-based image retrieval, visual instance search remains challenging due to the lack of effective instance-level feature representation. Supervised or weakly supervised object detection methods are not appropriate solutions due to their poor performance on unknown object categories. In this paper, based on the feature set output from a self-supervised ViT, instance-level region discovery is modeled as detecting compact feature subsets in a hierarchical fashion. The hierarchical decomposition results in a hierarchy of instance regions. On the one hand, this kind of hierarchical decomposition addresses the problems of object embedding and occlusion, which are widely observed in real scenarios. On the other hand, the non-leaf nodes and leaf nodes of the hierarchy correspond to instance regions at different granularities within an image. Therefore, features of uniform length are produced for these instance regions, which may cover a dominant image region, a group of multiple instances, or various individual instances. Such a collection of features allows us to unify image retrieval, multi-instance search, and instance search into one framework. Empirical studies on three benchmarks show that such an instance-level descriptor remains effective on both known and unknown object categories. Moreover, superior performance is achieved on single-instance and multi-instance search, as well as on image retrieval tasks.

LG · Aug 27, 2021

Anomaly Detection on IT Operation Series via Online Matrix Profile

Shi-Ying Lan, Run-Qing Chen, Wan-Lei Zhao

Anomaly detection on time series is a fundamental task in monitoring the Key Performance Indicators (KPIs) of IT systems. Many existing approaches in the literature perform well but require substantial training resources. In this paper, the online matrix profile, which requires no training, is proposed to address this issue. Anomalies are detected by referring to the past subsequence that is closest to the current one. The distance significance is introduced based on the online matrix profile, which demonstrates a prominent pattern when an anomaly occurs. Another training-free approach, spectral residual, is integrated into our approach to further enhance the detection accuracy. Moreover, the proposed approach is sped up by at least four times on long time series by the introduced cache strategy. Compared with existing approaches, the online matrix profile makes a good trade-off between accuracy and efficiency. More importantly, it is generic to various types of time series in the sense that it works without the constraints of any trained model.
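
As a rough illustration of the nearest-past-subsequence idea in this abstract, the sketch below scores each window by its distance to the closest earlier, non-overlapping window. It is a naive quadratic-time sketch, not the authors' online matrix profile: the function name `nearest_past_distance`, the window length `m`, and the use of raw (non-normalized) Euclidean distance are assumptions made here, and the distance significance, spectral residual, and cache strategy are left out.

```python
import math


def nearest_past_distance(series, m):
    """Score each length-m window by its distance to the closest past window.

    Hypothetical helper for illustration only. Returns one value per window;
    windows with no non-overlapping predecessor get math.inf.
    """
    windows = [series[i:i + m] for i in range(len(series) - m + 1)]
    profile = []
    for i, current in enumerate(windows):
        best = math.inf
        # Candidates are windows that end before the current one starts.
        for j in range(0, i - m + 1):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(current, windows[j])))
            best = min(best, d)
        profile.append(best)
    return profile


# A periodic signal with one spike: windows touching the spike have no close
# match in the past, so their scores stand out.
signal = [0, 1, 0, 1, 0, 1, 0, 1, 5, 1, 0, 1]
scores = nearest_past_distance(signal, m=2)
```

A large score means the current window looks unlike anything seen before, which is the cue this kind of training-free detector relies on.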

CV · Jul 11, 2021

Towards Accurate Localization by Instance Search

Yi-Geng Hong, Hui-Chu Xiao, Wan-Lei Zhao

Visual object localization is the key step in a series of object detection tasks. In the literature, high localization accuracy is achieved with the mainstream strongly supervised frameworks. However, such methods require object-level annotations and are unable to detect objects of unknown categories. Weakly supervised methods face similar difficulties. In this paper, a self-paced learning framework is proposed to achieve accurate object localization on the rank list returned by instance search. The proposed framework mines the target instance gradually from the queries and their corresponding top-ranked search results. Since a common instance is shared between the query and the images in the rank list, the target visual instance can be accurately localized even without knowing what the object category is. In addition to performing localization on instance search, the issue of few-shot object detection is also addressed under the same framework. Superior performance over state-of-the-art methods is observed on both tasks.

DC · Mar 29, 2021

Large-Scale Approximate k-NN Graph Construction on GPU

Hui Wang, Wan-Lei Zhao, Xiangxiang Zeng

The k-nearest neighbor graph is a key data structure in many disciplines such as manifold learning, machine learning, and information retrieval. NN-Descent was proposed as an effective solution for the graph construction problem. However, it cannot be directly ported to the GPU due to the intensive memory accesses required by the approach. In this paper, NN-Descent has been redesigned to adapt to the GPU architecture. In particular, the number of memory accesses has been reduced significantly. The redesign fully exploits the parallelism of the GPU hardware. In the meantime, the genericness as well as the simplicity of NN-Descent are well preserved. In addition, a simple but effective k-NN graph merge approach is presented. It allows two graphs to be merged efficiently on GPUs. More importantly, it makes the construction of high-quality k-NN graphs for out-of-GPU-memory datasets tractable. The results show that our approach is 100-250x faster than single-thread NN-Descent and 2.5-5x faster than existing GPU-based approaches.

LG · May 19, 2020

k-sums: another side of k-means

Wan-Lei Zhao, Run-Qing Chen, Hui Ye et al.

In this paper, the decades-old clustering method k-means is revisited. The original distortion minimization model of k-means is addressed by a pure stochastic minimization procedure. In each step of the iteration, one sample is tentatively reallocated from one cluster to another. It is moved to another cluster as long as the reallocation allows the sample to be closer to the new centroid. This optimization procedure converges faster and to a better local minimum than k-means and many of its variants. This fundamental modification of the k-means loop leads to the redefinition of a family of k-means variants. Moreover, a new target function that minimizes the summation of pairwise distances within clusters is presented. We show that it can be solved under the same stochastic optimization procedure. This minimization procedure, built upon two minimization models, outperforms k-means and its variants considerably under different settings and on different datasets.
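
The one-sample-at-a-time reallocation described here can be sketched as follows. This is a simplified reading of the abstract, not the paper's procedure: the function name `stochastic_reallocation`, the round-robin initial partition, the "move if strictly closer to another centroid" rule, and the guard that never empties a cluster are all assumptions for illustration. The pairwise-distance target function is not covered.

```python
import random


def stochastic_reallocation(points, k, sweeps=20, seed=0):
    """Cluster points by moving one sample at a time (illustrative sketch).

    Assumes len(points) >= k so the initial partition has no empty cluster.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    labels = [i % k for i in range(len(points))]  # arbitrary initial partition
    sums = [[0.0] * dim for _ in range(k)]        # per-cluster coordinate sums
    sizes = [0] * k
    for p, c in zip(points, labels):
        sizes[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]

    def sq_dist(p, c):
        # Squared distance from p to the current centroid of cluster c.
        return sum((p[d] - sums[c][d] / sizes[c]) ** 2 for d in range(dim))

    order = list(range(len(points)))
    for _ in range(sweeps):
        rng.shuffle(order)  # visit samples in random order
        moved = False
        for i in order:
            src = labels[i]
            if sizes[src] == 1:
                continue  # assumption: never empty a cluster
            p = points[i]
            dst = min(range(k), key=lambda c: sq_dist(p, c))
            if dst != src and sq_dist(p, dst) < sq_dist(p, src):
                # Move the sample and update both centroids immediately.
                for d in range(dim):
                    sums[src][d] -= p[d]
                    sums[dst][d] += p[d]
                sizes[src] -= 1
                sizes[dst] += 1
                labels[i] = dst
                moved = True
        if not moved:
            break  # no sample wants to move: a local minimum of this rule
    return labels


pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = stochastic_reallocation(pts, k=2)
```

Unlike the classic batch loop, centroids here change after every single move, so there is no separate assignment pass followed by an update pass.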

CV · Apr 24, 2020

Dynamic Sampling for Deep Metric Learning

Chang-Hui Liang, Wan-Lei Zhao, Run-Qing Chen

Deep metric learning maps visually similar images onto nearby locations and visually dissimilar images far apart from each other in an embedding manifold. The learning process is mainly based on the supplied negative and positive training image pairs. In this paper, a dynamic sampling strategy is proposed to organize the training pairs in an easy-to-hard order to feed into the network. It allows the network to learn general boundaries between categories from the easy training pairs at its early stages and to finalize the details of the model by relying mainly on the hard training samples in the later stages. Compared to existing training sample mining approaches, the hard samples are mined with little harm to the learned general model. This dynamic sampling strategy is formulated as two simple terms that are compatible with various loss functions. A consistent performance boost is observed when it is integrated with several popular loss functions on fashion search, fine-grained classification, and person re-identification tasks.

CV · Feb 1, 2020

Deeply Activated Salient Region for Instance Search

Hui-Chu Xiao, Wan-Lei Zhao, Jie Lin et al.

The performance of instance search depends heavily on the ability to locate and describe a wide variety of object instances in a video/image collection. Due to the lack of a proper mechanism for locating instances and deriving feature representations, instance search is generally only effective for retrieving instances of known object categories. In this paper, a simple but effective instance-level feature representation is presented. Different from other approaches, the issues of class-agnostic instance localization and distinctive feature representation are both considered. The former is achieved by detecting salient instance regions in an image through a layer-wise back-propagation process. The back-propagation starts from the last convolution layer of a pre-trained CNN that was originally used for classification, and proceeds layer by layer until it reaches the input layer. This allows the salient instance regions in the input image, from both known and unknown categories, to be activated. Each activated salient region covers the full or, more usually, a major part of an instance. The distinctive feature representation is produced by average-pooling the feature map of a certain layer within the detected instance region. Experiments show that this kind of feature representation performs considerably better than most existing approaches. In addition, we show that the proposed feature descriptor is also suitable for content-based image search.
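
The final pooling step mentioned in this abstract, averaging a feature map over a detected region, can be sketched as below. The function name `masked_average_pool`, the nested-list `H x W x C` layout, and the binary mask are assumptions for illustration; the salient-region detection by back-propagation is not shown, and the mask is taken as given.

```python
def masked_average_pool(feature_map, mask):
    """Average the feature vectors at the locations selected by a binary mask.

    feature_map: H x W x C nested lists; mask: H x W nested lists of 0/1.
    Hypothetical helper; returns a length-C descriptor for the region.
    """
    channels = len(feature_map[0][0])
    total = [0.0] * channels
    count = 0
    for feature_row, mask_row in zip(feature_map, mask):
        for vector, inside in zip(feature_row, mask_row):
            if inside:
                count += 1
                for c in range(channels):
                    total[c] += vector[c]
    if count == 0:
        raise ValueError("mask selects no locations")
    return [t / count for t in total]


fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[5.0, 6.0], [7.0, 8.0]]]
region = [[1, 0],
          [0, 1]]
descriptor = masked_average_pool(fmap, region)  # [4.0, 5.0]
```

Because the output length equals the channel count, regions of any size or shape yield descriptors of the same length.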

LG · Oct 9, 2019

A Joint Model for IT Operation Series Prediction and Anomaly Detection

Run-Qing Chen, Guang-Hui Shi, Wan-Lei Zhao et al.

Status prediction and anomaly detection are two fundamental tasks in automatic IT systems monitoring. In this paper, a joint model, Predictor & Anomaly Detector (PAD), is proposed to address these two issues under one framework. In our design, the variational auto-encoder (VAE) and long short-term memory (LSTM) are joined together. The prediction block (LSTM) takes clean input from the time series reconstructed by the VAE, which makes it robust to anomalies and noise in the prediction task. In the meantime, the LSTM block maintains the long-term sequential patterns, which are out of the sight of a VAE encoding window. This leads to better performance of the VAE in anomaly detection than when it is trained alone. In the whole processing pipeline, spectral residual analysis is integrated with the VAE and LSTM to boost the performance of both. The superior performance on the two tasks is confirmed by experiments on two challenging evaluation benchmarks.

IR · Aug 2, 2019

On the Merge of k-NN Graph

Wan-Lei Zhao, Hui Wang, Peng-Cheng Lin et al.

The k-nearest neighbor graph is a fundamental data structure in many disciplines such as information retrieval, data mining, pattern recognition, and machine learning. In the literature, considerable research has focused on how to efficiently build an approximate k-nearest neighbor graph (k-NN graph) for a fixed dataset. Unfortunately, the closely related issue of how to merge two existing k-NN graphs has been overlooked. In this paper, we address the issue of k-NN graph merging in two different scenarios. In the first scenario, a symmetric merge algorithm is proposed to combine two approximate k-NN graphs. The algorithm facilitates large-scale processing by efficiently merging k-NN graphs that are produced in parallel. In the second scenario, a joint merge algorithm is proposed to expand an existing k-NN graph with a raw dataset. The algorithm enables the incremental construction of a hierarchical approximate k-NN graph. Superior performance is attained when leveraging the hierarchy for NN search across various data types, dimensionalities, and distance measures.

IR · Apr 3, 2019

Graph based Nearest Neighbor Search: Promises and Failures

Peng-Cheng Lin, Wan-Lei Zhao

Recently, graph-based nearest neighbor search has become increasingly popular for large-scale retrieval tasks. The attractiveness of this type of approach lies in its superior performance over most known nearest neighbor search approaches as well as its genericness to various metrics. In this paper, the role of two strategies that are adopted as key steps in graph-based approaches, namely hierarchical structure and graph diversification, is investigated. We find that the hierarchical structure does not achieve the "much better logarithmic complexity scaling" claimed in the original paper, particularly in high-dimensional cases. Moreover, we find that search speed similar to that obtained with the hierarchical structure can be achieved with a flat k-NN graph after graph diversification. Finally, we point out that the difficulty faced by most graph-based search approaches is directly linked to the "curse of dimensionality".

CV · Jun 10, 2018

Instance Search via Instance Level Segmentation and Feature Representation

Yu Zhan, Wan-Lei Zhao

Instance search is an interesting task as well as a challenging issue due to the lack of effective feature representation. In this paper, an instance-level feature representation built upon fully convolutional instance-aware segmentation is proposed. The feature is ROI-pooled from the segmented instance region, so that instances of various sizes and layouts are represented by deep features of uniform length. This representation is further enhanced by the use of deformable ResNeXt blocks. Superior performance is observed in terms of its distinctiveness and scalability on a challenging evaluation dataset built by ourselves. In addition, the proposed enhancement to the network structure also shows superior performance on the instance segmentation task.

CV · Apr 12, 2018

Clustering via Boundary Erosion

Cheng-Hao Deng, Wan-Lei Zhao

Clustering analysis groups samples based on either their mutual closeness or homogeneity. In order to detect clusters of arbitrary shapes, a novel and generic solution based on boundary erosion is proposed. The clusters are assumed to be separated by relatively sparse regions. The samples are eroded sequentially according to their dynamic boundary densities. The erosion starts from low-density regions and proceeds inwards until all the samples are eroded. In this manner, the boundaries between different clusters become more and more apparent. It therefore offers a natural and powerful way to separate clusters when the boundaries between them are hard to draw at once. Following the sequential order in which samples are eroded, sequential boundary levels are produced, from which clusters of arbitrary shapes are automatically reconstructed. As demonstrated across various clustering tasks, it outperforms most state-of-the-art algorithms, and its performance is nearly perfect in some scenarios.

IR · Apr 9, 2018

Approximate k-NN Graph Construction: a Generic Online Approach

Wan-Lei Zhao, Hui Wang, Chong-Wah Ngo

Nearest neighbor search and k-nearest neighbor graph construction are two fundamental issues that arise in many disciplines such as multimedia information retrieval, data mining, and machine learning. They have become more and more pressing given the emergence of big data in various fields in recent years. In this paper, a simple but effective solution for both approximate k-nearest neighbor search and approximate k-nearest neighbor graph construction is presented. These two issues are addressed jointly in our solution. On the one hand, approximate k-nearest neighbor graph construction is treated as a search task. Each sample, along with its k-nearest neighbors, is joined into the k-nearest neighbor graph by performing the nearest neighbor search sequentially on the graph under construction. On the other hand, the built k-nearest neighbor graph is used to support k-nearest neighbor search. Since the graph is built online, dynamic updates on the graph, which are not possible with most existing solutions, are supported. This solution is feasible for various distance measures. Its effectiveness for both k-nearest neighbor graph construction and k-nearest neighbor search is verified across different types of data at different scales, in various dimensions, and under different metrics.

LG · May 4, 2017

Fast k-means based on KNN Graph

Cheng-Hao Deng, Wan-Lei Zhao

In the era of big data, k-means clustering has been widely adopted as a basic processing tool in various contexts. However, its computational cost can be prohibitively high when the data size and the number of clusters are large. It is well known that the processing bottleneck of k-means lies in the operation of seeking the closest centroid in each iteration. In this paper, a novel solution to the scalability issue of k-means is presented. In the proposal, k-means is supported by an approximate k-nearest neighbor graph. In the k-means iteration, each data sample is only compared to the clusters in which its nearest neighbors reside. Since the number of nearest neighbors we consider is much smaller than k, the processing cost of this step becomes minor and independent of k. The processing bottleneck is therefore overcome. Most interestingly, the k-nearest neighbor graph is constructed by iteratively calling the fast k-means itself. Compared with existing fast k-means variants, the proposed algorithm achieves a speed-up of hundreds to thousands of times while maintaining high clustering quality. When tested on 10 million 512-dimensional data points, it takes only 5.2 hours to produce 1 million clusters. In contrast, clustering at the same scale would take 3 years with traditional k-means.
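
The restricted comparison described in this abstract, where a sample is only compared to the clusters of its graph neighbors, can be sketched as one assignment pass. The function name `graph_restricted_assign`, the adjacency-list form of `knn`, and the choice to keep the sample's current cluster among the candidates are assumptions for illustration. Building the graph by repeated clustering, as the abstract mentions, is not shown.

```python
def graph_restricted_assign(points, labels, centroids, knn):
    """One assignment pass that only checks clusters of graph neighbors.

    points: list of coordinate tuples; labels: current cluster id per point;
    centroids: list of coordinate tuples; knn: neighbor index lists per point.
    Hypothetical helper for illustration.
    """
    new_labels = []
    for i, p in enumerate(points):
        # Candidate clusters: the sample's own cluster plus its neighbors' clusters.
        candidates = sorted({labels[i]} | {labels[j] for j in knn[i]})
        best = min(
            candidates,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        new_labels.append(best)
    return new_labels


pts = [(0.0,), (1.0,), (9.0,), (10.0,)]
current = [0, 1, 1, 1]               # point 1 is currently misassigned
cents = [(0.5,), (9.5,)]
graph = [[1], [0], [3], [2]]         # each point's single nearest neighbor
updated = graph_restricted_assign(pts, current, cents, graph)  # [0, 0, 1, 1]
```

The cost per sample depends on the number of neighbors inspected rather than on the total number of clusters, which is the point the abstract makes about the bottleneck.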

CV · Apr 28, 2017

Object Discovery via Cohesion Measurement

Guanjun Guo, Hanzi Wang, Wan-Lei Zhao et al.

Color and intensity are two important components in an image. Usually, groups of image pixels, which are similar in color or intensity, are an informative representation for an object. They are therefore particularly suitable for computer vision tasks, such as saliency detection and object proposal generation. However, image pixels, which share a similar real-world color, may be quite different since colors are often distorted by intensity. In this paper, we reinvestigate the affinity matrices originally used in image segmentation methods based on spectral clustering. A new affinity matrix, which is robust to color distortions, is formulated for object discovery. Moreover, a Cohesion Measurement (CM) for object regions is also derived based on the formulated affinity matrix. Based on the new Cohesion Measurement, a novel object discovery method is proposed to discover objects latent in an image by utilizing the eigenvectors of the affinity matrix. Then we apply the proposed method to both saliency detection and object proposal generation. Experimental results on several evaluation benchmarks demonstrate that the proposed CM based method has achieved promising performance for these two tasks.

CV · Jan 30, 2017

Scalable Nearest Neighbor Search based on kNN Graph

Wan-Lei Zhao, Jie Yang, Cheng-Hao Deng

Nearest neighbor search is known as a challenging issue that has been studied for several decades. Recently, this issue has become more and more pressing as big data problems arise in various fields. In this paper, a scalable solution based on a hill-climbing strategy with the support of a k-nearest neighbor graph (kNN graph) is presented. Two major issues are considered in the paper. Firstly, an efficient kNN graph construction method based on a two-means tree is presented. Secondly, for the nearest neighbor search, an enhanced hill-climbing procedure is proposed, which shows a considerable performance boost over the original procedure. Furthermore, with the support of inverted indexing derived from residue vector quantization, our method achieves close to 100% recall with high speed efficiency on two state-of-the-art evaluation benchmarks. In addition, a comparative study of both compressional and traditional nearest neighbor search methods is presented. We show that our method achieves the best trade-off between search quality, efficiency, and memory complexity.
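
For readers unfamiliar with hill-climbing on a neighbor graph, the basic procedure can be sketched as a greedy walk that keeps moving to whichever neighbor is closest to the query. This is the plain textbook form, not the enhanced procedure of the paper: the function name `hill_climb_search`, the adjacency-list form of `knn`, and the single fixed start node are assumptions, and the inverted indexing is not shown.

```python
def hill_climb_search(points, knn, query, start):
    """Greedy walk on a neighbor graph toward the query (illustrative sketch).

    Returns the index of a local minimum: a node none of whose neighbors
    is closer to the query than the node itself.
    """
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(points[i], query))

    current = start
    while True:
        # Closest neighbor of the current node; falls back to itself if isolated.
        best = min(knn[current], key=sq_dist, default=current)
        if sq_dist(best) >= sq_dist(current):
            return current  # no neighbor improves on the current node
        current = best


pts = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
graph = [[1], [0, 2], [1, 3], [2, 4], [3]]   # a simple chain of neighbors
found = hill_climb_search(pts, graph, query=(3.2,), start=0)  # 3
```

The walk can stop at a local minimum that is not the true nearest neighbor, which is why practical systems typically add multiple start points or a candidate queue.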

LG · Oct 8, 2016

Boost K-Means

Wan-Lei Zhao, Cheng-Hao Deng, Chong-Wah Ngo

Due to its simplicity and versatility, k-means has remained popular since it was proposed three decades ago. The performance of k-means has been enhanced from different perspectives over the years. Unfortunately, a good trade-off between quality and efficiency is hardly reached. In this paper, a novel k-means variant is presented. Different from most k-means variants, the clustering procedure is driven by an explicit objective function, which is feasible for the whole l2-space. The classic chicken-and-egg loop in k-means has been simplified to a pure stochastic optimization procedure. The procedure of k-means becomes simpler and converges to a considerably better local optimum. The effectiveness of this new variant has been studied extensively in different contexts, such as document clustering, nearest neighbor search, and image clustering. Superior performance is observed across different scenarios.