7.2 | NA | May 20
Persistent-Homology-Guided Topology Scanning of Qualitative Indicators for Acoustic Inverse Scattering
Xiaomei Yang, Jiaying Jia, Zhiliang Deng
Qualitative methods such as the linear sampling method and the factorization method reconstruct acoustic scatterers through sampling indicators. In practice, these indicators are gray-scale fields on a prescribed sampling window and a binary obstacle shape is obtained only after thresholding. The choice of threshold is usually empirical and may be unstable when the indicator contains noise-induced artifacts or when the scatterer has nontrivial topology, such as multiple components or holes. This paper proposes a topology-aware postprocessing framework based on persistent homology. Given any normalized qualitative indicator, we scan the persistent homology of its superlevel sets and use the resulting zero- and one-dimensional persistent features to estimate or impose the topology of the unknown scatterer. A topology-guided threshold is then selected by minimizing a Betti-number discrepancy together with mild geometric penalties. The method is indicator-agnostic: it can be applied to the linear sampling indicator, the factorization-method indicator, or a normalized fusion of indicators. The main formulation is single-frequency and therefore remains close to the classical qualitative inverse scattering setting. We present the mathematical construction, an automatic topology detection rule based on persistence lifetimes and lifetime gaps, and a detailed algorithmic protocol for numerical implementation. Numerical tests verify that the proposed method is effective.
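The threshold scan described in this abstract can be illustrated with a simplified sketch. It is not the authors' implementation: instead of a full persistent-homology computation, it counts Betti numbers directly at each candidate threshold on a 2-D pixel grid, and it omits the geometric penalties. The function names (`count_components`, `betti_numbers`, `select_threshold`) and the target Betti numbers are assumptions made for illustration.

```python
from collections import deque

def count_components(mask, value, connectivity):
    """Count connected components of cells equal to `value` in a 2-D boolean grid."""
    rows, cols = len(mask), len(mask[0])
    if connectivity == 8:
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    else:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != value or seen[r][c]:
                continue
            count += 1  # new component found; flood-fill it
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and mask[nr][nc] == value):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return count

def betti_numbers(indicator, threshold):
    """Return (b0, b1) of the superlevel set {indicator >= threshold}."""
    cols = len(indicator[0])
    # Pad with background so the exterior forms exactly one background component.
    mask = [[False] * (cols + 2)]
    for row in indicator:
        mask.append([False] + [v >= threshold for v in row] + [False])
    mask.append([False] * (cols + 2))
    b0 = count_components(mask, True, 8)        # foreground: 8-connectivity
    b1 = count_components(mask, False, 4) - 1   # holes: background components minus exterior
    return b0, b1

def select_threshold(indicator, target_b0, target_b1, thresholds):
    """Pick the first threshold minimizing the Betti-number discrepancy (no geometric penalty)."""
    best = None
    for t in thresholds:
        b0, b1 = betti_numbers(indicator, t)
        cost = abs(b0 - target_b0) + abs(b1 - target_b1)
        if best is None or cost < best[0]:
            best = (cost, t, b0, b1)
    return best  # (cost, threshold, b0, b1)
```

Pairing 8-connectivity for the foreground with 4-connectivity for the background is the usual convention that keeps the component and hole counts consistent on a pixel grid. A persistent-homology library would additionally give the lifetimes of each feature, which the abstract uses for automatic topology detection.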
CV | Nov 13, 2025
CLIP4VI-ReID: Learning Modality-shared Representations via CLIP Semantic Bridge for Visible-Infrared Person Re-identification
Xiaomei Yang, Xizhan Gao, Sijie Niu et al.
This paper proposes a novel CLIP-driven modality-shared representation learning network, CLIP4VI-ReID, for the visible-infrared person re-identification (VI-ReID) task. It consists of three components: Text Semantic Generation (TSG), Infrared Feature Embedding (IFE), and High-level Semantic Alignment (HSA). Specifically, considering the large gap in physical characteristics between natural images and infrared images, the TSG generates text semantics only for visible images, thereby enabling preliminary visible-text modality alignment. Then, the IFE rectifies the feature embeddings of infrared images using the generated text semantics. This process injects identity-related semantics into the shared image encoder, enhancing its adaptability to the infrared modality. In addition, with text serving as a bridge, it enables indirect visible-infrared modality alignment. Finally, the HSA refines the high-level semantic alignment. This process ensures that the fine-tuned text semantics contain only identity-related information, thereby achieving more accurate cross-modal alignment and enhancing the discriminability of the learned modality-shared representations. Extensive experimental results demonstrate that the proposed CLIP4VI-ReID outperforms other state-of-the-art methods on several widely used VI-ReID datasets.
NA | Aug 3, 2018
Q-Hermite polynomials chaos approximation of likelihood function based on q-Gaussian prior in Bayesian inversion
Zhiliang Deng, Xiaomei Yang
In practical applications, constructing the prior and accelerating posterior sampling are usually the two key concerns of a Bayesian inversion algorithm. In this paper, the q-Gaussian distribution, a q-analogue of the Gaussian distribution, is introduced as the prior for inverse problems, and an acceleration algorithm based on spectral likelihood approximation is discussed. We mainly focus on the convergence of the posterior distribution in the sense of Kullback-Leibler divergence when an approximated likelihood function and a truncated prior distribution are used. Moreover, convergence in the total variation and Hellinger metrics is obtained. Finally, two numerical examples are presented.
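For reference, the three notions of convergence named in this abstract can be written out as follows. These are the standard textbook definitions, not formulas quoted from the paper; the symbols $p$ and $q$ (densities of two probability measures with respect to a common dominating measure) and the normalization constants are assumptions, and the paper may use a different normalization.

```latex
\begin{align*}
D_{\mathrm{KL}}(p \,\|\, q) &= \int p(x)\,\log\frac{p(x)}{q(x)}\,dx, \\
d_{\mathrm{TV}}(p, q) &= \frac{1}{2}\int \lvert p(x) - q(x)\rvert\,dx, \\
d_{\mathrm{H}}(p, q)^2 &= \frac{1}{2}\int \left(\sqrt{p(x)} - \sqrt{q(x)}\right)^2 dx.
\end{align*}
% With these normalizations the distances are related by
\begin{align*}
d_{\mathrm{H}}(p, q)^2 \;\le\; d_{\mathrm{TV}}(p, q) \;\le\; \sqrt{2}\, d_{\mathrm{H}}(p, q),
\qquad
d_{\mathrm{TV}}(p, q) \;\le\; \sqrt{\tfrac{1}{2} D_{\mathrm{KL}}(p \,\|\, q)},
\qquad
d_{\mathrm{H}}(p, q)^2 \;\le\; \tfrac{1}{2} D_{\mathrm{KL}}(p \,\|\, q).
\end{align*}
```

Under these conventions, convergence in Kullback-Leibler divergence implies convergence in both the total variation and Hellinger metrics, which is consistent with the order in which the abstract states its results.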
LG | Jul 10, 2019
Multi-layer Attention Mechanism for Speech Keyword Recognition
Ruisen Luo, Tianran Sun, Chen Wang et al.
As an important part of speech recognition technology, automatic speech keyword recognition has been intensively studied in recent years. Such technology becomes especially pivotal in situations with limited infrastructure and computational resources, such as voice command recognition in vehicles and robot interaction. At present, the mainstream methods in automatic speech keyword recognition are based on long short-term memory (LSTM) networks with an attention mechanism. However, because information is inevitably lost during feature extraction before it reaches the LSTM layer, the calculated attention weights are biased. In this paper, a novel approach, the Multi-layer Attention Mechanism, is proposed to handle the problem of inaccurate attention weights. The key idea is that, in addition to the conventional attention mechanism, information from the layers preceding feature extraction and the LSTM is introduced into the attention weight calculation. The attention weights therefore become more accurate, because the overall model can attend to more precise and focused areas. We conduct a comprehensive comparison and analysis of keyword spotting performance for a convolutional neural network, a bi-directional LSTM recurrent neural network, and a recurrent neural network with the proposed attention mechanism on the Google Speech Commands V2 dataset. Experimental results are favorable for the proposed method and demonstrate its validity. The proposed multi-layer attention method may also be useful for other research related to object spotting.
LG | Mar 1, 2018
Scalar Quantization as Sparse Least Square Optimization
Chen Wang, Xiaomei Yang, Shaomin Fei et al.
Quantization can be used to form new vectors/matrices with shared values close to the original. In recent years, scalar quantization has become increasingly popular for value-sharing applications because it has proven highly useful in reducing the complexity of neural networks. Existing clustering-based quantization techniques, while well developed, have multiple drawbacks, including dependence on the random seed, empty or out-of-range clusters, and high time complexity for a large number of clusters. To overcome these problems, this paper examines scalar quantization from a new perspective, namely sparse least square optimization. Specifically, inspired by the properties of sparse least square regression, several quantization algorithms based on $l_1$ least square are proposed. In addition, similar schemes with $l_1 + l_2$ and $l_0$ regularization are proposed. Furthermore, to compute quantization results with a given number of values/clusters, the paper designs an iterative method and a clustering-based method, both built on sparse least square. The paper shows that the latter method is mathematically equivalent to an improved version of the k-means clustering-based quantization algorithm, although the two algorithms originate from different intuitions. The proposed algorithms were tested on three types of data, and their computational performance, including information loss, time consumption, and the distribution of the values of the sparse vectors, was compared and analyzed. The paper offers a new perspective on quantization, and the proposed algorithms can outperform existing methods, especially in some bit-width reduction scenarios where the required post-quantization resolution (number of values) is not significantly lower than the original number.
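To make the clustering-based baseline mentioned in this abstract concrete, here is a minimal sketch of 1-D k-means (Lloyd) scalar quantization. It illustrates the existing technique the paper compares against, not the paper's sparse least square algorithms. The function name `kmeans_quantize` and the deterministic initialization from evenly spaced order statistics (chosen here to sidestep the random-seed dependence) are assumptions for illustration.

```python
def kmeans_quantize(values, k, iterations=50):
    """Quantize a list of scalars to at most k shared values using 1-D Lloyd iterations."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    n = len(ordered)
    # Assumed deterministic initialization: evenly spaced order statistics.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            groups[idx].append(v)
        # Update step: a center moves to its group mean; an empty group keeps its old center.
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    # Replace every value by its nearest center (the shared value).
    quantized = [min(centers, key=lambda c: abs(v - c)) for v in values]
    return quantized, centers
```

For example, `kmeans_quantize([0.1, 0.12, 0.11, 0.9, 0.88, 0.92], 2)` maps the six inputs onto two shared values, one near 0.11 and one near 0.9. Each iteration compares every value with every center, which is the cost that grows with the number of clusters.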