NA · Nov 18, 2017
A comparative study on nonlocal diffusion operators related to the fractional Laplacian
Siwei Duo, Hong Wang, Yanzhi Zhang
In this paper, we study four nonlocal diffusion operators: the fractional Laplacian, the spectral fractional Laplacian, the regional fractional Laplacian, and the peridynamic operator. These operators are the infinitesimal generators of different stochastic processes, and their differences are especially significant on a bounded domain. We provide extensive numerical experiments to understand and compare them. We find that all four operators reduce to the classical Laplace operator as α \to 2. Their eigenvalues and eigenfunctions differ, and the k-th (for k \in N) eigenvalue of the spectral fractional Laplacian is always larger than those of the fractional Laplacian and the regional fractional Laplacian. For any α \in (0, 2), the peridynamic operator provides a good approximation to the fractional Laplacian if the horizon size δ is sufficiently large. We find that the solution of the peridynamic model converges to that of the fractional Laplacian model at a rate of O(δ^{-α}). In contrast, although the regional fractional Laplacian can be used to approximate the fractional Laplacian as α \to 2, it generally gives results inconsistent with those of the fractional Laplacian when α \ll 2. Moreover, we make some conjectures based on our numerical results, which could contribute to the mathematical analysis of these operators.
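A short calculation suggests where a rate of O(δ^{-α}) can come from at the level of the operators. It is only a heuristic sketch under assumptions the abstract does not state: one space dimension, a peridynamic operator L_δ defined by truncating the fractional Laplacian kernel c_{1,α}|x-y|^{-(1+α)} at the horizon δ, a function u that vanishes outside Ω = (-l, l), and δ \ge 2l. The symbols L_δ, c_{1,α} and l are introduced here for illustration. It concerns the operators, not the convergence of solutions reported above.

```latex
% For x in Omega and |x - y| > delta >= 2l, the point y lies outside Omega, so u(y) = 0.
\begin{aligned}
(-\Delta)^{\alpha/2} u(x) - \mathcal{L}_\delta u(x)
  &= c_{1,\alpha} \int_{|x-y| > \delta} \frac{u(x) - u(y)}{|x-y|^{1+\alpha}} \, dy \\
  &= c_{1,\alpha} \, u(x) \cdot 2 \int_{\delta}^{\infty} r^{-1-\alpha} \, dr \\
  &= \frac{2 \, c_{1,\alpha}}{\alpha} \, \delta^{-\alpha} \, u(x),
  \qquad x \in \Omega .
\end{aligned}
```

Under these assumptions the two operators differ by a multiple of δ^{-α} u(x), which is consistent with the observed rate.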
CL · Jan 3, 2023
ClusTop: An unsupervised and integrated text clustering and topic extraction framework
Zhongtao Chen, Chenghu Mi, Siwei Duo et al.
Text clustering and topic extraction are two important tasks in text mining. Usually, these two tasks are performed separately. For topic extraction to facilitate clustering, we can first project texts into a topic space and then run a clustering algorithm to obtain clusters. To promote topic extraction by clustering, we can first obtain clusters with a clustering algorithm and then extract cluster-specific topics. However, this naive strategy ignores the fact that text clustering and topic extraction are strongly correlated and follow a chicken-and-egg relationship. Performing them separately prevents the two tasks from benefiting each other and from reaching the best overall performance. In this paper, we propose an unsupervised text clustering and topic extraction framework (ClusTop) that integrates the two tasks into a unified framework, achieving high-quality clustering results and extracting topics from each cluster simultaneously. Our framework includes four components: enhanced language model training, dimensionality reduction, clustering, and topic extraction, where the enhanced language model can be viewed as a bridge between clustering and topic extraction. On the one hand, it provides text embeddings with a strong cluster structure, which facilitates effective text clustering; on the other hand, it pays close attention to topic-related words for topic extraction because of its self-attention architecture. Moreover, the training of the enhanced language model is unsupervised. Experiments on two datasets demonstrate the effectiveness of our framework and provide benchmarks for different model combinations within it.
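The two-stage baseline that the abstract contrasts ClusTop with (cluster first, then extract cluster-specific topics) can be sketched in a few lines. This is a toy illustration, not the ClusTop framework. Bag-of-words vectors stand in for language-model embeddings, a hand-rolled two-cluster spherical k-means stands in for the clustering step, and dimensionality reduction is skipped. All names (`DOCS`, `STOPWORDS`, `embed`, `cluster_two`, `top_terms`) and the sample sentences are hypothetical.

```python
import math
from collections import Counter

# Hypothetical sample corpus and a tiny illustrative stopword list.
DOCS = [
    "the cat chased the mouse",
    "a cat and a kitten played",
    "the kitten chased a mouse",
    "stocks fell as markets closed",
    "markets rallied and stocks rose",
    "investors sold stocks as markets fell",
]
STOPWORDS = {"the", "a", "and", "as"}

def tokens(text):
    return [w for w in text.lower().split() if w not in STOPWORDS]

def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec.values()))
    return {w: x / norm for w, x in vec.items()}

def embed(text):
    """Unit-length bag-of-words vector (stand-in for a learned embedding)."""
    return normalize(Counter(tokens(text)))

def dot(u, v):
    # Dot product of sparse vectors; equals cosine similarity for unit vectors.
    return sum(x * v.get(w, 0.0) for w, x in u.items())

def mean_direction(vectors):
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds the values key by key
    return normalize(total)

def cluster_two(docs, n_iter=10):
    """Stage 1: spherical k-means with k = 2 and deterministic seeding."""
    vecs = [embed(d) for d in docs]
    # Seeds: the first document and the document least similar to it.
    far = min(range(len(vecs)), key=lambda i: dot(vecs[0], vecs[i]))
    centroids = [vecs[0], vecs[far]]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            max(range(2), key=lambda c: dot(v, centroids[c])) for v in vecs
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(2):
            members = [v for v, lab in zip(vecs, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_direction(members)
    return labels

def top_terms(docs, labels, cluster, k=2):
    """Stage 2: most frequent non-stopword terms inside one cluster."""
    counts = Counter()
    for d, lab in zip(docs, labels):
        if lab == cluster:
            counts.update(tokens(d))
    return [w for w, _ in counts.most_common(k)]

labels = cluster_two(DOCS)
for c in range(2):
    print(c, top_terms(DOCS, labels, c))
```

Here the topic step only reads the cluster labels and never feeds anything back into clustering, which is the separation the abstract argues against.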