Gangli Liu

HC
9 papers
24 citations
Novelty 34%
AI Score 20

9 Papers

LG · Dec 5, 2022
Clustering with Neural Network and Index

Gangli Liu

A new model called Clustering with Neural Network and Index (CNNI) is introduced. CNNI uses a neural network to cluster data points. Training of the neural network mimics supervised learning, with an internal clustering evaluation index acting as the loss function. An experiment tests the feasibility of the new model and compares its results with those of other clustering models such as K-means and Gaussian Mixture Model (GMM). The results show that CNNI works properly for clustering data; CNNI equipped with the MMJ-based Silhouette coefficient (MMJ-SC) yields the first parametric (inductive) clustering model that can deal with non-convex (non-flat geometry) data.

LG · Jul 4, 2022
A New Index for Clustering Evaluation Based on Density Estimation

Gangli Liu

A new index for internal evaluation of clustering is introduced. The index is defined as a mixture of two sub-indices. The first sub-index $ I_a $ is called the Ambiguous Index; the second sub-index $ I_s $ is called the Similarity Index. Both sub-indices are calculated from a density estimate of each cluster in a partition of the data. An experiment on a set of 145 datasets tests the performance of the new index and compares it with six other internal clustering evaluation indices: the Calinski-Harabasz index, Silhouette coefficient, Davies-Bouldin index, CDbw, DBCV, and VIASCKDE. The results show that the new index significantly improves on the other internal clustering evaluation indices.

CV · Jan 15, 2023
Min-Max-Jump distance and its applications

Gangli Liu

We explore three applications of Min-Max-Jump distance (MMJ distance). MMJ-based K-means revises K-means with MMJ distance. MMJ-based Silhouette coefficient revises the Silhouette coefficient with MMJ distance. We also test the Clustering with Neural Network and Index (CNNI) model with the MMJ-based Silhouette coefficient. In the last application, we test Min-Max-Jump distance for predicting the labels of new points after a clustering analysis of the data. The results show that Min-Max-Jump distance achieves good performance in all three proposed applications. In addition, we devise several algorithms for calculating or estimating the distance.
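
The abstract does not define the distance, so the following is only a minimal sketch under a stated assumption: that the MMJ distance between two points is the minimax path distance, i.e. the smallest achievable "largest jump" over all paths that hop between data points. The function name `mmj_distance_matrix` and the Floyd-Warshall-style computation are illustrative choices, not necessarily one of the algorithms devised in the paper.

```python
from math import dist


def mmj_distance_matrix(points):
    """All-pairs minimax path distance over the complete Euclidean graph.

    Assumption: MMJ distance = minimum, over all paths between two points,
    of the largest single jump on the path.
    """
    n = len(points)
    # Start from the direct jump between every pair of points.
    d = [[dist(p, q) for q in points] for p in points]
    # Floyd-Warshall variant: a path routed through k costs the larger
    # of its two halves, and we keep the cheaper of the two options.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


# Three points in a chain, plus one far away.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
d = mmj_distance_matrix(points)
print(d[0][2])  # 1.0: hopping through (1, 0) needs jumps of length 1 only
print(d[0][3])  # 8.0: the gap between (2, 0) and (10, 0) cannot be avoided
```

Under this assumption, points connected by a chain of close neighbours stay close regardless of the chain's shape, which is consistent with the abstracts' interest in non-convex clusters. The triple loop costs O(n^3) and is meant for small examples only.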

CL · Oct 12, 2021
Topic Model Supervised by Understanding Map

Gangli Liu

Inspired by the notion of Center of Mass in physics, an extension called Semantic Center of Mass (SCOM) is proposed and used to discover the abstract "topic" of a document. The notion sits within a framework model called Understanding Map Supervised Topic Model (UM-S-TM). The aim of UM-S-TM is to let both the document content and a semantic network -- specifically, an Understanding Map -- play a role in interpreting the meaning of a document. Based on different justifications, three possible methods are devised to discover the SCOM of a document. Experiments on artificial documents and Understanding Maps are conducted to test their outcomes. In addition, the model's ability to vectorize documents and capture sequential information is tested. We also compare UM-S-TM with probabilistic topic models such as Latent Dirichlet Allocation (LDA) and probabilistic Latent Semantic Analysis (pLSA).

HC · Nov 17, 2017
Understanding Graph and Understanding Map and their Potential Applications

Gangli Liu

Based on the previously proposed concept of Understanding Tree, this paper introduces two concepts, Understanding Graph and Understanding Map, and explores their potential applications. Understanding Graph and Understanding Map can be regarded as special cases of a mind map, semantic network, or concept map. There are two main differences. First, the data sources for constructing Understanding Map and Understanding Graph are distinctive and simple. Second, the relations between concepts in Understanding Graph and Understanding Map are monotonous. Based on these characteristics, their applications include quantitatively measuring a concept's degree of complexity, quantitatively measuring a concept's degree of importance in a domain, and computing an optimized learning sequence for comprehending a concept. Further study involves evaluating their performance in these applications.
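
As a rough illustration of the "learning sequence" application mentioned above, the sketch below treats a small concept graph as a prerequisite map and orders it topologically. The concept names, the `prerequisites` dictionary, and the use of a plain topological sort are all assumptions made for this example; the abstract does not say how the paper computes or optimizes the sequence.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: concept -> concepts to understand first.
prerequisites = {
    "gradient descent": {"derivative", "vector"},
    "derivative": {"limit"},
    "limit": set(),
    "vector": set(),
}


def learning_sequence(prereqs):
    """Return one order in which every concept follows its prerequisites."""
    return list(TopologicalSorter(prereqs).static_order())


sequence = learning_sequence(prerequisites)
print(sequence)
```

Any order returned this way places "limit" before "derivative" and both "derivative" and "vector" before "gradient descent". A topological sort only guarantees a valid order, not an optimal one.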

HC · Dec 22, 2016
Understanding Tree: a tool to estimate one's understanding of knowledge

Gangli Liu

People learn whenever and wherever possible, and whatever they like or encounter: Mathematics, Drama, Art, Languages, Physics, Philosophy, and so on. With the explosion of knowledge, evaluating how much knowledge a person possesses becomes increasingly difficult. There is considerable demand for evaluating a person's understanding of a piece of knowledge. Understanding is conventionally assessed through tests or interviews, but these have limitations such as low efficiency and incomplete coverage. This paper proposes a method called Understanding Tree to estimate a person's understanding of knowledge by keeping track of his/her learning activities. It overcomes some limitations of traditional methods and hence complements them.

HC · Apr 21, 2016
Knowledge model: a method to evaluate an individual's knowledge quantitatively

Gangli Liu

As the quantity of human knowledge increases rapidly, it becomes harder and harder to evaluate a knowledge worker's knowledge quantitatively. There is considerable demand for such evaluation: for example, accurately identifying a researcher's areas of concentration over the last three years, finding common topics between two scientists with different academic backgrounds, or helping a researcher discover his deficiencies in a research field. This paper proposes a method named knowledge model to evaluate a knowledge worker's knowledge quantitatively without an examination. It records and analyzes each of an individual's learning experiences, discovering all the knowledge points involved and calculating their shares by analyzing the text of the learning content with a topic model. It calculates a score for a knowledge point by accumulating the effects of all of the individual's learning experiences related to it. A preliminary knowledge evaluation system is developed to test the practicality of the knowledge model.
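
The accumulation step described above can be sketched as follows. This is a minimal illustration, not the paper's scoring formula: it assumes each learning experience carries a duration in minutes and a set of topic-model shares per knowledge point, and that the "effect" of an experience on a knowledge point is simply duration times share. The field names `minutes` and `shares` are hypothetical.

```python
from collections import defaultdict


def knowledge_scores(experiences):
    """Accumulate a score per knowledge point over all learning experiences.

    Assumption: the effect of one experience on a knowledge point is its
    duration multiplied by that point's share of the content.
    """
    scores = defaultdict(float)
    for experience in experiences:
        for point, share in experience["shares"].items():
            scores[point] += experience["minutes"] * share
    return dict(scores)


# Two hypothetical learning experiences with topic shares summing to 1.
experiences = [
    {"minutes": 60, "shares": {"clustering": 0.5, "neural networks": 0.5}},
    {"minutes": 30, "shares": {"clustering": 1.0}},
]
scores = knowledge_scores(experiences)
print(scores)  # {'clustering': 60.0, 'neural networks': 30.0}
```

A real system would likely also account for factors such as recency or forgetting, which this sketch leaves out.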

IR · Feb 16, 2016
A Ranking Algorithm for Re-finding

Gangli Liu

Re-finding files on a personal computer is a frequent need for users. When facing a difficult re-finding task, people may not recall the attributes used by conventional re-finding methods, such as a file's path, file name, or keywords, and the re-finding fails. We propose a method to support difficult re-finding tasks. We ask the user a list of questions about the target, such as a document's page count, number of authors, accumulated reading time, and last reading location, then use the answers to narrow the candidates down to the target. After the user has answered the questions, we evaluate the user's degree of familiarity with the target file based on the answers. We devise a ranking algorithm that sorts the candidates by comparing the user's degree of familiarity with the target and with the candidates. We also propose a method to generate re-finding tasks artificially from the user's own document corpus.

IR · Jan 27, 2016
A Method to Support Difficult Re-finding Tasks

Gangli Liu, Ling Feng

Re-finding electronic documents on a personal computer is a frequent need for users. In a simple re-finding task, people can use many methods to retrieve a document, such as navigating directly to the document's folder, searching with a desktop search engine, or checking the Recent Files List. However, in a difficult re-finding task, people usually cannot remember the attributes used by conventional re-finding methods, such as file path, file name, or keywords, and the re-finding fails. We propose a new method to support difficult re-finding tasks. While a user is reading a document, we collect all kinds of details the user might later remember about it, such as number of pages, number of images, number of math formulas, cumulative reading time, reading frequency, and printing history. If the user wants to re-find the document later, we use these collected attributes to narrow the candidates down to the target document. To reduce the user's cognitive burden, we use a question-and-answer wizard interface and recommend answers to the user; the recommendations are generated by analyzing the collected attributes of each document and the user's experience with them.
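
The filtering step described above might look like the sketch below. It assumes that each wizard answer is a numeric range for one attribute and that a document is kept only if it satisfies every answered range. The attribute names, the `documents` records, and the range-based matching are hypothetical; the abstract does not specify how answers are matched against the collected attributes.

```python
def filter_candidates(documents, answers):
    """Keep documents whose recorded attributes fall inside every answered range."""
    matches = []
    for doc in documents:
        attributes = doc["attributes"]
        if all(low <= attributes[name] <= high
               for name, (low, high) in answers.items()):
            matches.append(doc["path"])
    return matches


# Hypothetical attributes collected while the user was reading.
documents = [
    {"path": "papers/a.pdf", "attributes": {"pages": 12, "images": 3, "reading_minutes": 45}},
    {"path": "papers/b.pdf", "attributes": {"pages": 40, "images": 0, "reading_minutes": 5}},
    {"path": "notes/c.pdf", "attributes": {"pages": 10, "images": 8, "reading_minutes": 50}},
]

# The user remembers "about a dozen pages" and "a few images".
answers = {"pages": (8, 15), "images": (2, 5)}
print(filter_candidates(documents, answers))  # ['papers/a.pdf']
```

Using ranges rather than exact values is a design choice for this sketch: memories of page counts or reading time are approximate, so a tolerance keeps the target in the candidate set.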