CLIR
Dec 2, 2015

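Probabilistic Latent Semantic Analysis (PLSA) for Classifying Indonesian-Language Text Documents
(original title: Probabilistic Latent Semantic Analysis (PLSA) untuk Klasifikasi Dokumen Teks Berbahasa Indonesia)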

arXiv:1512.00576v1
1 citation
Originality: Synthesis-oriented
AI Analysis

This paper addresses document classification for Indonesian-language text, but its contribution is incremental: it applies an existing method to new data.

The paper applies Probabilistic Latent Semantic Analysis (PLSA) with Expectation Maximization for topic modeling to classify Indonesian text documents, reporting accuracy results.

One task in document management is finding the substantial information inside documents. Topic modeling is a technique developed to represent a document in the form of keywords, which are then used for indexing and for retrieving documents as users need them. This research focuses specifically on Probabilistic Latent Semantic Analysis (PLSA). It covers the PLSA mechanism, which uses Expectation Maximization (EM) as the training algorithm, how testing is conducted, and how the accuracy result is obtained.
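To make the training step concrete, here is a minimal sketch of the standard PLSA EM updates in plain Python. It is not the paper's implementation: the function name `plsa_em`, its parameters, and the toy count matrix are assumptions made for illustration, and the summary above does not say how the learned topics are mapped to class labels, so that step is left out. The sketch assumes every document contains at least one word.

```python
import math
import random


def normalize(row):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(row)
    return [v / total for v in row]


def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-term count matrix (list of lists).

    Returns P(z|d) as docs x topics, P(w|z) as topics x words, and the
    log-likelihood measured at the start of each iteration.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialisation, normalised into valid distributions.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    history = []
    for _ in range(n_iter):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        loglik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue  # unseen (d, w) pairs contribute nothing
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                p_w_d = sum(joint)
                loglik += n_dw * math.log(p_w_d)
                # Accumulate expected counts n(d,w) * P(z|d,w) for the M-step.
                for z in range(n_topics):
                    expected = n_dw * joint[z] / p_w_d
                    new_w_z[z][w] += expected
                    new_z_d[d][z] += expected
        history.append(loglik)
        # M-step: renormalise the expected counts into distributions.
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_z_d, p_w_z, history


if __name__ == "__main__":
    # Hypothetical toy corpus: 4 documents over a 4-word vocabulary.
    toy_counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 5]]
    doc_topics, topic_words, history = plsa_em(toy_counts, n_topics=2)
    for d, dist in enumerate(doc_topics):
        print(d, "dominant topic:", dist.index(max(dist)))
```

The log-likelihood list is returned so that convergence can be checked: under EM it should not decrease from one iteration to the next. The triple loop is written for readability; a vectorised version would be preferable for a real corpus.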

