Tam Doan Thanh

CLApr 11, 2023

LBMT team at VLSP2022-Abmusu: Hybrid method with text correlation and generative models for Vietnamese multi-document summarization

Tan-Minh Nguyen, Thai-Binh Nguyen, Hoang-Trung Nguyen et al.

Multi-document summarization is challenging because the summaries should not only describe the most important information from all documents but also provide a coherent interpretation of the documents. This paper proposes a method for multi-document summarization based on cluster similarity. In the extractive method we use hybrid model based on a modified version of the PageRank algorithm and a text correlation considerations mechanism. After generating summaries by selecting the most important sentences from each cluster, we apply BARTpho and ViT5 to construct the abstractive models. Both extractive and abstractive approaches were considered in this study. The proposed method achieves competitive results in VLSP 2022 competition.

CLJun 26, 2023

Vietnamese multi-document summary using subgraph selection approach -- VLSP 2022 AbMuSu Shared Task

Huu-Thin Nguyen, Tam Doan Thanh, Cam-Van Thi Nguyen

Document summarization is a task to generate afluent, condensed summary for a document, andkeep important information. A cluster of documents serves as the input for multi-document summarizing (MDS), while the cluster summary serves as the output. In this paper, we focus on transforming the extractive MDS problem into subgraph selection. Approaching the problem in the form of graphs helps to capture simultaneously the relationship between sentences in the same document and between sentences in the same cluster based on exploiting the overall graph structure and selected subgraphs. Experiments have been implemented on the Vietnamese dataset published in VLSP Evaluation Campaign 2022. This model currently results in the top 10 participating teams reported on the ROUGH-2 $F\_1$ measure on the public test set.

Tam Doan Thanh

2 Papers