Sentence Correction Based on Large-scale Language Modelling
This work addresses text loss during generation and transmission, but it appears incremental because it builds on existing language modeling techniques.
The paper tackles the problem of restoring missing text by developing a language model that identifies where words are missing and inserts the correct choices, achieving a processing time of 3.6 seconds for 1000 sentences.
With the further development of informatization, more and more data is stored in the form of text, and some of that text is lost during generation and transmission. This paper aims to build a language model from a large-scale corpus to restore missing text. We introduce a novel measurement for locating missing words and a way of building a comprehensive candidate lexicon from which the correct word is chosen for insertion. We also introduce several effective optimization methods that greatly improve the efficiency of text restoration, reducing the time needed to process 1000 sentences to 3.6 seconds. \keywords{ language model, sentence correction, word imputation, parallel optimization
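
The abstract does not specify the measurement or the lexicon construction, so the following is only a minimal sketch of the general idea of locating a gap and filling it with a language model. It assumes a smoothed bigram model, a four-sentence toy corpus in place of a large-scale one, and exactly one missing word per sentence. All names (`train_bigram`, `find_gap`, `fill_gap`, `restore`) are hypothetical and do not come from the paper.

```python
from collections import Counter

BOS, EOS = "<s>", "</s>"

# Toy corpus standing in for a large-scale corpus (assumption for illustration).
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on the sofa",
    "the dog slept on the rug",
]


def train_bigram(sentences):
    """Count unigrams and bigrams over sentences padded with boundary tokens."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = [BOS] + sentence.split() + [EOS]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def prob(prev, word, model):
    """Add-one smoothed estimate of P(word | prev)."""
    unigrams, bigrams = model
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(unigrams))


def find_gap(tokens, model):
    """Return the insertion index whose surrounding bigram is least probable.

    Stand-in heuristic, not the paper's measurement: the adjacent pair that
    the model finds least likely is assumed to straddle the missing word.
    """
    padded = [BOS] + tokens + [EOS]
    scores = [prob(a, b, model) for a, b in zip(padded, padded[1:])]
    return min(range(len(scores)), key=scores.__getitem__)


def fill_gap(tokens, gap, model):
    """Insert the candidate that best bridges the left and right neighbours."""
    unigrams, _ = model
    padded = [BOS] + tokens + [EOS]
    left, right = padded[gap], padded[gap + 1]
    # Candidate lexicon here is simply the whole vocabulary (assumption).
    candidates = [w for w in unigrams if w not in (BOS, EOS)]
    best = max(candidates, key=lambda w: prob(left, w, model) * prob(w, right, model))
    return tokens[:gap] + [best] + tokens[gap:]


def restore(sentence, model):
    """Locate the single assumed gap in a sentence and fill it."""
    tokens = sentence.split()
    return " ".join(fill_gap(tokens, find_gap(tokens, model), model))


if __name__ == "__main__":
    model = train_bigram(CORPUS)
    print(restore("the cat on the mat", model))
```

Scoring each candidate by the product of its left and right bigram probabilities is one simple choice among many. A real system would need a much larger corpus, a stronger model, and some way to decide whether a sentence has a gap at all.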