Manual Post-editing of Automatically Transcribed Speeches from the Icelandic Parliament - Althingi
This work addresses transcription efficiency for parliamentary proceedings; its contribution is incremental, since it focuses on optimizing an existing system.
The study evaluated an automatic transcription system for Icelandic parliamentary speeches, finding that a system achieving a 12.6% word edit distance would match the performance of manual transcribers and that producing perfect transcriptions would correspond to an editing real-time factor of 2.56.
The design objectives for an automatic transcription system are to produce text readable by humans and to minimize the effort required for manual post-editing. This study reports on a recognition system used for transcribing speeches in the Icelandic parliament, Althingi. It evaluates the system's performance and its effect on manual post-editing, and compares the results against the original manual transcription process. A total of 239 speeches, amounting to 11 hours and 33 minutes of audio, were processed both manually and automatically, and the editing process was analysed. The dependence of edit time and the editing real-time factor on word edit distance was estimated and compared to user evaluations of the transcription system. The main findings show that word edit distance is positively correlated with edit time and that a system achieving a 12.6% edit distance would match the performance of manual transcribers. Producing perfect transcriptions would result in a real-time factor of 2.56. The study also shows that 99% of low-error-rate speeches received a medium or good grade in subjective evaluations. In contrast, 21% of high-error-rate speeches received a bad grade.
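
The two quantities at the centre of the abstract can be illustrated with a minimal sketch. It assumes that word edit distance is the word-level Levenshtein distance between the automatic transcript and the post-edited text, normalized by the length of the post-edited text, and that the editing real-time factor is edit time divided by audio duration. The abstract does not spell out either definition, so both are assumptions, and the function names `word_edit_distance` and `real_time_factor` are hypothetical.

```python
def word_edit_distance(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance, normalized by reference length.

    Assumption: the reference is the post-edited text and the hypothesis
    is the automatic transcript; the study may normalize differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(edit_seconds: float, audio_seconds: float) -> float:
    """Editing time expressed as a multiple of the audio duration (assumed definition)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return edit_seconds / audio_seconds


if __name__ == "__main__":
    # Toy strings, not data from the study.
    print(word_edit_distance("a b c d", "a x c"))
    print(real_time_factor(256.0, 100.0))
```

Under these assumed definitions, a real-time factor of 2.56 means that an editor spends 2.56 seconds of editing per second of recorded speech.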