William Hinthorn

CL
3 papers
796 citations
Novelty: 50%
AI Score: 26

3 Papers

CL, Mar 19, 2020
Enhancing Factual Consistency of Abstractive Summarization

Chenguang Zhu, William Hinthorn, Ruochen Xu et al.

Automatic abstractive summaries often distort or fabricate facts from the source article. This inconsistency between summary and original text has seriously limited the applicability of abstractive summarization. We propose a fact-aware summarization model, FASum, which extracts factual relations and integrates them into the summary generation process via graph attention. We then design a factual corrector model, FC, to automatically correct factual errors in summaries generated by existing systems. Empirical results show that the fact-aware summarization model produces abstractive summaries with higher factual consistency than existing systems, and that the correction model improves the factual consistency of given summaries by modifying only a few keywords.

CL, Oct 24, 2019
Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings

Dave Makhervaks, William Hinthorn, Dimitrios Dimitriadis et al.

Involvement hot spots have been proposed as a useful concept for meeting analysis and studied off and on for over 15 years. These are regions of meetings that are marked by high participant involvement, as judged by human annotators. However, prior work was either not conducted in a formal machine learning setting, or focused on only a subset of possible meeting features or downstream applications (such as summarization). In this paper we investigate to what extent various acoustic, linguistic and pragmatic aspects of the meetings, both in isolation and jointly, can help detect hot spots. In this context, the openSMILE toolkit is used to extract features based on acoustic-prosodic cues, BERT word embeddings are used for encoding the lexical content, and a variety of statistics based on speech activity are used to describe the verbal interaction among participants. In experiments on the annotated ICSI meeting corpus, we find that the lexical model is the most informative, with incremental contributions from interaction and acoustic-prosodic model components.

AS, May 3, 2019
Meeting Transcription Using Virtual Microphone Arrays

Takuya Yoshioka, Zhuo Chen, Dimitrios Dimitriadis et al.

We describe a system that generates speaker-annotated transcripts of meetings by using a virtual microphone array, a set of spatially distributed asynchronous recording devices such as laptops and mobile phones. The system is composed of continuous audio stream alignment, blind beamforming, speech recognition, speaker diarization using prior speaker information, and system combination. When utilizing seven input audio streams, our system achieves a word error rate (WER) of 22.3% and comes within 3% of the close-talking microphone WER on the non-overlapping speech segments. The speaker-attributed WER (SAWER) is 26.7%. The relative gains in SAWER over the single-device system are 14.8%, 20.3%, and 22.4% for three, five, and seven microphones, respectively. The presented system achieves a 13.6% diarization error rate when 10% of the speech duration contains more than one speaker. The contribution of each component to the overall performance is also investigated, and we validate the system with experiments on the NIST RT-07 conference meeting test set.
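
For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how a word error rate and a relative gain are conventionally computed. It is not the paper's scoring pipeline: the function names `word_error_rate` and `relative_gain` are hypothetical, the tokenization is plain whitespace splitting, and the numbers in the usage lines are toy values chosen for illustration, not results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumption: whitespace tokenization, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_gain(baseline: float, improved: float) -> float:
    """Relative error reduction of `improved` with respect to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline error rate must be non-zero")
    return (baseline - improved) / baseline


# Toy usage (illustrative values only).
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
gain = relative_gain(20.0, 15.0)
```

In the toy usage, the hypothesis has one substituted word and one missing word against a six-word reference, and the second call measures how much of a baseline error rate of 20.0 is removed when it drops to 15.0. A speaker-attributed variant would additionally count words assigned to the wrong speaker as errors; that step is not shown here.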