SDJan 5, 2016

An Analysis of Rhythmic Staccato-Vocalization Based on Frequency Demodulation for Laughter Detection in Conversational Meetings

arXiv:1601.00833v11 citations
Originality Synthesis-oriented
AI Analysis

This is an incremental improvement for detecting positive laughter in conversational AI or meeting analysis systems.

The paper tackles laughter detection in multiparty conversations by focusing on rhythmic staccato-vocalization, which evokes positive responses, and reports that this novel approach outperforms a standard laughter classification baseline.

Human laugh is able to convey various kinds of meanings in human communications. There exists various kinds of human laugh signal, for example: vocalized laugh and non vocalized laugh. Following the theories of psychology, among all the vocalized laugh type, rhythmic staccato-vocalization significantly evokes the positive responses in the interactions. In this paper we attempt to exploit this observation to detect human laugh occurrences, i.e., the laughter, in multiparty conversations from the AMI meeting corpus. First, we separate the high energy frames from speech, leaving out the low energy frames through power spectral density estimation. We borrow the algorithm of rhythm detection from the area of music analysis to use that on the high energy frames. Finally, we detect rhythmic laugh frames, analyzing the candidate rhythmic frames using statistics. This novel approach for detection of `positive' rhythmic human laughter performs better than the standard laughter classification baseline.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes