MM CL May 21, 2019

Predicting TED Talk Ratings from Language and Prosody

Md Iftekhar Tanveer, Md Kamrul Hassan, Daniel Gildea, M. Ehsan Hoque

arXiv:1906.03940v12.31 citations | h-index: 36 | Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses the problem of understanding public speaking quality for online content creators and platforms, though it is incremental as it applies existing neural network methods to a new dataset.

The paper predicts viewer ratings of TED Talks from language and prosody features, reaching an average AUC of 0.83 across 14 rating types on a dataset of over 2200 talks and about 5.5 million ratings.

We use the largest open repository of public speaking, TED Talks, to predict the ratings given by online viewers. Our dataset contains over 2200 TED Talk transcripts (comprising over 200 thousand sentences), audio features, and the associated meta information, including about 5.5 million ratings from spontaneous visitors of the website. We propose three neural network architectures and compare them with statistical machine learning. Our experiments reveal that it is possible to predict all 14 different ratings with an average AUC of 0.83 using only the transcripts and prosody features. The dataset and the complete source code are available for further analysis.
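To make the headline metric concrete, the following is a minimal sketch of how an average AUC over several rating types could be computed. It assumes each rating has been turned into a binary label per talk and that the per-rating AUCs are averaged with equal weight; neither detail is stated in the abstract, and the rating names and numbers below are hypothetical toy values, not data from the paper.

```python
from typing import Dict, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_auc(
    labels_by_rating: Dict[str, Sequence[int]],
    scores_by_rating: Dict[str, Sequence[float]],
) -> float:
    """Unweighted mean of per-rating AUCs (an assumed averaging scheme)."""
    aucs = [
        binary_auc(labels_by_rating[name], scores_by_rating[name])
        for name in labels_by_rating
    ]
    return sum(aucs) / len(aucs)


# Hypothetical toy data: two rating types, four talks each.
toy_labels = {"rating_a": [1, 0, 1, 0], "rating_b": [1, 1, 0, 0]}
toy_scores = {"rating_a": [0.9, 0.2, 0.4, 0.6], "rating_b": [0.8, 0.7, 0.3, 0.1]}

print(average_auc(toy_labels, toy_scores))  # 0.875
```

Here `rating_a` ranks three of its four positive/negative pairs correctly (AUC 0.75) and `rating_b` ranks all of them correctly (AUC 1.0), so the mean is 0.875. The pairwise loop is quadratic in the number of talks, which is acceptable for a few thousand items; a rank-based formulation would be the usual choice at larger scale.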
