CV · Oct 31, 2016

Bi-modal First Impressions Recognition using Temporally Ordered Deep Audio and Stochastic Visual Features

arXiv:1610.10048v1 · 85 citations
Originality: Incremental advance
AI Analysis

This work addresses personality trait analysis from videos. The advance is incremental because it builds on existing competition datasets and methods.

The paper tackles first impressions recognition of the Big Five personality traits from short videos by proposing a bi-modal deep neural network using temporally ordered audio and stochastic visual features, achieving excellent performance on the ChaLearn LAP 2016 dataset.

We propose a novel approach for First Impressions Recognition in terms of the Big Five personality traits from short videos. The Big Five is a model that describes human personality using five broad categories: Extraversion, Agreeableness, Conscientiousness, Neuroticism and Openness. We train two bi-modal end-to-end deep neural network architectures using temporally ordered audio and novel stochastic visual features from a few frames, without over-fitting. We empirically show that the trained models perform exceptionally well, even when trained on small sub-portions of the inputs. Our method was evaluated in the ChaLearn LAP 2016 Apparent Personality Analysis (APA) competition using the ChaLearn LAP APA2016 dataset and achieved excellent performance.
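The abstract does not spell out how the "temporally ordered audio" and "stochastic visual" inputs are built. The sketch below shows one plausible reading, under stated assumptions: the clip is cut into equal, non-overlapping temporal partitions, one frame is drawn at random from each partition, and the audio is cut into windows in the same order. The function names, the partition count, and the frame and sample counts are all hypothetical and are not taken from the paper or its repository.

```python
import random


def partition_bounds(num_items, num_partitions):
    """Split range(num_items) into contiguous, non-overlapping (start, stop) spans."""
    if not 0 < num_partitions <= num_items:
        raise ValueError("need 0 < num_partitions <= num_items")
    return [
        (i * num_items // num_partitions, (i + 1) * num_items // num_partitions)
        for i in range(num_partitions)
    ]


def sample_frame_indices(num_frames, num_partitions, rng):
    """Assumed 'stochastic visual' sampling: one random frame index per partition.

    Because partitions are visited in order, the result is temporally ordered.
    """
    return [
        rng.randrange(start, stop)
        for start, stop in partition_bounds(num_frames, num_partitions)
    ]


def ordered_audio_windows(num_samples, num_partitions):
    """Assumed 'temporally ordered audio': one sample window per partition, in order."""
    return partition_bounds(num_samples, num_partitions)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so the demo is repeatable
    # Hypothetical clip: 450 video frames and 240000 audio samples, 6 partitions.
    print(sample_frame_indices(450, 6, rng))
    print(ordered_audio_windows(240000, 6))
```

Under this reading, redrawing the frame indices at every training pass would expose the network to a different handful of frames from the same clip each time, which is one way a model could be trained on small sub-portions of the input without seeing a fixed subset.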

Code Implementations: 1 repo