Large-scale Affective Content Analysis: Combining Media Content Features and Facial Reactions

Title: Large-scale Affective Content Analysis: Combining Media Content Features and Facial Reactions
Publication Type: Conference Paper
Year of Publication: 2017
Authors: McDuff, D., Soleymani, M.
Conference Name: 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017)
Date Published: May
Publisher: IEEE
ISBN Number: 978-1-5090-4023-0
Keywords: AU-2, automated facial action measurements, Automated Response Actions, composability, content management, deep visual-sentiment descriptors, encoding, face recognition, facial reactions, facial responses, feature extraction, image classification, large-scale affective content analysis, Media, media clips, media content features, multimodal fusion model, pubcrawl, Resiliency, sentiment analysis, Software, tagging, Videos, visualization
Abstract

We present a novel multimodal fusion model for affective content analysis, combining visual, audio, and deep visual-sentiment descriptors from the media content with automated facial action measurements from naturalistic responses to the media. We collected a dataset of 48,867 facial responses to 384 media clips and extracted a rich feature set from the facial responses and media content. The stimulus videos were validated to be informative, inspiring, persuasive, sentimental, or amusing. By combining the features, we obtained a classification accuracy of 63% (weighted F1-score: 0.62) on a five-class task, a significant improvement over using the media content features alone. By analyzing the feature sets independently, we found that the informed and persuaded states were difficult to differentiate from facial responses alone, owing to the similar sets of action units present in each state (AU 2 occurring frequently in both cases). Facial actions were beneficial in differentiating between the amused and informed states, whereas media content features alone performed less well due to similarities in the visual and audio make-up of the content. We highlight examples of content and reactions from each class. This is the first affective content analysis based on the reactions of tens of thousands of people.
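The record gives no implementation details, so the sketch below is purely an illustration of the kind of fusion-and-classification setup the abstract describes, not the authors' method. It assumes pre-extracted per-clip feature matrices for each modality (media_features and face_features are hypothetical names filled with toy random data), fuses them by simple concatenation, and trains a five-class linear classifier with scikit-learn, reporting accuracy and weighted F1. Concatenation is only one plausible fusion scheme; the paper's actual fusion model may differ.

    # Illustrative sketch only -- not the authors' implementation.
    # Assumed inputs (hypothetical names, toy random data):
    #   media_features: (n_clips, d_media) visual/audio/sentiment descriptors
    #   face_features:  (n_clips, d_face)  aggregated facial action unit stats
    #   labels: one of five classes (informative, inspiring, persuasive,
    #           sentimental, amusing)
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC
    from sklearn.metrics import accuracy_score, f1_score

    rng = np.random.default_rng(0)
    n_clips, d_media, d_face = 384, 128, 32           # toy dimensions
    media_features = rng.normal(size=(n_clips, d_media))
    face_features = rng.normal(size=(n_clips, d_face))
    labels = rng.integers(0, 5, size=n_clips)         # five affect classes

    # Feature-level fusion: concatenate the per-clip modality vectors.
    fused = np.hstack([media_features, face_features])

    X_train, X_test, y_train, y_test = train_test_split(
        fused, labels, test_size=0.2, random_state=0)

    # Standardize features, then fit a linear five-class classifier.
    clf = make_pipeline(StandardScaler(), LinearSVC())
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)

    print("accuracy:", accuracy_score(y_test, pred))
    print("weighted F1:", f1_score(y_test, pred, average="weighted"))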

URL: https://ieeexplore.ieee.org/document/7961761/
DOI: 10.1109/FG.2017.49
Citation Key: mcduff_large-scale_2017