Academic Article

Affect in Multimedia: Benchmarking Violent Scenes Detection

Document Type

Periodical

Author

Constantin, M.G.; Stefan, L.; Ionescu, B.; Demarty, C.; Sjoberg, M.; Schedl, M.; Gravier, G.

Source

IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 13(1):347-366, Jan. 2022

Subject

Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Videos
Motion pictures
Benchmark testing
YouTube
Task analysis
Market research
Machine learning
Violent scenes detection
multi-modal content description
VSD96 data set
benchmarking
literature review

ISSN

1949-3045
2371-9850

Abstract

In this article, we report on the creation of a publicly available, common evaluation framework for Violent Scenes Detection (VSD) in Hollywood and YouTube videos. We propose a robust data set, the VSD96, with more than 96 hours of video of various genres, annotations at different levels of detail (e.g., shot-level, segment-level), annotations of mid-level concepts (e.g., blood, fire), various pre-computed multi-modal descriptors, and over 230 system output results as baselines. This is the most comprehensive data set available to date tailored to the VSD task and was extensively validated during the MediaEval benchmarking campaigns. Furthermore, we provide an in-depth analysis of the crucial components of VSD algorithms, by reviewing the capabilities and the evolution of existing systems (e.g., overall trends and outliers, the influence of the employed features and fusion techniques, the influence of deep learning approaches). Finally, we discuss the possibility of going beyond state-of-the-art performance via an ad-hoc late fusion approach. Experimentation is carried out on the VSD96 data. We provide the most important lessons learned and insights gained. The increasing number of publications using the VSD96 data underlines the importance of the topic. The presented and published resources are a practitioner's guide and also a strong baseline to overcome, which will help researchers in the coming years in analyzing aspects of audio-visual affect and violence detection in movies and videos.
