Academic Paper

A Cantonese Audio-Visual Emotional Speech (CAVES) dataset

Document Type

Original Paper

Author

Chong, Chee Seng; Davis, Chris; Kim, Jeesun

Source

Behavior Research Methods. 56(5):5264-5278

Subject

Cantonese dataset
Auditory and visual expressions
Emotional speech
Dataset evaluation

Language

English

ISSN

1554-3528

Abstract

We present a Cantonese emotional speech dataset that is suitable for use in research investigating the auditory and visual expression of emotion in tonal languages. This unique dataset consists of auditory and visual recordings of ten native speakers of Cantonese uttering 50 sentences each in the six basic emotions (angry, happy, sad, surprise, fear, and disgust) plus neutral. The visual recordings have a full HD resolution of 1920 × 1080 pixels and were recorded at 50 fps. The important features of the dataset are outlined along with the factors considered when compiling the dataset. A validation study of the recorded emotion expressions was conducted in which 15 native Cantonese perceivers completed a forced-choice emotion identification task. The variability of the speakers and the sentences was examined by testing the degree of concordance between the intended and the perceived emotion. We compared these results with those of other emotion perception and evaluation studies that have tested spoken emotions in languages other than Cantonese. The dataset is freely available for research purposes.
