학술논문

EmphAssess : a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models

Document Type

Working Paper

Author

de Seyssel, Maureen; D'Avirro, Antony; Williams, Adina; Dupoux, Emmanuel

Source

Subject

Computer Science - Computation and Language
Computer Science - Sound
Electrical Engineering and Systems Science - Audio and Speech Processing

Language

Abstract

We introduce EmphAssess, a prosodic benchmark designed to evaluate the capability of speech-to-speech models to encode and reproduce prosodic emphasis. We apply this to two tasks: speech resynthesis and speech-to-speech translation. In both cases, the benchmark evaluates the ability of the model to encode emphasis in the speech input and accurately reproduce it in the output, potentially across a change of speaker and language. As part of the evaluation pipeline, we introduce EmphaClass, a new model that classifies emphasis at the frame or word level.
Comment: Accepted at EMNLP 2024 (Main)

Online Access

Open Access (Arxiv) Find it@PNU

이메일

부산대학교 도서관

Online Access

메일 발송