Academic Paper

SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset

Document Type

Working Paper

Author

Gautam, Sushant; Sarkhoosh, Mehdi Houshmand; Held, Jan; Midoglu, Cise; Cioppa, Anthony; Giancola, Silvio; Thambawita, Vajira; Riegler, Michael A.; Halvorsen, Pål; Shah, Mubarak

Source

Subject

Computer Science - Sound
Computer Science - Information Retrieval
Computer Science - Machine Learning
Computer Science - Multimedia
Electrical Engineering and Systems Science - Audio and Speech Processing
I.2.7
I.7

Language

Abstract

The application of Automatic Speech Recognition (ASR) technology in soccer offers numerous opportunities for sports analytics. Specifically, extracting audio commentaries with ASR provides valuable insights into the events of the game, and opens the door to several downstream applications such as automatic highlight generation. This paper presents SoccerNet-Echoes, an augmentation of the SoccerNet dataset with automatically generated transcriptions of audio commentaries from soccer game broadcasts, enhancing video content with rich layers of textual information derived from the game audio using ASR. These textual commentaries, generated using the Whisper model and translated with Google Translate, extend the usefulness of the SoccerNet dataset in diverse applications such as enhanced action spotting, automatic caption generation, and game summarization. By incorporating textual data alongside visual and auditory content, SoccerNet-Echoes aims to serve as a comprehensive resource for the development of algorithms specialized in capturing the dynamics of soccer games. We detail the methods involved in the curation of this dataset and the integration of ASR. We also highlight the implications of a multimodal approach in sports analytics, and how the enriched dataset can support diverse applications, thus broadening the scope of research and development in the field of sports analytics.

Online Access

Open Access (arXiv)
