Academic Paper
Cochleagram Based Speaker Identification Using Noise Adapted CNN
Document Type
Conference
Author
Source
2021 5th International Conference on Electrical Engineering and Information Communication Technology (ICEEICT), pp. 1-5, Nov. 2021
Subject
Language
Abstract
Speaker identification is the process of recognizing a person from their voice. Because speech signals are subject to significant variation, the task is challenging, and conventional speaker identification (SID) systems perform poorly in noisy environments. This study presents a robust speaker identification system based on an auditory-inspired feature called the cochleagram. The cochleagram is generated using a 128-channel gammatone filterbank covering frequencies from 50 to 8000 Hz. A convolutional neural network (CNN) is trained on a combination of cochleagrams constructed from clean speech samples and from speech samples with a fixed noise added at a certain signal-to-noise ratio (SNR); this network is referred to as a noise adapted CNN. The proposed model was then tested with different noise types at different SNR levels. Experimental results showed that the proposed system outperformed the existing neurogram based method under noisy conditions, particularly at very low SNRs, for both text-dependent and text-independent corpora. The proposed system could be used as a preprocessor in an automatic speech recognition (ASR) system.
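Two steps the abstract mentions can be illustrated with a short sketch: placing 128 gammatone channel center frequencies between 50 and 8000 Hz, and adding noise to speech at a chosen SNR. This is not the authors' code. The abstract does not say how the channels are spaced, so the sketch assumes equal spacing on the ERB-rate scale (Glasberg and Moore), a common choice for gammatone filterbanks. The function names `erb_space` and `mix_at_snr`, the noise tiling, and the 5 dB value in the usage lines are all hypothetical.

```python
import numpy as np


def erb_space(low_hz=50.0, high_hz=8000.0, n_channels=128):
    """Center frequencies equally spaced on the ERB-rate scale.

    Assumption: the abstract gives only the channel count and frequency
    range; ERB-rate spacing is a common convention, not a stated detail.
    """
    def hz_to_erb_rate(f):
        return 21.4 * np.log10(1.0 + 0.00437 * f)

    def erb_rate_to_hz(e):
        return (10.0 ** (e / 21.4) - 1.0) / 0.00437

    e = np.linspace(hz_to_erb_rate(low_hz), hz_to_erb_rate(high_hz), n_channels)
    return erb_rate_to_hz(e)


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the target SNR in dB."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)

    # Repeat the noise if it is shorter than the speech, then crop to length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    # SNR(dB) = 10 * log10(P_speech / P_noise), so solve for the noise gain.
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


# Usage with synthetic signals (placeholders, not data from the paper).
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)      # stand-in for one second of speech
babble = rng.standard_normal(4000)      # stand-in for a noise recording
noisy = mix_at_snr(clean, babble, snr_db=5.0)
center_freqs = erb_space()              # 128 values from 50 Hz to 8000 Hz
```

A training set in the spirit of the noise adapted CNN would then pair cochleagrams of `clean` with cochleagrams of `noisy`; the filtering and framing that turn a waveform into a cochleagram are omitted here because the abstract does not specify them.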