Academic Paper

Cochleagram Based Speaker Identification Using Noise Adapted CNN
Document Type
Conference
Source
2021 5th International Conference on Electrical Engineering and Information Communication Technology (ICEEICT), pp. 1-5, Nov. 2021
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Adaptation models
Neural networks
Training data
Filter banks
Distortion
Robustness
Noise measurement
Cochleagram
Convolutional Neural Network (CNN)
Speaker Identification
Gammatone filterbank
Language
Abstract
Speaker identification is the process of recognizing a person from their voice. Because speech signals are subject to significant variation, the task is challenging, and conventional speaker identification (SID) systems perform poorly in noisy environments. This study presents a robust speaker identification system based on an auditory-inspired feature called the cochleagram. The cochleagram is generated using a 128-channel gammatone filterbank spanning 50 to 8000 Hz. A convolutional neural network (CNN) is trained on a combination of cochleagrams constructed from clean speech samples and from speech samples with a fixed noise added at a certain signal-to-noise ratio (SNR); this model is referred to as a noise-adapted CNN. The proposed model was then tested with different noise types at different SNRs. Experimental results showed that the proposed system outperformed the existing neurogram-based method under noisy conditions, particularly at very low SNRs, for both text-dependent and text-independent corpora. The proposed system could be used as a preprocessor in an automatic speech recognition (ASR) system.
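
Two steps named in the abstract can be illustrated with a minimal sketch: placing 128 gammatone channels between 50 and 8000 Hz, and adding noise to speech at a chosen SNR. The abstract does not state how the channels are spaced, which noise is used, or which SNR is chosen for training, so the ERB-rate spacing (Glasberg and Moore formula) and the function names `erb_center_frequencies` and `mix_at_snr` below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def erb_center_frequencies(n_channels=128, f_low=50.0, f_high=8000.0):
    """Center frequencies equally spaced on the ERB-rate scale.

    Assumption: ERB-rate spacing, E(f) = 21.4 * log10(1 + 0.00437 * f),
    a common convention for gammatone filterbanks. The abstract only
    gives the channel count and the frequency range.
    """
    erb_low = 21.4 * np.log10(1.0 + 0.00437 * f_low)
    erb_high = 21.4 * np.log10(1.0 + 0.00437 * f_high)
    erb_points = np.linspace(erb_low, erb_high, n_channels)
    # Invert the ERB-rate formula to get back to Hz.
    return (10.0 ** (erb_points / 21.4) - 1.0) / 0.00437


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the mixture has the requested SNR in dB."""
    # Repeat or truncate the noise so it matches the speech length.
    noise = np.resize(noise, speech.shape)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the gain so that speech_power / (gain^2 * noise_power)
    # equals 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(16000)   # stand-in for a speech signal
    noise = rng.standard_normal(4000)     # stand-in for a noise recording
    noisy = mix_at_snr(speech, noise, snr_db=5.0)
    print(erb_center_frequencies()[:3], noisy.shape)
```

Under these assumptions, a training set in the spirit of the abstract would pair cochleagrams of `speech` with cochleagrams of `mix_at_snr(speech, noise, snr_db)` for one fixed noise and one fixed SNR, while testing would vary both.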