Academic Paper
An Approach to Recognize Speech Using Convolutional Neural Network for the Multilingual Language
Document Type
Conference
Author
Source
2023 Global Conference on Information Technologies and Communications (GCITC), pp. 1-6, Dec. 2023
Subject
Language
Abstract
Automatic Speech Recognition Systems (ASRS) are essential for supporting natural-language communication between humans and machines. They gained prominence with the introduction of Artificial Intelligence (AI) and Machine Learning (ML). They allow users to interact naturally with machines and to perform hands-free operations. ASRS are also a fundamental technology in many fields, such as education, smart home automation, automotive, aviation, and support for people with disabilities. In this paper, a Convolutional Neural Network (CNN)-based ASRS is built that models raw speech signals. The target speech corpus is a database created by the authors in four languages: Hindi, English, Punjabi and Bengali. The recordings were made in different environments by 50 male native speakers of Hindi and Punjabi who were also able to speak English and Bengali. Features are extracted from the collected raw speech samples using Mel-frequency Cepstral Coefficients (MFCC), the most widely used feature extraction technique. A 2D CNN model with 6 layers was then designed to recognize the speech samples for each language. The experimental results show a validation accuracy of 96.29% with a loss of 0.174. Hence, the CNN-based model demonstrates significant performance on this comprehensive tonal speech dataset.
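The MFCC step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no extraction settings, so the sample rate (16 kHz), frame length (25 ms), frame step (10 ms), FFT size (512), filter count (26), coefficient count (13) and pre-emphasis factor (0.97) are all assumed, commonly used defaults, and the function names `mel_filterbank` and `mfcc` are hypothetical. The sketch returns a frames-by-coefficients matrix, which is the kind of 2D array a 2D CNN could take as input.

```python
import numpy as np


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale."""
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert mel-scale values back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sample_rate=16000, frame_len=0.025, frame_step=0.010,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix; all defaults are assumed."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies (0.97 is an assumed factor).
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])

    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    if len(emphasized) < flen:             # pad very short inputs to one frame
        emphasized = np.pad(emphasized, (0, flen - len(emphasized)))
    n_frames = 1 + (len(emphasized) - flen) // fstep

    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(flen)

    # Power spectrum -> mel filterbank energies -> log compression.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, 1e-10))

    # DCT-II (unnormalized) decorrelates the log energies; keep n_coeffs terms.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_energies @ basis.T


if __name__ == "__main__":
    # Synthetic one-second 440 Hz tone standing in for a recorded utterance.
    t = np.arange(16000) / 16000.0
    features = mfcc(np.sin(2 * np.pi * 440.0 * t))
    cnn_input = features[..., np.newaxis]  # add a channel axis for a 2D CNN
    print(features.shape, cnn_input.shape)
```

Under these assumed settings, a one-second clip yields a small 2D feature map per utterance. The layer configuration of the 6-layer CNN is not given in the abstract, so the sketch stops at the input tensor.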