Turkish Journal of Electrical Engineering and Computer Sciences

Author ORCID Identifier

AWAIS AHMED: 0000-0002-1514-4643

Abstract

Spoken digit recognition (SDR), a type of supervised automatic speech recognition, is essential for various human-machine interaction applications, including banking operations, dialing systems, price extraction, and airline reservation systems. However, designing an effective SDR system presents several challenges, such as preparing labeled audio data, selecting appropriate feature extraction methods, and creating high-performance models. To overcome these challenges, a novel approach to robust spoken digit recognition using an integrated log spectrogram convolutional neural network (ILS-CNN) is proposed. The method places a log spectrogram layer directly within the neural network to enhance frequency resolution and improve feature extraction. By embedding the spectrogram calculation within the network, we streamline the preprocessing pipeline and mitigate the discrepancies often introduced by external feature computation. The ILS-CNN architecture demonstrates significant improvements in recognition accuracy and robustness, particularly in noisy environments, which is crucial for real-world applications. Simulation results show that the proposed method achieves an overall accuracy of 99.3% on the noise-free FSDD dataset. The method is also robust in noisy scenarios: it achieves an accuracy of 88.5% even when the signal-to-noise ratio (SNR) is as low as 0 dB.
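The two signal-processing ideas the abstract relies on, a log spectrogram and evaluation at a fixed SNR, can be illustrated with a minimal NumPy sketch. This is not the authors' ILS-CNN layer: the function names `log_spectrogram` and `add_noise_at_snr`, the Hann window, the frame length of 256, the hop of 128, and the use of white Gaussian noise are all assumptions chosen for illustration.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Log-magnitude spectrogram of shape (frames, frame_len // 2 + 1).

    Illustrative only: frame_len, hop, and the Hann window are assumed
    values, not settings taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # eps avoids log(0) for silent bins.
    return np.log(magnitude + eps)

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the given SNR in dB.

    White Gaussian noise is an assumption; the paper may use other noise types.
    """
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Example: a 1 kHz tone sampled at 8 kHz, corrupted at 0 dB SNR.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
clean = np.sin(2 * np.pi * 1000.0 * t)
noisy = add_noise_at_snr(clean, snr_db=0.0, rng=rng)
features = log_spectrogram(noisy)
```

At 0 dB SNR the noise carries as much power as the speech itself, which is why the 88.5% figure at that level is reported as evidence of robustness.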

DOI

10.55730/1300-0632.4153

Keywords

Spoken digit recognition, automatic speech recognition, convolutional neural network, English spoken digits recognition, speech feature extraction

First Page

706

Last Page

724

Publisher

The Scientific and Technological Research Council of Türkiye (TÜBİTAK)

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
