Turkish Journal of Electrical Engineering and Computer Sciences
DOI
10.3906/elk-1212-109
Abstract
Despite significant recent advances in communication technology, speech signals still suffer from low perceived quality caused by the bandwidth limitations of telephone networks. The bandwidth extension (BWE) approach adds high-frequency components of the speech signal to band-limited telephone speech and significantly improves perceived speech quality. In this work, we develop a new method for representing the vocal tract filter coefficients using log filter bank energy (LFBE) parameters as an alternative to mel-frequency cepstral coefficients (MFCCs). This approach is based on the strong correlation between the spectral components of the low-band and high-band spectra. Furthermore, the performances of Gaussian mixture model and multilayer perceptron neural network methods for estimating the high-frequency envelope are evaluated. Objective evaluations indicate that the LFBE feature vectors outperform the MFCCs. In addition, the objective evaluations show that a neural network, which is not commonly used in BWE, achieves better performance than the Gaussian mixture model.
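To illustrate how LFBE parameters relate to MFCCs, the following is a minimal sketch and not the authors' implementation. It assumes a mel-spaced triangular filter bank, an 8 kHz sampling rate, 12 filters, a 256-point FFT, and a Hamming window; none of these settings are stated in the abstract, and the function names are hypothetical. The sketch shows that MFCCs are a discrete cosine transform of the LFBE vector, so the LFBE vector stays in the log spectral domain, while the full (untruncated) MFCC vector carries the same information in a different basis.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed layout)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # Triangle: minimum of the two slopes, clipped at zero outside the band.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def lfbe(frame, bank, n_fft):
    """Log filter bank energies of one windowed speech frame."""
    windowed = frame * np.hamming(frame.size)
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2
    return np.log(bank @ power + 1e-10)  # small floor avoids log(0)

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    mat[0] /= np.sqrt(2.0)
    return mat

# Assumed settings for narrowband speech; a random frame stands in for real audio.
sample_rate, n_fft, n_filters = 8000, 256, 12
rng = np.random.default_rng(0)
frame = rng.standard_normal(200)  # 25 ms at 8 kHz

bank = mel_filterbank(n_filters, n_fft, sample_rate)
log_energies = lfbe(frame, bank, n_fft)

# MFCCs are the DCT of the LFBE vector; the inverse DCT recovers it exactly
# when no cepstral coefficients are discarded.
transform = dct_matrix(n_filters)
mfcc = transform @ log_energies
recovered = transform.T @ mfcc
```

In practice MFCC vectors are often truncated to the lowest-order coefficients, which smooths the envelope; whether and how the paper truncates is not stated in the abstract.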
Keywords
Bandwidth extension, log spectra domain, narrowband speech, neural network, wideband speech
First Page
433
Last Page
446
Recommended Citation
POURMOHAMMADI, SARA; VALI, MANSOUR; and GHADYANI, MOHSEN (2015) "Bandwidth extension of narrowband speech in log spectra domain using neural network," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 23: No. 2, Article 8.
https://doi.org/10.3906/elk-1212-109
Available at:
https://journals.tubitak.gov.tr/elektrik/vol23/iss2/8
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons