Turkish Journal of Electrical Engineering and Computer Sciences




The main aim of this paper is to introduce a new approach to enhancing speech signals that exploits the advantages of nonlocal means (NLM) estimation and empirical mode decomposition (EMD). NLM, a patch-based denoising method, is used extensively for two-dimensional signals such as images, and its use for one-dimensional signals has recently been attracting more attention. The NLM-based approach is quite useful for removing low-frequency noise by exploiting nonlocal similarities among samples of the signal, but it suffers from under-averaging in high-frequency regions. Because the temporal and spectral characteristics of a speech signal change markedly over time, conventional NLM is not effective at removing noise components from speech signals, unlike in image denoising. To address this issue, the speech signal is first decomposed into oscillatory components called intrinsic mode functions (IMFs) using a temporal decomposition technique known as the sifting process. Each IMF represents signal information at a certain scale or frequency band, and the IMFs do not exhibit abrupt power spectral changes over time. The decomposed IMFs are then processed with NLM estimation, based on nonlocal similarities, for better speech enhancement. Simulation results show that the proposed method gives better performance in terms of subjective and objective quality measures. Its performance is evaluated for white, factory, and babble noise at different signal-to-noise ratios.


Speech enhancement, denoising, nonlocal means, empirical mode decomposition
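
The pipeline outlined in the abstract (decompose into IMFs, apply NLM to each IMF) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names `nlm_1d` and `enhance_from_imfs`, the parameters `patch_half`, `search_half`, and `h`, their default values, and the Gaussian weighting are all assumptions. The IMFs are assumed to have been computed already by some EMD routine, and recombining the denoised IMFs by summation is likewise an assumption, since the abstract does not state how they are recombined.

```python
import numpy as np


def nlm_1d(x, patch_half=5, search_half=50, h=0.5):
    """Nonlocal means estimate of a 1-D signal (illustrative sketch).

    patch_half, search_half and h are hypothetical parameter names:
    half-width of the comparison patch, half-width of the search
    window, and a bandwidth controlling how fast weights decay.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    patch_len = 2 * patch_half + 1
    # Reflect-pad so that every sample has a full patch around it.
    padded = np.pad(x, patch_half, mode="reflect")
    # patches[i] is the window of length patch_len centred on x[i].
    patches = np.lib.stride_tricks.sliding_window_view(padded, patch_len)
    out = np.empty(n)
    for i in range(n):
        lo = max(0, i - search_half)
        hi = min(n, i + search_half + 1)
        # Squared distance between the patch at i and each candidate patch.
        d2 = np.sum((patches[lo:hi] - patches[i]) ** 2, axis=1)
        # Similar patches get weights near 1, dissimilar ones near 0.
        w = np.exp(-d2 / (2.0 * patch_len * h ** 2))
        out[i] = np.dot(w, x[lo:hi]) / np.sum(w)
    return out


def enhance_from_imfs(imfs, **nlm_kwargs):
    """Apply nlm_1d to each IMF (one per row) and sum the results.

    Assumption: the IMFs come from an external EMD routine, and the
    enhanced signal is taken as the sum of the denoised IMFs.
    """
    imfs = np.atleast_2d(np.asarray(imfs, dtype=float))
    return sum(nlm_1d(imf, **nlm_kwargs) for imf in imfs)
```

In this sketch the bandwidth `h` would normally be tied to the noise level, and it could be chosen separately for each IMF, since the IMFs occupy different frequency bands.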
