NMF based Speech Enhancement using CNN

Journal: GRENZE International Journal of Engineering and Technology
Authors: D. Jessintha, P. Bini Palas, Berakhah. F Stanley, R. Jeen Retna Kumar
Volume: 10 Issue: 2
Grenze ID: 01.GIJET.10.2.511 Pages: 560-566

Abstract

Convolutional Neural Networks (CNNs) are adept at learning complex patterns in data and can denoise speech signals by mapping noisy spectrograms to clean ones. Non-negative Matrix Factorization (NMF), on the other hand, is a signal processing technique that decomposes a spectrogram into additive components, including a noise profile. By estimating the noise profile from the noisy speech spectrogram, NMF enables the separation of noise from the desired signal. Combining deep learning with NMF leverages the strengths of both approaches, allowing for robust and efficient speech enhancement even in challenging noisy environments. First, noisy and clean speech samples are loaded and pre-processed: each is trimmed or padded to a fixed length, and its magnitude spectrogram is computed. A CNN-based deep learning model is then defined and compiled, with two convolutional layers for feature extraction from the input spectrograms, followed by dropout layers for regularization. The model is trained with the noisy spectrograms as input and the corresponding clean spectrograms as target output. Meanwhile, NMF is applied to the magnitude spectrogram of the noisy speech signal to estimate the noise profile and separate noise from the signal. Finally, the trained model is used to denoise the spectrogram, and the enhanced audio is reconstructed and saved. By pairing the CNN's learned mapping with NMF-based noise estimation, the approach yields improved speech enhancement performance.
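The pre-processing and NMF steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper names (`fix_length`, `magnitude_spectrogram`, `nmf`), the synthetic test signal, the frame size, the hop, the rank, and the choice of Euclidean multiplicative updates are all assumptions, since the abstract does not specify them. The abstract also does not say how components are assigned to noise, so the sketch stops at the factorization.

```python
import numpy as np

def fix_length(signal, target_len):
    """Trim or zero-pad a 1-D signal to exactly target_len samples."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) >= target_len:
        return signal[:target_len]
    return np.pad(signal, (0, target_len - len(signal)))

def magnitude_spectrogram(signal, n_fft=256, hop=128):
    """Magnitude STFT with a Hann window; shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def nmf(V, rank=4, n_iter=200, eps=1e-10, seed=0):
    """Factor V ~ W @ H (all non-negative) with multiplicative updates.

    Assumption: Euclidean cost; the source does not name the cost function.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral basis vectors
    H = rng.random((rank, n_time)) + eps   # per-frame activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical input: a 440 Hz tone in white noise, standing in for noisy speech.
sr = 8000
t = np.arange(int(0.9 * sr)) / sr
rng = np.random.default_rng(1)
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * rng.standard_normal(t.size)

noisy = fix_length(noisy, sr)            # pad to a fixed 1-second length
V = magnitude_spectrogram(noisy)         # non-negative matrix, freq x time
W, H = nmf(V)                            # additive components of the spectrogram
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(V.shape, W.shape, H.shape, round(float(rel_err), 3))
```

In a full pipeline, some columns of `W` (with their rows of `H`) would be treated as the noise profile, and a separately trained CNN would map the noisy spectrogram to a clean one; both steps depend on details that the abstract leaves open.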

