pyroomacoustics.denoise.spectral_subtraction module
- class pyroomacoustics.denoise.spectral_subtraction.SpectralSub(nfft, db_reduc, lookback, beta, alpha=1)
Bases: object
This class performs single-channel noise reduction via spectral subtraction. The instantaneous signal energy and the noise floor are estimated at each time instance (for each frequency bin), and these estimates are used to compute a gain filter with which to perform spectral subtraction.
For a given frame n, the gain for frequency bin k is given by:
\[G[k, n] = \max \left\{ \left( \dfrac{P[k, n] - \beta P_N[k, n]}{P[k, n]} \right)^\alpha, G_{min} \right\},\]where \(G_{min} = 10^{-(db\_reduc/20)}\) and \(db\_reduc\) is the maximum reduction (in dB) that we are willing to perform for each bin (a high value can actually be detrimental, see below). The instantaneous energy \(P[k,n]\) is computed by simply squaring the frequency amplitude at the bin k. The time-frequency decomposition of the input signal is typically done with the STFT and overlapping frames. The noise estimate \(P_N[k, n]\) for frequency bin k is given by looking back a certain number of frames \(L\) and selecting the bin with the lowest energy:
\[P_N[k, n] = \min_{[n-L, n]} P[k, n]\]This approach works best when the SNR is positive and the noise is rather stationary. An alternative approach for the noise estimate (also in the case of stationary noise) would be to apply a lowpass filter for each frequency bin.
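The two formulas above can be sketched in plain NumPy. This is a minimal illustration and not the library's implementation: the function name `spectral_sub_gain` and the layout of `P` as a `(num_frames, num_bins)` array are assumptions made for the example, and both the `eps` guard against division by zero and the clipping at zero before the exponent are additions that do not appear in the formulas.

```python
import numpy as np


def spectral_sub_gain(P, db_reduc, lookback, beta, alpha=1.0, eps=1e-12):
    """Gain G[k, n] for a power spectrogram P of shape (num_frames, num_bins).

    Hypothetical helper written for illustration only.
    """
    g_min = 10.0 ** (-db_reduc / 20.0)
    G = np.empty_like(P, dtype=float)
    for n in range(P.shape[0]):
        # noise floor: lowest energy per bin over frames n-L, ..., n
        start = max(0, n - lookback)
        P_N = P[start : n + 1].min(axis=0)
        # relative energy left after subtracting the overestimated noise
        ratio = (P[n] - beta * P_N) / np.maximum(P[n], eps)
        # clip at zero so that a fractional alpha stays real-valued
        # (an assumption of this sketch; the formula itself does not clip)
        G[n] = np.maximum(np.maximum(ratio, 0.0) ** alpha, g_min)
    return G


# one frequency bin, three frames: quiet, loud, quiet
P = np.array([[1.0], [100.0], [1.0]])
G = spectral_sub_gain(P, db_reduc=10, lookback=2, beta=2, alpha=1)
```

In this toy input the loud frame keeps almost all of its energy, while the two quiet frames, which coincide with the estimated noise floor, are held at \(G_{min}\).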
With a large suppression, i.e. large values for \(db\_reduc\), we can observe a typical artefact of such spectral subtraction approaches, namely “musical noise”. Here is a nice article about noise reduction and musical noise.
Adjusting the constants \(\beta\) and \(\alpha\) also presents a trade-off between suppression and undesirable artefacts, i.e. more noticeable musical noise.
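To get a feel for these constants, the gain formula can be evaluated for a single bin. The sketch below is plain Python arithmetic on the formula given earlier. The helper name `gain`, the chosen noise-to-signal ratios, and the clipping at zero before the exponent are illustrative assumptions.

```python
def gain(noise_ratio, db_reduc, beta, alpha):
    """Gain for one bin, where noise_ratio = P_N / P (hypothetical helper)."""
    g_min = 10 ** (-db_reduc / 20)
    # clip at zero before the exponent (assumption of this sketch)
    base = max(1 - beta * noise_ratio, 0.0)
    return max(base ** alpha, g_min)


# the floor set by db_reduc: 10 dB gives about 0.316, 25 dB about 0.056
floor_10 = gain(1.0, db_reduc=10, beta=1, alpha=1)
floor_25 = gain(1.0, db_reduc=25, beta=1, alpha=1)

# a bin whose noise floor is 1% of its energy (about 20 dB SNR)
mild = gain(0.01, db_reduc=25, beta=1, alpha=1)     # about 0.99, nearly untouched
pushed = gain(0.01, db_reduc=25, beta=30, alpha=1)  # about 0.70, beta pushes it down
steep = gain(0.01, db_reduc=25, beta=30, alpha=3)   # about 0.343, alpha steepens it
```

Even a bin with fairly high SNR is attenuated noticeably once \(\beta\) and \(\alpha\) are large, which is one way the suppression can start to affect the desired signal.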
Below is an example of how to use this class to emulate a streaming/online input. A full example can be found here.
```python
import pyroomacoustics as pra

# initialize STFT and SpectralSub objects
nfft = 512
stft = pra.transform.STFT(nfft, hop=nfft//2,
                          analysis_window=pra.hann(nfft))
scnr = pra.denoise.SpectralSub(nfft, db_reduc=10, lookback=5,
                               beta=20, alpha=3)

# apply block-by-block
# (`num_blocks` and the current input block `mono_noisy` are assumed
# to be provided by the surrounding streaming code)
for n in range(num_blocks):

    # go to frequency domain for noise reduction
    stft.analysis(mono_noisy)
    gain_filt = scnr.compute_gain_filter(stft.X)

    # apply the gain filter and go back to the time domain
    mono_denoised = stft.synthesis(gain_filt*stft.X)
```
There also exists a “one-shot” function.
```python
from pyroomacoustics.denoise.spectral_subtraction import apply_spectral_sub

# import or create `noisy_signal`
denoised_signal = apply_spectral_sub(noisy_signal, nfft=512,
                                     db_reduc=10, lookback=5,
                                     beta=20, alpha=3)
```
- Parameters
nfft (int) – FFT size. Length of gain filter, i.e. the number of frequency bins, is given by nfft//2+1.
db_reduc (float) – Maximum reduction in dB for each bin.
lookback (int) – How many frames to look back for the noise estimate.
beta (float) – Overestimation factor to “push” the gain filter value (at each frequency) closer to the dB reduction specified by db_reduc.
alpha (float, optional) – Exponent factor to modify transition behavior towards the dB reduction specified by db_reduc. Default is 1.
- compute_gain_filter(X)
- Parameters
X (numpy array) – Complex spectrum of length nfft//2+1.
- Returns
Gain filter to multiply given spectrum with.
- Return type
numpy array
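As a shape check, a real frame of length nfft has nfft//2+1 non-redundant frequency bins, which is why the spectrum and the gain filter share that length. The sketch below uses NumPy's real FFT rather than the library's STFT class, and the constant gain vector is only a placeholder standing in for the output of compute_gain_filter.

```python
import numpy as np

nfft = 512
rng = np.random.default_rng(0)
frame = rng.standard_normal(nfft)        # one real-valued time-domain frame

X = np.fft.rfft(frame, n=nfft)           # complex spectrum, nfft//2 + 1 bins
gain_filt = np.full(nfft // 2 + 1, 0.5)  # placeholder gain, one value per bin

Y = gain_filt * X                        # element-wise gain in the frequency domain
y = np.fft.irfft(Y, n=nfft)              # back to a real frame of length nfft
```

With a constant gain of 0.5 the output frame is simply the input scaled by 0.5. A real gain filter varies per bin and per frame.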
- pyroomacoustics.denoise.spectral_subtraction.apply_spectral_sub(noisy_signal, nfft=512, db_reduc=25, lookback=12, beta=30, alpha=1)
One-shot function to apply spectral subtraction approach.
- Parameters
noisy_signal (numpy array) – Real signal in time domain.
nfft (int) – FFT size. Length of gain filter, i.e. the number of frequency bins, is given by nfft//2+1.
db_reduc (float) – Maximum reduction in dB for each bin.
lookback (int) – How many frames to look back for the noise estimate.
beta (float) – Overestimation factor to “push” the gain filter value (at each frequency) closer to the dB reduction specified by db_reduc.
alpha (float, optional) – Exponent factor to modify transition behavior towards the dB reduction specified by db_reduc. Default is 1.
- Returns
Enhanced/denoised signal.
- Return type
numpy array