pyroomacoustics.denoise.spectral_subtraction module

class pyroomacoustics.denoise.spectral_subtraction.SpectralSub(nfft, db_reduc, lookback, beta, alpha=1)

Bases: object

This class performs single-channel noise reduction via spectral subtraction. The instantaneous signal energy and the noise floor are estimated at each time instance (for each frequency bin), and these estimates are used to compute a gain filter with which to perform spectral subtraction.

For a given frame n, the gain for frequency bin k is given by:

\[\begin{split}G[k, n] = \max \left \{ \left ( \dfrac{P[k, n]-\beta P_N[k, n]}{P[k, n]} \right )^\alpha, G_{min} \right \},\end{split}\]

where \(G_{min} = 10^{-(db\_reduc/20)}\) and \(db\_reduc\) is the maximum reduction (in dB) that we are willing to perform for each bin (a high value can actually be detrimental, see below). The instantaneous energy \(P[k,n]\) is computed by squaring the magnitude of the spectrum at bin k. The time-frequency decomposition of the input signal is typically done with the STFT and overlapping frames. The noise estimate \(P_N[k, n]\) for frequency bin k is obtained by looking back a certain number of frames \(L\) and selecting the lowest energy observed in that bin:

\[P_N[k, n] = \min_{m \in [n-L, n]} P[k, m]\]
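The two formulas above can be sketched in plain NumPy as follows. This is a minimal illustration and not the library's implementation: the helper name `spectral_sub_gains`, the small constant guarding against division by zero, and the clipping of negative differences to zero before applying the exponent are all assumptions made for this example.

```python
import numpy as np

def spectral_sub_gains(P, db_reduc, lookback, beta, alpha=1.0):
    """Gain G[k, n] for a power spectrogram P of shape (num_bins, num_frames).

    Hypothetical helper for illustration; not part of pyroomacoustics.
    """
    P = np.asarray(P, dtype=float)
    g_min = 10 ** (-db_reduc / 20)
    num_frames = P.shape[1]
    G = np.empty_like(P)
    for n in range(num_frames):
        start = max(0, n - lookback)
        # noise floor: lowest energy per bin over frames n-L .. n
        P_N = P[:, start:n + 1].min(axis=1)
        # small constant avoids division by zero for silent bins (assumption)
        ratio = (P[:, n] - beta * P_N) / np.maximum(P[:, n], 1e-12)
        # clip negatives before the exponent (assumption), then floor at G_min
        G[:, n] = np.maximum(np.maximum(ratio, 0.0) ** alpha, g_min)
    return G

# one frequency bin, three frames: quiet, loud, quiet
P_example = np.array([[1.0, 4.0, 1.0]])
G_example = spectral_sub_gains(P_example, db_reduc=20, lookback=2, beta=1)
```

In this toy input the quiet frames equal the noise floor, so their gain drops to \(G_{min}\), while the loud frame keeps a gain of (4 - 1)/4.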

This approach works best when the SNR is positive and the noise is rather stationary. An alternative approach for the noise estimate (also in the case of stationary noise) would be to apply a lowpass filter for each frequency bin.
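One way to picture the lowpass alternative is a first-order recursive average applied independently to each frequency bin. The sketch below is only an assumption about what such a filter could look like: the helper `lowpass_noise_estimate` and its `smoothing` coefficient are hypothetical and are not part of this module.

```python
import numpy as np

def lowpass_noise_estimate(P, smoothing=0.9):
    """Recursive per-bin average of a power spectrogram P (num_bins, num_frames).

    Hypothetical helper for illustration; not part of pyroomacoustics.
    """
    P = np.asarray(P, dtype=float)
    P_N = np.empty_like(P)
    # initialise the estimate with the first frame
    P_N[:, 0] = P[:, 0]
    for n in range(1, P.shape[1]):
        # one-pole lowpass along the time axis, separately for every bin
        P_N[:, n] = smoothing * P_N[:, n - 1] + (1 - smoothing) * P[:, n]
    return P_N

# a sudden jump in energy is only partially followed by the estimate
P_jump = np.array([[0.0, 10.0]])
P_N_jump = lowpass_noise_estimate(P_jump, smoothing=0.5)
```

Unlike the minimum over a lookback window, this estimate follows the average energy, so it would tend to absorb part of a sustained signal into the noise estimate.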

With a large suppression, i.e. large values for \(db\_reduc\), we can observe a typical artefact of such spectral subtraction approaches, namely “musical noise”. Here is a nice article about noise reduction and musical noise.

Adjusting the constants \(\beta\) and \(\alpha\) also presents a trade-off between suppression and undesirable artefacts, i.e. more noticeable musical noise.
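To get a feel for this trade-off, the gain formula can be evaluated for a single bin with a few parameter choices. The helper `bin_gain` below is hypothetical and written only for this illustration; it clips negative differences to zero, which is an assumption of the sketch.

```python
def bin_gain(p, p_n, beta, alpha, db_reduc):
    """Gain for one bin; hypothetical helper, not a pyroomacoustics function."""
    g_min = 10 ** (-db_reduc / 20)
    # negative differences are clipped to zero (an assumption of this sketch)
    ratio = max(p - beta * p_n, 0.0) / p
    return max(ratio ** alpha, g_min)

# bin with energy 100 over a noise floor of 1 (20 dB above the floor)
strong = [bin_gain(100.0, 1.0, beta, alpha, db_reduc=10)
          for beta, alpha in [(1, 1), (20, 1), (20, 3)]]

# bin with energy 10 over the same floor: beta=20 pushes it down to G_min
weak = bin_gain(10.0, 1.0, beta=20, alpha=3, db_reduc=10)
```

With these numbers the gain of the strong bin falls from 0.99 to 0.8 to roughly 0.51 as \(\beta\) and \(\alpha\) grow, while the weak bin is held at \(G_{min}\) (about 0.32 for a 10 dB reduction).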

Below is an example of how to use this class to emulate a streaming/online input. A full example can be found here.

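import numpy as np
import pyroomacoustics as pra

# initialize STFT and SpectralSub objects
nfft = 512
hop = nfft // 2
stft = pra.transform.STFT(nfft, hop=hop,
                          analysis_window=pra.hann(nfft),
                          streaming=True)
scnr = pra.denoise.SpectralSub(nfft, db_reduc=10, lookback=5,
                               beta=20, alpha=3)

# import or create `mono_noisy` (real, single-channel, time domain)
num_blocks = len(mono_noisy) // hop
mono_denoised = np.zeros(num_blocks * hop)

# apply block-by-block
for n in range(num_blocks):
    block = mono_noisy[n * hop:(n + 1) * hop]

    # go to frequency domain for noise reduction
    stft.analysis(block)
    gain_filt = scnr.compute_gain_filter(stft.X)

    # back to time domain after applying the gain filter
    mono_denoised[n * hop:(n + 1) * hop] = stft.synthesis(gain_filt * stft.X)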

There also exists a “one-shot” function.

from pyroomacoustics.denoise.spectral_subtraction import apply_spectral_sub

# import or create `noisy_signal`
denoised_signal = apply_spectral_sub(noisy_signal, nfft=512,
                                     db_reduc=10, lookback=5,
                                     beta=20, alpha=3)
Parameters
  • nfft (int) – FFT size. Length of gain filter, i.e. the number of frequency bins, is given by nfft//2+1.

  • db_reduc (float) – Maximum reduction in dB for each bin.

  • lookback (int) – How many frames to look back for the noise estimate.

  • beta (float) – Overestimation factor to “push” the gain filter value (at each frequency) closer to the dB reduction specified by db_reduc.

  • alpha (float, optional) – Exponent factor to modify transition behavior towards the dB reduction specified by db_reduc. Default is 1.

compute_gain_filter(X)
Parameters

X (numpy array) – Complex spectrum of length nfft//2+1.

Returns

Gain filter to multiply given spectrum with.

Return type

numpy array
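The expected input length of nfft//2+1 matches the one-sided spectrum of a real frame. The NumPy-only sketch below illustrates the shapes involved; the constant `gain_filt` is just a placeholder standing in for the output of compute_gain_filter, and the random frame is made-up test data.

```python
import numpy as np

nfft = 512
rng = np.random.default_rng(0)
frame = rng.standard_normal(nfft)      # one real time-domain frame

X = np.fft.rfft(frame)                 # complex, one-sided spectrum
num_bins = X.shape[0]                  # nfft // 2 + 1 bins

# placeholder gain with one real value per bin (stand-in for the real filter)
gain_filt = np.full(num_bins, 0.5)

# element-wise multiplication, then back to the time domain
frame_out = np.fft.irfft(gain_filt * X, n=nfft)
```

Because the gain is real and applied per bin, it scales the magnitude of each frequency component and leaves the phase of the noisy spectrum unchanged.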

pyroomacoustics.denoise.spectral_subtraction.apply_spectral_sub(noisy_signal, nfft=512, db_reduc=25, lookback=12, beta=30, alpha=1)

One-shot function to apply spectral subtraction approach.

Parameters
  • noisy_signal (numpy array) – Real signal in time domain.

  • nfft (int) – FFT size. Length of gain filter, i.e. the number of frequency bins, is given by nfft//2+1.

  • db_reduc (float) – Maximum reduction in dB for each bin.

  • lookback (int) – How many frames to look back for the noise estimate.

  • beta (float) – Overestimation factor to “push” the gain filter value (at each frequency) closer to the dB reduction specified by db_reduc.

  • alpha (float, optional) – Exponent factor to modify transition behavior towards the dB reduction specified by db_reduc. Default is 1.

Returns

Enhanced/denoised signal.

Return type

numpy array