STFT
Class for real-time STFT analysis and processing.
- class pyroomacoustics.transform.stft.STFT(N, hop=None, analysis_window=None, synthesis_window=None, channels=1, transform='numpy', streaming=True, precision='double', **kwargs)
Bases: object
A class for STFT processing.
- Parameters:
N (int) – number of samples per frame
hop (int) – hop size
analysis_window (numpy array) – window applied to block before analysis
synthesis_window (numpy array) – window applied to the block before synthesis
channels (int) – number of signals
transform (str, optional) – which FFT package to use: ‘numpy’ (default), ‘pyfftw’, or ‘mkl’
streaming (bool, optional) – if True (default), samples are “stitched” between repeated calls of ‘analysis’ and ‘synthesis’, as needed when receiving a continuous stream of samples; if False, no stitching is done.
num_frames (int, optional) – number of frames to be processed. If set, this is strictly enforced, because the STFT block allocates memory accordingly. If not set, the number of frames sent to analysis/process/synthesis is not checked.
- NOTE:
1) num_frames = 0 corresponds to the “real-time” case, in which each input block contains [hop] samples.
2) num_frames > 0 requires [(num_frames-1)*hop + N] samples, as the last frame must contain [N] samples.
precision (string, np.float32, np.float64, np.complex64, np.complex128, optional) – precision, in bits, to use for the input. With ‘single’/np.float32/np.complex64, real inputs use 32 bits and the complex spectrum 64 bits. Otherwise, real inputs are cast to 64 bits and the complex spectrum to 128 bits (default).
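The sample-count rule in the num_frames note can be checked with a few lines of arithmetic. The following is a minimal sketch; `required_samples` is a hypothetical helper introduced here for illustration and is not part of pyroomacoustics.

```python
def required_samples(num_frames: int, hop: int, N: int) -> int:
    """Number of input samples implied by the num_frames note.

    Hypothetical helper for illustration only (not a library function).
    """
    if num_frames <= 0:
        # "Real-time" case: each input block holds `hop` samples.
        return hop
    # Fixed number of frames: the last frame must still contain N samples.
    return (num_frames - 1) * hop + N


# Example values chosen arbitrarily for illustration.
print(required_samples(num_frames=0, hop=256, N=512))
print(required_samples(num_frames=10, hop=256, N=512))
```

With N = 512 and hop = 256, ten frames need 9 * 256 + 512 samples, since consecutive frames overlap by N - hop samples.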
- analysis(x)
- Parameters:
x (2D numpy array, [samples, channels]) – Time-domain signal.
- process(X=None)
- Parameters:
X (numpy array) – X can take on multiple shapes:
1) (N,) if it is single channel and only one frame
2) (N,D) if it is multi-channel and only one frame
3) (F,N) if it is single channel but multiple frames
4) (F,N,D) if it is multi-channel and multiple frames
where F is the number of frames, N is the number of frequency bins, and D is the number of channels.
- Returns:
x_r – Reconstructed time-domain signal.
- Return type:
numpy array
- reset()
Reset state variables. Necessary after changing or setting the filter or zero padding.
- set_filter(coeff, zb=None, zf=None, freq=False)
Set time-domain FIR filter with appropriate zero-padding. Frequency spectrum of the filter is computed and set for the object. There is also a check for sufficient zero-padding.
- Parameters:
coeff (numpy array) – Filter coefficients, in the time domain unless freq is True.
zb (int) – Amount of zero-padding added to back/end of frame.
zf (int) – Amount of zero-padding added to front/beginning of frame.
freq (bool) – Whether or not given coefficients (coeff) are in the frequency domain.
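Zero-padding matters here because multiplying spectra corresponds to circular convolution in time. The NumPy-only sketch below illustrates the usual criterion, FFT length >= frame length + filter length - 1; this criterion and all names in the code are assumptions for illustration, not a statement of the exact check performed by set_filter.

```python
import numpy as np

N = 8                              # samples per frame (illustrative value)
h = np.array([1.0, 0.5, 0.25])     # hypothetical time-domain FIR filter
zb = len(h) - 1                    # zero-padding appended to the end of the frame
nfft = N + zb                      # FFT length after padding

rng = np.random.default_rng(0)
frame = rng.standard_normal(N)

# Assumed sufficiency criterion: room for the full linear convolution.
padding_ok = nfft >= N + len(h) - 1

# With enough padding, frequency-domain filtering equals linear convolution.
y_fft = np.fft.irfft(np.fft.rfft(frame, n=nfft) * np.fft.rfft(h, n=nfft), n=nfft)
y_lin = np.convolve(frame, h)      # length N + len(h) - 1

# Without padding, the filter tail wraps around onto the start of the frame.
y_short = np.fft.irfft(np.fft.rfft(frame, n=N) * np.fft.rfft(h, n=N), n=N)

print(padding_ok, np.allclose(y_fft, y_lin), np.allclose(y_short, y_lin[:N]))
```

The unpadded result differs from the linear convolution only in its first len(h) - 1 samples, which is where the wrapped tail lands.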
- synthesis(X=None)
- Parameters:
X (numpy array of frequency content) – X can take on multiple shapes:
1) (N,) if it is single channel and only one frame
2) (N,D) if it is multi-channel and only one frame
3) (F,N) if it is single channel but multiple frames
4) (F,N,D) if it is multi-channel and multiple frames
where:
- F is the number of frames
- N is the number of frequency bins
- D is the number of channels
- Returns:
x_r – Reconstructed time-domain signal.
- Return type:
numpy array
- zero_pad_back(zb)
Set zero-padding at end of frame.
- zero_pad_front(zf)
Set zero-padding at beginning of frame.
- pyroomacoustics.transform.stft.analysis(x, L, hop, win=None, zp_back=0, zp_front=0)
Convenience function for one-shot STFT.
- Parameters:
x (array_like, (n_samples) or (n_samples, n_channels)) – input signal
L (int) – frame size
hop (int) – shift size between frames
win (array_like) – the window to apply (default None)
zp_back (int) – zero padding to apply at the end of the frame
zp_front (int) – zero padding to apply at the beginning of the frame
- Returns:
X – The STFT of x
- Return type:
ndarray, (n_frames, n_frequencies) or (n_frames, n_frequencies, n_channels)
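To make the (n_frames, n_frequencies) return shape concrete, here is a NumPy-only sketch of a one-shot STFT for a single-channel signal without zero-padding. `stft_sketch` is a hypothetical stand-in, not the library function, and its handling of trailing samples (an incomplete last frame is dropped) is an assumption.

```python
import numpy as np


def stft_sketch(x, L, hop, win=None):
    """Frame a 1-D signal, window each frame, and take the real FFT.

    Illustrative only; assumes len(x) >= L and no zero-padding.
    """
    x = np.asarray(x, dtype=float)
    # Assumption: samples that do not fill a complete frame are dropped.
    n_frames = 1 + (len(x) - L) // hop
    frames = np.stack([x[i * hop : i * hop + L] for i in range(n_frames)])
    if win is not None:
        frames = frames * win          # window broadcast over all frames
    return np.fft.rfft(frames, axis=1) # L // 2 + 1 bins per frame


x = np.arange(1024, dtype=float)
X = stft_sketch(x, L=256, hop=128, win=np.hanning(256))
print(X.shape)  # (7, 129)
```

Here 1024 samples with L = 256 and hop = 128 give 7 frames, and a real FFT of length 256 gives 129 frequency bins.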
- pyroomacoustics.transform.stft.compute_synthesis_window(analysis_window, hop)
Computes the optimal synthesis window given an analysis window and hop (frame shift). The procedure is described in: D. Griffin and J. Lim, Signal estimation from modified short-time Fourier transform, IEEE Trans. Acoustics, Speech, and Signal Process., vol. 32, no. 2, pp. 236-243, 1984.
- Parameters:
analysis_window (array_like) – The analysis window
hop (int) – The frame shift
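The least-squares result of Griffin and Lim amounts to dividing the analysis window by the sum of its squared copies shifted by multiples of the hop. The sketch below implements that formula in plain NumPy; `synthesis_window_sketch` is a hypothetical name, and the code is not claimed to match the library implementation line for line.

```python
import numpy as np


def synthesis_window_sketch(analysis_window, hop):
    """Analysis window divided by the sum of its squared, hop-shifted copies."""
    w = np.asarray(analysis_window, dtype=float)
    L = len(w)
    norm = np.zeros(L)
    k_max = (L - 1) // hop             # largest shift that still overlaps [0, L)
    for k in range(-k_max, k_max + 1):
        shift = k * hop
        lo, hi = max(0, shift), min(L, L + shift)
        # Add the squared window shifted by `shift`, restricted to [0, L).
        norm[lo:hi] += w[lo - shift : hi - shift] ** 2
    return w / norm


wa = np.hamming(512)                   # illustrative analysis window
ws = synthesis_window_sketch(wa, hop=128)
print(ws.shape)
```

With this choice, the products of analysis and synthesis windows overlap-add to one at every sample, which is the condition for reconstructing an unmodified signal. The sketch assumes the normalisation sum is nonzero everywhere, which holds for a Hamming window.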
- pyroomacoustics.transform.stft.synthesis(X, L, hop, win=None, zp_back=0, zp_front=0)
Convenience function for one-shot inverse STFT.
- Parameters:
X (array_like (n_frames, n_frequencies) or (n_frames, n_frequencies, n_channels)) – The data
L (int) – frame size
hop (int) – shift size between frames
win (array_like) – the window to apply (default None)
zp_back (int) – zero padding to apply at the end of the frame
zp_front (int) – zero padding to apply at the beginning of the frame
- Returns:
x – The inverse STFT of X
- Return type:
ndarray, (n_samples) or (n_samples, n_channels)
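As a rough picture of what a one-shot inverse STFT does, the NumPy-only sketch below inverts each frame, applies a synthesis window, and overlap-adds the results. `istft_sketch` is a hypothetical stand-in for illustration; the window choice (square root of a periodic Hann window used for both analysis and synthesis, with hop = L/2) is an assumption made so that the interior of the signal is reconstructed exactly.

```python
import numpy as np


def istft_sketch(X, L, hop, win=None):
    """Inverse real FFT per frame, optional synthesis window, overlap-add."""
    frames = np.fft.irfft(X, n=L, axis=1)
    if win is not None:
        frames = frames * win
    n_frames = frames.shape[0]
    x = np.zeros((n_frames - 1) * hop + L)
    for i in range(n_frames):
        x[i * hop : i * hop + L] += frames[i]   # overlap-add
    return x


L, hop = 256, 128
win = np.sin(np.pi * np.arange(L) / L)          # sqrt of a periodic Hann window

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)

# Forward transform: frame, window, real FFT (single channel, no zero-padding).
n_frames = 1 + (len(x) - L) // hop
frames = np.stack([x[i * hop : i * hop + L] * win for i in range(n_frames)])
X = np.fft.rfft(frames, axis=1)

x_rec = istft_sketch(X, L, hop, win=win)
print(x_rec.shape, np.allclose(x_rec[hop:-hop], x[hop:-hop]))
```

The first and last hop samples are covered by a single frame only, so they come back attenuated by the window; everything in between matches the input.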