pyroomacoustics.acoustics module¶

class pyroomacoustics.acoustics.AntoniOctaveFilterBank(base_frequency: float = 125.0, fs: float = 16000, n_fft: int = 512, band_overlap_ratio: float = 0.5, slope: int = 0, third: bool = False)¶

Bases: BaseOctaveFilterBank

This class implements a type of fractional octave filter bank with both perfect reconstruction and energy conservation.

J. Antoni, Orthogonal-like fractional-octave-band filters, J. Acoust. Soc. Am., 127, 2, February 2010

base_freq¶

The center frequency of the first octave band

Type:: float

fs¶

The target sampling frequency

Type:: float

n_bands¶

The number of octave bands needed to cover from base_freq to fs / 2 (i.e. floor(log2(fs / base_freq)))

Type:: int

bands¶

The list of bin boundaries for the octave bands

Type:: list of tuple

centers¶: The list of band centers
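As a rough illustration of the n_bands rule quoted above, the following sketch evaluates floor(log2(fs / base_freq)) for the default arguments. The helper names are hypothetical, and the doubling layout of the centers is an assumption rather than something this page states.

```python
import math


def octave_band_count(base_freq: float, fs: float) -> int:
    """Band count from the floor(log2(fs / base_freq)) rule in the n_bands description."""
    return math.floor(math.log2(fs / base_freq))


def octave_centers(base_freq: float, fs: float) -> list:
    """Assumed layout (not confirmed by this page): each center doubles the previous one."""
    return [base_freq * 2.0**k for k in range(octave_band_count(base_freq, fs))]


# Defaults from the class signature: base_frequency=125.0, fs=16000
n_bands = octave_band_count(125.0, 16000.0)
centers = octave_centers(125.0, 16000.0)
```

Under these assumptions the defaults give seven bands, with the last center at fs / 2.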

Parameters:

base_frequency (float, optional) – The center frequency of the first octave band (default: 125 Hz)
fs (float, optional) – The sampling frequency used (default: 16000 Hz)
n_fft (int, optional) – The FFT size to use
band_overlap_ratio (float) – The overlap between bands. It should be between 0.0 and 0.5.
slope (int) – A parameter controlling the transition between bands. The larger, the sharper the transition.
third (bool) – If set to True, a third Octave band filter bank is created.

analysis(x, band=None, oversampling=2)¶

Process a signal x through the filter bank

Parameters:

x (ndarray (..., n_samples)) – The input signal
band (int) – The index of the band to transform. If None, all the bands are analyzed and returned.
oversampling (int) – Oversampling of FFT to use (default: 2).

Returns:

The input signal filtered through all the bands

Return type:

ndarray (…, n_samples, n_bands)

energy(x, oversampling=2)¶

Computes the per-band energy of the input signal.

Parameters:

x (np.ndarray (..., n_samples)) – The signal to analyze.
oversampling (int, optional) – The oversampling to use in the analysis (default 2).

Returns:

The per-band energy of the input signal.

Return type:

np.ndarray (…, n_bands)

get_bw(n_fft=None)¶: Returns the bandwidth of the bands

synthesis(band_magnitudes, min_phase=False, filter_length=None, oversampling=2)¶

Creates a filter with the desired band amplitudes.

Parameters:

band_magnitudes (np.ndarray) – The band amplitude coefficients (…, n_bands)
min_phase (bool) – The filters are made minimum phase if True.
filter_length (int) – The length of the filters.
oversampling (int) – The oversampling to use in the analysis.

Return type:

The impulse responses with the correct levels (…, n_fft)

wavelet_analysis(x, band=None, oversampling=2)¶

Compute the decomposition proposed by Antoni 2008.

Parameters:

x (ndarray (..., n_samples)) – The input signal
band – The index of the band to transform. If None, all the bands are analyzed and returned.
oversampling (int) – Oversampling of FFT to use (default: 2).

Returns:

signal (list[np.ndarray]) – The coefficients of the input signal filters obtained by time-frequency analysis.
parameters (AntoniOctaveFilterBankParameters) – A data structure that contains the parameters used during the analysis.

wavelet_synthesis(signal: List[np.ndarray], parameters: AntoniOctaveFilterBankParameters) → np.ndarray¶

Given the decomposition of the signal by Antoni 2008, compute the octave band signals.

Parameters:

signal (list[np.ndarray]) – A list containing the coefficients corresponding to every octave band.
parameters (AntoniOctaveFilterBankParameters) – The parameters of the analysis filter bank.

Returns:

The time domain representation of the octave bands at the original sampling rate.

Return type:

np.ndarray (…, num_samples, num_bands)

class pyroomacoustics.acoustics.AntoniOctaveFilterBankParameters(windows: List, n_fft: int, analyzed_band_indices: List[int], bands_lower_bins, bands_center_bins, bands_upper_bins, output_length: int, output_dtype: type, padded_length: int)¶

Bases: object

A data structure to hold the parameters used for the analysis of a signal with the Antoni octave filter bank.

analyzed_band_indices: List[int]¶

bands_center_bins¶

bands_lower_bins¶

bands_upper_bins¶

n_fft: int¶

output_dtype: type¶

output_length: int¶

padded_length: int¶

windows: List¶

class pyroomacoustics.acoustics.BaseOctaveFilterBank¶

Bases: object

A base class for octave filter banks.

abstractmethod analysis(x, band=None, **kwargs)¶

abstractmethod energy(x, **kwargs)¶

abstractmethod get_bw()¶

abstractmethod synthesis(coeffs, min_phase=False, **kwargs)¶

class pyroomacoustics.acoustics.OctaveBandsFactory(base_frequency=125.0, fs=16000, n_fft=512, keep_dc=False, min_phase=False)¶

Bases: BaseOctaveFilterBank

A class to process uniformly all properties that are defined on octave bands.

Each property is stored for an octave band.

base_freq¶

The center frequency of the first octave band

Type:: float

fs¶

The target sampling frequency

Type:: float

n_bands¶

The number of octave bands needed to cover from base_freq to fs / 2 (i.e. floor(log2(fs / base_freq)))

Type:: int

bands¶

The list of bin boundaries for the octave bands

Type:: list of tuple

centers¶: The list of band centers

Parameters:

base_frequency (float, optional) – The center frequency of the first octave band (default: 125 Hz)
fs (float, optional) – The sampling frequency used (default: 16000 Hz)
n_fft (int, optional) – The FFT size to use
keep_dc (bool) – If True, include all the lower frequencies in the first filter
min_phase (bool) – If True, make the filters minimum phase

analysis(x, band=None, mode='same')¶

Process a signal x through the filter bank

Parameters:

x (ndarray (n_samples)) – The input signal
band – The index of the band to transform. If None, all the bands are analyzed and returned.
mode – The mode to use for fftconvolve.

Returns:

The input signal filtered through all the bands

Return type:

ndarray (n_samples, n_bands)

energy(x)¶

Computes the per-band energy of the input signal.

Parameters:: x (np.ndarray (..., n_samples)) – The signal to analyze.
Returns:: The per-band energy of the input signal.
Return type:: np.ndarray (…, n_bands)

get_bw()¶: Returns the bandwidth of the bands

synthesis(coeffs, min_phase=False)¶

Creates a filter with the desired band amplitudes.

Parameters:

coeffs (np.ndarray) – The band amplitude coefficients (…, n_bands)
min_phase (bool) – The filters are made minimum phase if True.

Return type:

The impulse responses with the correct levels (…, n_fft)

pyroomacoustics.acoustics.antoni_magnitude_octave_filter_response(n_fft, centers, bands, fs, overlap_ratio, slope)¶: Implementation adapted from https://github.com/pyfar/pyfar/blob/main/pyfar/dsp/filter/fractional_octaves.py#L339 MIT License.

pyroomacoustics.acoustics.bandpass_filterbank(bands, fs=1.0, order=8, output='sos')¶

Create a bank of Butterworth bandpass filters

Parameters:

bands (array_like, shape == (n, 2)) – The list of bands [[flo1, fup1], [flo2, fup2], ...]
fs (float, optional) – Sampling frequency (default 1.)
order (int, optional) – The order of the IIR filters (default: 8)
output ({'ba', 'zpk', 'sos'}) – Type of output: numerator/denominator (‘ba’), pole-zero (‘zpk’), or second-order sections (‘sos’). Default is ‘sos’.

Returns:

b, a (ndarray, ndarray) – Numerator (b) and denominator (a) polynomials of the IIR filter. Only returned if output=’ba’.
z, p, k (ndarray, ndarray, float) – Zeros, poles, and system gain of the IIR filter transfer function. Only returned if output=’zpk’.
sos (ndarray) – Second-order sections representation of the IIR filter. Only returned if output=’sos’.

pyroomacoustics.acoustics.bands_hz2s(bands_hz, Fs, N, transform='dft')¶: Converts bands given in Hertz to samples with respect to a given sampling frequency Fs and a transform size N. An optional transform type is used to handle the DCT case.

pyroomacoustics.acoustics.binning(S, bands)¶: This function computes the sum of all columns of S in the subbands enumerated in bands

pyroomacoustics.acoustics.cosine_magnitude_octave_filter_response(n_fft, centers, fs, keep_dc=True)¶: Creates the magnitude response of a cosine octave-band filterbank as described in D. Schroeder’s PhD thesis.

pyroomacoustics.acoustics.critical_bands()¶: Compute the Critical bands as defined in the book: Psychoacoustics by Zwicker and Fastl. Table 6.1 p. 159

pyroomacoustics.acoustics.inverse_sabine(rt60, room_dim, c=None)¶

Given the desired reverberation time (RT60, i.e. the time for the energy to drop by 60 dB), the dimensions of a rectangular room (shoebox), and the speed of sound, computes the energy absorption coefficient and the maximum image source order needed. If c is not given, the package-wide default speed of sound (in constants) is used.

Parameters:

rt60 (float) – desired RT60 (time it takes to go from full amplitude to 60 dB decay) in seconds
room_dim (list of floats) – list of length 2 or 3 of the room side lengths
c (float) – speed of sound

Returns:

absorption (float) – the energy absorption coefficient to be passed to room constructor
max_order (int) – the maximum image source order necessary to achieve the desired RT60
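To make the inversion concrete, here is a minimal sketch that solves the textbook Sabine formula for the absorption coefficient of a 3D shoebox. It is not the library's implementation: the function name, the default speed of sound of 343 m/s, and the error handling are assumptions, and the maximum image source order is not computed.

```python
import math


def sabine_absorption(rt60, room_dim, c=343.0):
    """Solve RT60 = 24 ln(10) V / (c a S) for a, for a 3D shoebox (hypothetical helper)."""
    if rt60 <= 0:
        raise ValueError("rt60 must be positive")
    lx, ly, lz = room_dim
    volume = lx * ly * lz
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)
    absorption = 24.0 * math.log(10.0) * volume / (c * rt60 * surface)
    if absorption > 1.0:
        # No physical absorption coefficient reaches such a short RT60 in this room.
        raise ValueError("requested rt60 is too short for this room")
    return absorption


# A 5 m x 4 m x 3 m room with a target RT60 of 0.5 s.
a = sabine_absorption(0.5, [5.0, 4.0, 3.0])
```

The returned value plays the role of the absorption output described above.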

pyroomacoustics.acoustics.invmelscale(b)¶: Converts from melscale to frequency in Hertz according to Huang-Acero-Hon (6.143)

pyroomacoustics.acoustics.magnitude_response_to_minimum_phase(magnitude_response, n_fft, axis=-1, eps=1e-05)¶

Creates a minimum phase filter from its magnitude response following the method proposed here. https://ccrma.stanford.edu/~jos/sasp/Minimum_Phase_Filter_Design.html

Parameters:

magnitude_response (np.ndarray) – The response
n_fft (int) – The FFT size to use
axis (int) – The axis where to make the transformation
eps (float) – A small positive constant to use for numerical stability.

Return type:

The minimum phase impulse response with given magnitude response.

pyroomacoustics.acoustics.melfilterbank(M, N, fs=1, fl=0.0, fh=0.5)¶

Returns a filter bank of triangular filters spaced according to mel scale

We follow Huang-Acero-Hon 6.5.2

Parameters:

M (int) – The number of filters in the bank
N (int) – The length of the DFT
fs (float, optional) – The sampling frequency (default 1)
fl (float) – Lowest frequency in filter bank as a fraction of fs (default 0.)
fh (float) – Highest frequency in filter bank as a fraction of fs (default 0.5)

Return type:

An M times int(N/2)+1 ndarray that contains one filter per row

pyroomacoustics.acoustics.melscale(f)¶: Converts f (in Hertz) to the melscale defined according to Huang-Acero-Hon (2.6)
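For reference, the mel mapping commonly attributed to Huang-Acero-Hon and its inverse can be sketched as below. The function names are hypothetical, and the constants 1125 and 700 are the commonly quoted form of those equations, assumed here rather than read from the library source.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Commonly quoted form: B(f) = 1125 ln(1 + f / 700)."""
    return 1125.0 * math.log(1.0 + f_hz / 700.0)


def mel_to_hz(b_mel: float) -> float:
    """Inverse of the mapping above: f = 700 (exp(b / 1125) - 1)."""
    return 700.0 * (math.exp(b_mel / 1125.0) - 1.0)


# Round trip: converting to mel and back should recover the frequency.
f_back = mel_to_hz(hz_to_mel(1000.0))
```

Because the mapping is logarithmic above a few hundred Hertz, equally spaced points on the mel axis become progressively wider bands in Hertz.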

pyroomacoustics.acoustics.mfcc(x, L=128, hop=64, M=14, fs=8000, fl=0.0, fh=0.5)¶

Computes the Mel-Frequency Cepstrum Coefficients (MFCC) according to the description by Huang-Acero-Hon 6.5.2 (2001). The MFCC are features mimicking human perception, usually used for some learning task.

This function will first split the signal into frames, overlapping or not, and then compute the MFCC for each frame.

Parameters:

x (nd-array) – Input signal
L (int) – Frame size (default 128)
hop (int) – Number of samples to skip between two frames (default 64)
M (int) – Number of mel-frequency filters (default 14)
fs (int) – Sampling frequency (default 8000)
fl (float) – Lowest frequency in filter bank as a fraction of fs (default 0.)
fh (float) – Highest frequency in filter bank as a fraction of fs (default 0.5)

Return type:

The MFCC of the input signal

pyroomacoustics.acoustics.octave_bands(fc=1000, third=False, start=0.0, n=8)¶

Create a bank of octave bands

Parameters:

fc (float, optional) – The center frequency
third (bool, optional) – Use third octave bands (default False)
start (float, optional) – Starting frequency for octave bands in Hz (default 0.)
n (int, optional) – Number of frequency bands (default 8)
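The usual definition of (third-)octave band edges can be sketched as follows. This is the standard geometric construction, assumed here for illustration; the helper name is hypothetical and the function above may differ in detail.

```python
def band_edges(center_hz: float, third: bool = False):
    """Lower and upper edge of a band placed geometrically around center_hz."""
    fraction = 3 if third else 1
    # An octave band spans a factor of 2, a third-octave band a factor of 2 ** (1 / 3).
    half_width = 2.0 ** (0.5 / fraction)
    return center_hz / half_width, center_hz * half_width


lo, hi = band_edges(1000.0)                     # full octave around 1 kHz
lo3, hi3 = band_edges(1000.0, third=True)       # third octave around 1 kHz
```

With this construction the center is the geometric mean of the two edges, so adjacent bands whose centers differ by the band ratio share an edge.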

pyroomacoustics.acoustics.rt60_eyring(S, V, a, m, c)¶

This is the Eyring formula for estimation of the reverberation time.

Parameters:

S – the total surface of the room walls in m^2
V – the volume of the room in m^3
a (float) – the equivalent absorption coefficient sum(a_w * S_w) / S where a_w and S_w are the absorption and surface of wall w, respectively.
m (float) – attenuation constant of air
c (float) – speed of sound in m/s

Returns:

The estimated reverberation time (RT60)

Return type:

float

pyroomacoustics.acoustics.rt60_sabine(S, V, a, m, c)¶

This is the Sabine formula for estimation of the reverberation time.

Parameters:

S – the total surface of the room walls in m^2
V – the volume of the room in m^3
a (float) – the equivalent absorption coefficient sum(a_w * S_w) / S where a_w and S_w are the absorption and surface of wall w, respectively.
m (float) – attenuation constant of air
c (float) – speed of sound in m/s

Returns:

The estimated reverberation time (RT60)

Return type:

float
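The two estimates above can be compared with a short sketch based on the textbook forms of the Sabine and Eyring formulas. The function names and the default values for m and c are assumptions for illustration, and the library's own expressions may be arranged differently.

```python
import math


def sabine_rt60(S, V, a, m=0.0, c=343.0):
    """Textbook Sabine estimate: 24 ln(10) V / (c (a S + 4 m V))."""
    return 24.0 * math.log(10.0) * V / (c * (a * S + 4.0 * m * V))


def eyring_rt60(S, V, a, m=0.0, c=343.0):
    """Textbook Eyring estimate: a is replaced by -ln(1 - a)."""
    return 24.0 * math.log(10.0) * V / (c * (-S * math.log(1.0 - a) + 4.0 * m * V))


# A 5 m x 4 m x 3 m shoebox with average absorption 0.3 and no air absorption.
S = 2.0 * (5 * 4 + 5 * 3 + 4 * 3)  # total wall surface in m^2
V = 5.0 * 4.0 * 3.0                # volume in m^3
t_sabine = sabine_rt60(S, V, a=0.3)
t_eyring = eyring_rt60(S, V, a=0.3)
```

Since -ln(1 - a) is larger than a for any a between 0 and 1, the Eyring estimate is always the shorter of the two, and the gap widens as the absorption grows.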
