pyroomacoustics.acoustics module

pyroomacoustics.acoustics.bands_hz2s(bands_hz, Fs, N, transform='dft')

Converts bands given in Hertz to samples with respect to a given sampling frequency Fs and a transform size N. An optional transform type is used to handle the DCT case.
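The conversion can be illustrated with a minimal NumPy sketch. It does not call the library; the helper name `hz_to_bins`, the assumed bin widths (Fs/N for a DFT, Fs/(2N) for a DCT), and the clipping at the Nyquist frequency are assumptions made for illustration, not a statement of what bands_hz2s does internally.

```python
import numpy as np

def hz_to_bins(bands_hz, fs, n, transform="dft"):
    """Hypothetical helper: map band edges in Hz to transform bin indices."""
    # Assumed bin width: fs / n for a DFT, fs / (2 n) for a DCT
    bin_width = fs / (2 * n) if transform == "dct" else fs / n
    # Assumption: frequencies above Nyquist are clipped to fs / 2
    clipped = np.minimum(np.asarray(bands_hz, dtype=float), fs / 2)
    return np.round(clipped / bin_width).astype(int)

bands_hz = [[0, 1000], [1000, 4000]]
bins_dft = hz_to_bins(bands_hz, fs=8000, n=256)
bins_dct = hz_to_bins(bands_hz, fs=8000, n=256, transform="dct")
print(bins_dft.tolist())  # [[0, 32], [32, 128]]
print(bins_dct.tolist())  # [[0, 64], [64, 256]]
```

Under these assumptions the DCT case simply has twice as many bins per Hertz as the DFT case for the same N.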

pyroomacoustics.acoustics.binning(S, bands)

Computes the sum of all columns of S in the subbands enumerated in bands.
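As a rough illustration of the operation described, the sketch below sums the columns of a matrix within each band. The helper name `sum_in_bands` and the assumption that each band is a (start, stop) pair of column indices with an exclusive stop are hypothetical; the library's own band format may differ.

```python
import numpy as np

def sum_in_bands(S, bands):
    """Hypothetical helper: sum the columns of S inside each band."""
    out = np.zeros((S.shape[0], len(bands)), dtype=S.dtype)
    for i, (start, stop) in enumerate(bands):
        # Assumption: stop is exclusive, as in Python slicing
        out[:, i] = S[:, start:stop].sum(axis=1)
    return out

S = np.arange(12).reshape(2, 6)
B = sum_in_bands(S, [(0, 2), (2, 6)])
print(B.tolist())  # [[1, 14], [13, 38]]
```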

pyroomacoustics.acoustics.critical_bands()

Computes the critical bands as defined in the book Psychoacoustics by Zwicker and Fastl (Table 6.1, p. 159).
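For orientation, the sketch below builds band limits from the commonly tabulated critical-band (Bark) edge frequencies. The values are quoted from the widely used tabulation and should be checked against the table cited above; the variable names and the (low, high) layout are assumptions, not the return format of critical_bands().

```python
import numpy as np

# Commonly tabulated critical-band edges in Hz (verify against the cited table)
edges = np.array([
    0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
    2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500,
])

# One (low, high) row per band; this layout is an assumption
bands = np.column_stack([edges[:-1], edges[1:]])

# Index of the band that contains 1000 Hz
idx = int(np.searchsorted(edges, 1000, side="right") - 1)
print(len(bands), bands[idx].tolist())  # 24 [920, 1080]
```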

pyroomacoustics.acoustics.invmelscale(b)

Converts from the mel scale to frequency in Hertz according to Huang-Acero-Hon (6.143).

pyroomacoustics.acoustics.melfilterbank(M, N, fs=1, fl=0.0, fh=0.5)

Returns a filter bank of triangular filters spaced according to the mel scale.

We follow Huang-Acero-Hon 6.5.2.

Parameters:
  • M (int) – The number of filters in the bank
  • N (int) – The length of the DFT
  • fs (float, optional) – The sampling frequency (default 1)
  • fl (float) – Lowest frequency in filter bank as a fraction of fs (default 0.)
  • fh (float) – Highest frequency in filter bank as a fraction of fs (default 0.5)
Returns:
  An M by int(N/2)+1 array that contains one filter per row
Return type:
  ndarray
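The construction can be sketched in plain NumPy as follows. This is an illustrative sketch, not the library's code: the name `mel_filterbank_sketch` is hypothetical, the mel formulas use the constants 1125 and 700, and the triangles are scaled to a unit peak, whereas the reference text also describes an area-normalized variant.

```python
import numpy as np

def mel_filterbank_sketch(M, N, fs=1.0, fl=0.0, fh=0.5):
    """Hypothetical sketch: M unit-peak triangular filters over N // 2 + 1 bins."""
    mel = lambda f: 1125.0 * np.log(1.0 + f / 700.0)
    inv_mel = lambda b: 700.0 * (np.exp(b / 1125.0) - 1.0)

    # M + 2 edges, uniform on the mel scale, expressed as fractional DFT bins
    mel_lo, mel_hi = mel(fl * fs), mel(fh * fs)
    steps = np.arange(M + 2) * (mel_hi - mel_lo) / (M + 1)
    edges = (N / fs) * inv_mel(mel_lo + steps)

    k = np.arange(N // 2 + 1)
    H = np.zeros((M, N // 2 + 1))
    for m in range(1, M + 1):
        left, center, right = edges[m - 1], edges[m], edges[m + 1]
        rising = (k > left) & (k < center)
        falling = (k >= center) & (k < right)
        H[m - 1, rising] = (k[rising] - left) / (center - left)
        H[m - 1, falling] = (right - k[falling]) / (right - center)
    return H

H = mel_filterbank_sketch(M=10, N=512, fs=8000)
print(H.shape)  # (10, 257)
```

Each row is one filter, so applying the bank to a power spectrum of length N // 2 + 1 is a single matrix-vector product.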

pyroomacoustics.acoustics.melscale(f)

Converts f (in Hertz) to the mel scale defined according to Huang-Acero-Hon (2.6).
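The forward and inverse mappings can be sketched as below. The formulas B(f) = 1125 ln(1 + f/700) and its inverse are the ones usually attributed to the cited reference; the helper names `hz_to_mel` and `mel_to_hz` are hypothetical and the block does not call the library.

```python
import numpy as np

def hz_to_mel(f):
    """Hypothetical helper: frequency in Hz to mel."""
    return 1125.0 * np.log(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(b):
    """Hypothetical helper: mel back to frequency in Hz."""
    return 700.0 * (np.exp(np.asarray(b, dtype=float) / 1125.0) - 1.0)

f = np.array([0.0, 700.0, 4000.0])
m = hz_to_mel(f)
f_back = mel_to_hz(m)
print(np.allclose(f_back, f))  # True
```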

pyroomacoustics.acoustics.mfcc(x, L=128, hop=64, M=14, fs=8000, fl=0.0, fh=0.5)

Computes the Mel-Frequency Cepstrum Coefficients (MFCC) according to the description by Huang-Acero-Hon 6.5.2 (2001). The MFCC are features that mimic human perception and are commonly used as input to learning tasks.

This function will first split the signal into frames, overlapping or not, and then compute the MFCC for each frame.

Parameters:
  • x (ndarray) – Input signal
  • L (int) – Frame size (default 128)
  • hop (int) – Number of samples to skip between two frames (default 64)
  • M (int) – Number of mel-frequency filters (default 14)
  • fs (int) – Sampling frequency (default 8000)
  • fl (float) – Lowest frequency in filter bank as a fraction of fs (default 0.)
  • fh (float) – Highest frequency in filter bank as a fraction of fs (default 0.5)
Returns:
  The MFCC of the input signal
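The framing and per-frame processing described above can be sketched as follows. Everything here is an assumption made for illustration: the name `mfcc_sketch`, the Hann window, the small constant added before the logarithm, the unnormalized DCT-II, and the output orientation (coefficients by frames). The filter bank H is passed in; a random non-negative matrix stands in for a mel filter bank to keep the block short.

```python
import numpy as np

def mfcc_sketch(x, H, L=128, hop=64):
    """Hypothetical sketch: frame, window, DFT, filter bank, log, DCT-II."""
    M = H.shape[0]
    # Split the signal into frames of length L, advancing by hop samples
    n_frames = 1 + (len(x) - L) // hop
    frames = np.stack([x[i * hop : i * hop + L] for i in range(n_frames)])

    # Power spectrum of each windowed frame, shape (n_frames, L // 2 + 1)
    power = np.abs(np.fft.rfft(frames * np.hanning(L), axis=1)) ** 2

    # Log energy per filter, shape (M, n_frames); 1e-12 avoids log(0)
    log_energy = np.log(H @ power.T + 1e-12)

    # Unnormalized DCT-II along the filter axis
    n = np.arange(M)
    dct2 = np.cos(np.pi * np.outer(n, 2 * n + 1) / (2 * M))
    return dct2 @ log_energy

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
H = rng.random((14, 128 // 2 + 1))  # stand-in for a mel filter bank
C = mfcc_sketch(x, H, L=128, hop=64)
print(C.shape)  # (14, 15)
```

With hop smaller than L the frames overlap; with hop equal to L they do not.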

pyroomacoustics.acoustics.octave_bands(fc=1000, third=False)

Creates a bank of octave bands.

Parameters:
  • fc (float, optional) – The center frequency (default 1000)
  • third (bool, optional) – Use third octave bands (default False)
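A minimal sketch of how such a bank can be built is shown below. The name `octave_bands_sketch`, the parameter `n` (number of octaves), and the choice to treat fc as the lowest center frequency are assumptions made for illustration. Band edges are placed half a band below and above each center, which is the usual convention for octave and third-octave bands.

```python
import numpy as np

def octave_bands_sketch(fc=1000.0, third=False, n=3):
    """Hypothetical sketch: (low, high) edges and centers of octave bands."""
    div = 3 if third else 1
    # Centers are spaced by a factor of 2 ** (1 / div), starting at fc
    centers = fc * 2.0 ** (np.arange(n * div) / div)
    # Each band extends half a band width on either side of its center
    half_width = 2.0 ** (0.5 / div)
    bands = np.column_stack([centers / half_width, centers * half_width])
    return bands, centers

bands, centers = octave_bands_sketch(fc=1000.0, n=3)
third_bands, third_centers = octave_bands_sketch(fc=1000.0, third=True, n=3)
print(centers.tolist())  # [1000.0, 2000.0, 4000.0]
```

With this construction the upper edge of each band coincides with the lower edge of the next one, so the bank covers the frequency range without gaps.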