Griffin-Lim Phase Reconstruction
Implementation of the classic phase reconstruction from Griffin and Lim [1]. The input to the algorithm is the magnitude from STFT measurements.
The algorithm starts by assigning a (possibly random) initial phase to the measurements, and then iteratively:
1. Reconstructs the time-domain signal
2. Re-applies the STFT
3. Enforces the known magnitude of the measurements
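The three steps above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the pyroomacoustics implementation: the helper names `stft_sketch`, `istft_sketch`, and `griffin_lim_sketch` are hypothetical, the transform uses no zero-padding, the inverse is a least-squares weighted overlap-add, and the initial phase is drawn uniformly at random.

```python
import numpy as np


def stft_sketch(x, win, hop):
    """One-sided STFT without padding; returns shape (n_frames, n_freq)."""
    n = len(win)
    n_frames = 1 + (len(x) - n) // hop
    frames = np.stack([x[i * hop : i * hop + n] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft_sketch(X, win, hop):
    """Least-squares inverse STFT by weighted overlap-add."""
    n = len(win)
    frames = np.fft.irfft(X, n=n, axis=1) * win
    out = np.zeros(n + (X.shape[0] - 1) * hop)
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + n] += frame
        norm[i * hop : i * hop + n] += win**2
    # guard against division by zero where no window energy accumulates
    return out / np.maximum(norm, 1e-12)


def griffin_lim_sketch(mag, win, hop, n_iter=50, seed=0):
    """Hypothetical helper: iterate the three steps from a random initial phase."""
    rng = np.random.default_rng(seed)
    Y = mag * np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        y = istft_sketch(Y, win, hop)  # 1. reconstruct the time-domain signal
        Z = stft_sketch(y, win, hop)  # 2. re-apply the STFT
        Y = mag * Z / np.maximum(np.abs(Z), 1e-12)  # 3. enforce the known magnitude
    return istft_sketch(Y, win, hop)
```

Only the phase of the re-analysed spectrum `Z` is kept at each pass; its magnitude is discarded and replaced by the measured one, which is what pulls the estimate toward a signal consistent with the measurements.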
The implementation supports different types of initialization via the keyword argument ini.
- If omitted, the initial phase is uniformly zero
- If ini="random", a random phase is used
- If ini=A for a numpy.ndarray of the same shape as the input magnitude, A / numpy.abs(A) is used for initialization
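The three initialization modes can be illustrated with a small helper. This is a sketch only: `initial_phase_sketch` is a hypothetical name, not a pyroomacoustics function, and two details are assumptions not stated here: the random phase is drawn uniformly on [0, 2π), and zero entries of A are guarded against division by zero.

```python
import numpy as np


def initial_phase_sketch(X_mag, ini=None, seed=None):
    """Hypothetical helper: initial phase factors (unit modulus wherever defined)."""
    X_mag = np.asarray(X_mag)
    if ini is None:
        # zero phase everywhere: exp(j * 0) = 1
        return np.ones(X_mag.shape, dtype=complex)
    if isinstance(ini, str):
        if ini != "random":
            raise ValueError("the only supported string value is 'random'")
        # assumption: phase uniform on [0, 2*pi)
        rng = np.random.default_rng(seed)
        return np.exp(2j * np.pi * rng.random(X_mag.shape))
    A = np.asarray(ini, dtype=complex)
    if A.shape != X_mag.shape:
        raise ValueError("ini must have the same shape as the magnitude")
    # A / |A|; the small floor on |A| is an assumption of this sketch
    return A / np.maximum(np.abs(A), 1e-15)
```

Multiplying the measured magnitude by the returned array gives the complex STFT estimate that the first iteration starts from.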
Example
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
import pyroomacoustics as pra
# We open a speech sample
filename = "examples/input_samples/cmu_arctic_us_axb_a0004.wav"
fs, audio = wavfile.read(filename)
# These are the parameters of the STFT
fft_size = 512
hop = fft_size // 4
win_a = np.hamming(fft_size)
win_s = pra.transform.stft.compute_synthesis_window(win_a, hop)
n_iter = 200
engine = pra.transform.STFT(
    fft_size, hop=hop, analysis_window=win_a, synthesis_window=win_s
)
X = engine.analysis(audio)
X_mag = np.abs(X)
X_mag_norm = np.linalg.norm(X_mag) ** 2
# monitor convergence
errors = []
# the callback to track the spectral distance convergence
def cb(epoch, Y, y):
    # we measure convergence via spectral distance
    Y_2 = engine.analysis(y)
    sd = np.linalg.norm(X_mag - np.abs(Y_2)) ** 2 / X_mag_norm
    # save in the list every 10 iterations
    if epoch % 10 == 0:
        errors.append(sd)
pra.phase.griffin_lim(X_mag, hop, win_a, n_iter=n_iter, callback=cb)
plt.semilogy(np.arange(len(errors)) * 10, errors)
plt.show()
References
- pyroomacoustics.phase.gl.griffin_lim(X, hop, analysis_window, fft_size=None, stft_kwargs={}, n_iter=100, ini=None, callback=None)
Implementation of the Griffin-Lim phase reconstruction algorithm from STFT magnitude measurements.
- Parameters:
X (array_like, shape (n_frames, n_freq)) – The STFT magnitude measurements
hop (int) – The frame shift of the STFT
analysis_window (array_like, shape (fft_size,)) – The window used for the STFT analysis
fft_size (int, optional) – The FFT size for the STFT. If omitted, it is computed from the dimension of X
stft_kwargs (dict, optional) – Dictionary of extra parameters for the STFT
n_iter (int, optional) – The number of iterations
ini (str or array_like, np.complex, shape (n_frames, n_freq), optional) – The initial value of the phase estimate. If “random”, uses a random guess. If None, uses 0 phase.
callback (func, optional) – A callable taking as arguments an int (the iteration index) and the reconstructed STFT and time-domain signals
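As an illustration of how fft_size can follow from the shape of X, the sketch below assumes a one-sided STFT of even length, so that n_freq = fft_size // 2 + 1. The helper name `infer_fft_size_sketch` is hypothetical, and the formula is an assumption consistent with the example above (fft_size = 512), not a statement of the library's exact code.

```python
import numpy as np


def infer_fft_size_sketch(X):
    """Hypothetical helper: FFT size implied by a one-sided spectrum of even length."""
    n_freq = np.asarray(X).shape[1]  # X has shape (n_frames, n_freq)
    return 2 * (n_freq - 1)
```

For an odd FFT size this inversion would be off by one, which is one reason to pass fft_size explicitly when the transform length is unusual.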