CMU ARCTIC Corpus¶
The CMU ARCTIC Dataset¶
The CMU_ARCTIC databases were constructed at the Language Technologies Institute at Carnegie Mellon University as phonetically balanced, US English, single-speaker databases designed for unit selection speech synthesis research. A detailed report on the structure and content of the databases and on the recording environment is available as Carnegie Mellon University, Language Technologies Institute Tech Report CMU-LTI-03-177.
The databases consist of around 1150 utterances carefully selected from out-of-copyright texts from Project Gutenberg. The databases include US English male (bdl) and female (slt) speakers (both experienced voice talent) as well as other accented speakers.
The 1132 sentence prompt list is available from cmuarctic.data.
The distributions include 16 kHz waveforms and simultaneous EGG signals. Full phonetic labelling was performed by CMU Sphinx using the FestVox-based labelling scripts. Complete runnable Festival voices are included with the database distributions as examples, though better voices can be made by improving the labelling, etc.
License: Permissive, attribution required
Price: Free
- class pyroomacoustics.datasets.cmu_arctic.CMUArcticCorpus(basedir=None, download=False, build=True, **kwargs)¶
Bases:
Dataset
This class loads the CMU ARCTIC corpus into a structure that is convenient to process.
- basedir¶
The directory where the CMU ARCTIC corpus is located/downloaded. By default, this is the current directory.
- Type
str, optional
- info¶
A dictionary whose keys are the labels of metadata fields attached to the samples. The values are lists of all distinct values the field takes.
- Type
dict
- sentences¶
The list of all utterances in the corpus.
- Type
list of CMUArcticSentence
- Parameters
basedir (str, optional) – The directory where the CMU ARCTIC corpus is located/downloaded. By default, this is the current directory.
download (bool, optional) – If the corpus does not exist, download it.
speaker (str or list of str, optional) – One or more CMU ARCTIC speaker labels. If provided, only those speakers are loaded. By default, all speakers are loaded.
sex (str or list of str, optional) – Can be ‘female’ or ‘male’
lang (str or list of str, optional) – The language, only ‘English’ is available here
accent (str or list of str, optional) – The accent of the speaker
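A minimal sketch of loading the corpus with the parameters above. The wrapper name `load_speakers` is hypothetical and exists only for this example; the import sits inside the function so that defining it neither requires the package nor triggers a download.

```python
def load_speakers(basedir, speakers, download=False):
    """Load only the requested CMU ARCTIC speakers from `basedir`.

    `load_speakers` is a hypothetical wrapper written for this example,
    not part of pyroomacoustics.
    """
    # Imported lazily so that defining this function does not need the package
    import pyroomacoustics as pra

    # `speaker` accepts a single label or a list of labels, e.g. "bdl" or ["bdl", "slt"]
    return pra.datasets.CMUArcticCorpus(
        basedir=basedir, download=download, speaker=speakers
    )
```

For instance, `corpus = load_speakers("./CMU_ARCTIC", ["bdl", "slt"], download=True)` would fetch the two US English speakers if they are not already present (the directory name is a placeholder), after which `corpus.sentences` holds the loaded utterances.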
- build_corpus(**kwargs)¶
Build the corpus with some filters (sex, lang, accent, sentence_tag, sentence)
- filter(**kwargs)¶
Filter the corpus and select the samples that match the criteria provided. The value passed for each keyword can be 1) a string, 2) a list of strings, or 3) a function. There is a match if one of the following is True:
value == attribute
value is a list and attribute in value == True
value is a callable (a function) and value(attribute) == True
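The matching rule above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helpers `matches` and `filter_samples` and the dictionary-based samples are hypothetical, and requiring every keyword to match is an assumption of this sketch.

```python
def matches(value, attribute):
    """Return True if `attribute` satisfies the filter `value`.

    Hypothetical helper illustrating the three rules; not part of pyroomacoustics.
    """
    # Rule 1: plain equality
    if value == attribute:
        return True
    # Rule 2: the filter is a list that contains the attribute
    if isinstance(value, list) and attribute in value:
        return True
    # Rule 3: the filter is a function that returns a truthy result
    if callable(value) and value(attribute):
        return True
    return False


def filter_samples(samples, **kwargs):
    """Keep the samples whose metadata satisfy every keyword filter (assumed here)."""
    return [
        s for s in samples
        if all(k in s and matches(v, s[k]) for k, v in kwargs.items())
    ]


# Toy metadata records standing in for corpus samples (made up for this sketch)
samples = [
    {"speaker": "bdl", "sex": "male", "text": "what is this"},
    {"speaker": "slt", "sex": "female", "text": "hello there"},
]

print(len(filter_samples(samples, sex="female")))                # string filter
print(len(filter_samples(samples, speaker=["bdl", "slt"])))      # list filter
print(len(filter_samples(samples, text=lambda t: "what" in t)))  # function filter
```

With the real class, the same three kinds of value are passed as keyword arguments to `filter`, keyed by metadata field names such as those listed in `info`.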
- class pyroomacoustics.datasets.cmu_arctic.CMUArcticSentence(path, **kwargs)¶
Bases:
AudioSample
Create the sentence object
- Parameters
path (str) – the path to the audio file
**kwargs – metadata as a list of keyword arguments
- data¶
The actual audio signal
- Type
array_like
- fs¶
The sampling frequency in Hz
- Type
int
- plot(**kwargs)¶
Plot the spectrogram
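Taken together, the `data` and `fs` attributes give the duration of an utterance. The sketch below assumes a mono signal and uses a hypothetical stand-in object with made-up values rather than a real `CMUArcticSentence`, so it runs without the corpus.

```python
from types import SimpleNamespace


def duration_seconds(sample):
    """Duration of a mono audio sample: number of samples divided by the sampling rate."""
    return len(sample.data) / sample.fs


# Stand-in for a loaded sentence: 32000 samples of silence at 16 kHz (made-up values)
fake_sentence = SimpleNamespace(data=[0.0] * 32000, fs=16000)

print(duration_seconds(fake_sentence))
```

The helper `duration_seconds` is illustrative only; any object exposing `data` and `fs` in the same way could be passed to it.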