CMU ARCTIC Corpus

The CMU ARCTIC Dataset

The CMU_ARCTIC databases were constructed at the Language Technologies Institute at Carnegie Mellon University as phonetically balanced, US English, single-speaker databases designed for unit selection speech synthesis research. A detailed report on the structure and content of the databases and on the recording environment is available as Carnegie Mellon University, Language Technologies Institute Tech Report CMU-LTI-03-177.

The databases consist of around 1150 utterances carefully selected from out-of-copyright texts from Project Gutenberg. The databases include US English male (bdl) and female (slt) speakers (both experienced voice talent) as well as other accented speakers.

The 1132-sentence prompt list is available from cmuarctic.data.

The distributions include 16 kHz waveforms and simultaneous EGG signals. Full phonetic labelling was performed by CMU Sphinx using the FestVox-based labelling scripts. Complete runnable Festival voices are included with the database distributions as examples, though better voices can be made by improving the labelling, etc.

License: Permissive, attribution required

Price: Free

URL: http://www.festvox.org/cmu_arctic/

class pyroomacoustics.datasets.cmu_arctic.CMUArcticCorpus(basedir=None, download=False, build=True, **kwargs)

Bases: Dataset

This class loads the CMU ARCTIC corpus into a structure amenable to processing.

basedir

The directory where the CMU ARCTIC corpus is located/downloaded. By default, this is the current directory.

Type

str, optional

info

A dictionary whose keys are the labels of metadata fields attached to the samples. The values are lists of all distinct values the field takes.

Type

dict

sentences

The list of all utterances in the corpus

Type

list of CMUArcticSentence

Parameters
  • basedir (str, optional) – The directory where the CMU ARCTIC corpus is located/downloaded. By default, this is the current directory.

  • download (bool, optional) – If the corpus does not exist, download it.

  • speaker (str or list of str, optional) – One or more CMU ARCTIC speaker labels. If provided, only those speakers are loaded. By default, all speakers are loaded.

  • sex (str or list of str, optional) – Can be ‘female’ or ‘male’

  • lang (str or list of str, optional) – The language, only ‘English’ is available here

  • accent (str or list of str, optional) – The accent of the speaker
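
A minimal usage sketch for the parameters above, assuming pyroomacoustics is installed and importable. The helper names `load_arctic` and `summarize` are hypothetical and not part of the library; `summarize` relies only on the `sentences`, `data`, and `fs` attributes documented on this page and assumes `data` is a one-dimensional signal.

```python
def load_arctic(basedir=".", speakers=("bdl", "slt"), download=False):
    """Hypothetical helper: load a subset of CMU ARCTIC speakers.

    The import is done lazily so that this sketch can be defined
    even on a machine where pyroomacoustics is not installed.
    """
    from pyroomacoustics.datasets.cmu_arctic import CMUArcticCorpus

    return CMUArcticCorpus(
        basedir=basedir,          # where the corpus is located/downloaded
        download=download,        # fetch the corpus if it does not exist
        speaker=list(speakers),   # only these speaker labels are loaded
    )


def summarize(corpus):
    """Hypothetical helper: basic statistics over a loaded corpus.

    Returns (number of sentences, set of sampling rates in Hz,
    total duration in seconds).
    """
    sentences = corpus.sentences
    rates = {s.fs for s in sentences}
    total_seconds = sum(len(s.data) / s.fs for s in sentences)
    return len(sentences), rates, total_seconds
```

With `download=True`, a call such as `summarize(load_arctic(speakers=["bdl"], download=True))` should fetch the requested speaker if it is missing from `basedir`, so the first call may need network access.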

build_corpus(**kwargs)

Build the corpus with some filters (sex, lang, accent, sentence_tag, sentence)

filter(**kwargs)

Filter the corpus and select the samples that match the provided criteria. The argument to each keyword can be 1) a string, 2) a list of strings, or 3) a function. There is a match if one of the following is True.

  1. value == attribute

  2. value is a list and attribute in value == True

  3. value is a callable (a function) and value(attribute) == True
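
The three matching rules above can be sketched in plain Python. This is an illustration, not the library's implementation: `matches` and `select` are hypothetical names, samples are modelled as plain dictionaries of metadata, and combining several keywords with a logical AND is an assumption.

```python
def matches(value, attribute):
    """Apply the three documented rules to one (value, attribute) pair."""
    if callable(value):
        # Rule 3: value is a function and value(attribute) is true
        return bool(value(attribute))
    if isinstance(value, list):
        # Rule 2: value is a list and attribute is one of its elements
        return attribute in value
    # Rule 1: plain equality
    return value == attribute


def select(samples, **criteria):
    """Keep the samples whose metadata satisfy every keyword criterion.

    Assumption: a sample lacking a requested field does not match.
    """
    return [
        s for s in samples
        if all(k in s and matches(v, s[k]) for k, v in criteria.items())
    ]


samples = [
    {"speaker": "bdl", "sex": "male"},
    {"speaker": "slt", "sex": "female"},
    {"speaker": "awb", "sex": "male"},
]
males = select(samples, sex="male")                       # string criterion
subset = select(samples, speaker=["slt", "awb"])          # list criterion
b_names = select(samples, speaker=lambda s: s[0] == "b")  # callable criterion
```

The callable is tested first because a function is neither a list nor equal to an attribute string, so the order of the checks does not change the outcome for the three documented argument types.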

class pyroomacoustics.datasets.cmu_arctic.CMUArcticSentence(path, **kwargs)

Bases: AudioSample

Create the sentence object

Parameters
  • path (str) – the path to the audio file

  • **kwargs – metadata as a list of keyword arguments

data

The actual audio signal

Type

array_like

fs

sampling frequency

Type

int

plot(**kwargs)

Plot the spectrogram