learn_to_pronounce.resources.AbstractProvider
- class learn_to_pronounce.resources.AbstractProvider(resources_dir: str)
Abstract base class that defines what must be implemented for a resources directory so that it can be used by the pronunciation learning recipe.
- __init__(resources_dir: str)
- Parameters
- resources_dir: str
Directory with pronunciation resources (lexicon, phonemes, graphemes, etc.).
- abstract get_graphemes() -> List[str]
Getter for the set of graphemes (letters) of the given pronunciation resource.
- Returns
- graphemes: List[str]
Complete set of letters for the pronunciation resource. It can be derived from the lexicon.
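As a rough illustration of deriving graphemes from a lexicon, the sketch below assumes the lexicon is available as a plain mapping from word to pronunciations. The `Lexicon` alias and the `derive_graphemes` helper are hypothetical stand-ins and are not part of learn_to_pronounce.

```python
from typing import Dict, List

# Hypothetical stand-in for a parsed lexicon: word -> list of pronunciations,
# each pronunciation being a list of phoneme symbols.
Lexicon = Dict[str, List[List[str]]]


def derive_graphemes(lexicon: Lexicon) -> List[str]:
    """Collect every distinct character used in the lexicon's words."""
    letters = set()
    for word in lexicon:
        letters.update(word)
    # Sorting keeps the result deterministic across runs.
    return sorted(letters)


toy_lexicon: Lexicon = {
    "cat": [["k", "ae", "t"]],
    "act": [["ae", "k", "t"]],
    "tab": [["t", "ae", "b"]],
}
graphemes = derive_graphemes(toy_lexicon)
```

The same pattern, iterating over pronunciations instead of words, could be used to derive the phoneme set when it is not shipped with the resources.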
- abstract get_lexicon(words: Optional[List[str]] = None) -> PronunciationDictionary
Getter for the lexicon: a dictionary in which the pronunciation of a word can be looked up.
- Parameters
- words: Optional[List[str]] = None
If provided, only the words in this list are kept and all other lexicon entries are filtered out. This is useful for reading only the part of the lexicon needed for model training.
- Returns
- pd: PronunciationDictionary
Parsed lexicon as a PronunciationDictionary object (from pronunciation_generation).
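The filtering behaviour of the `words` parameter could look roughly like the sketch below. It uses a plain dictionary as a stand-in for PronunciationDictionary, whose real API is not shown here; `filter_lexicon` is a hypothetical helper, and silently skipping words that are absent from the lexicon is an assumption of this sketch.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for PronunciationDictionary: word -> pronunciations.
Lexicon = Dict[str, List[List[str]]]


def filter_lexicon(lexicon: Lexicon, words: Optional[List[str]] = None) -> Lexicon:
    """Return a copy of the lexicon, restricted to `words` when given."""
    if words is None:
        return dict(lexicon)
    wanted = set(words)  # set membership keeps the filter linear in lexicon size
    return {word: prons for word, prons in lexicon.items() if word in wanted}


full = {
    "cat": [["k", "ae", "t"]],
    "dog": [["d", "ao", "g"]],
    "tab": [["t", "ae", "b"]],
}
# "missing" is not in the lexicon and is simply ignored in this sketch.
train_only = filter_lexicon(full, ["cat", "tab", "missing"])
```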
- abstract get_phonemes() -> List[str]
Getter for the set of phonemes of the given pronunciation resource.
- Returns
- phonemes: List[str]
Complete set of phonemes for the pronunciation resource. If it is not among the resources, it can be derived from the lexicon.
- abstract get_spelling_lexicon() -> PronunciationDictionary
Getter for the spelling lexicon: a dictionary of words that are spelled letter by letter rather than pronounced. The spelling lexicon is usually very simple and may contain only the pronunciations of individual letters.
- Returns
- sp: PronunciationDictionary
Parsed spelling lexicon, similar to get_lexicon().
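To make the idea of a letter-level spelling lexicon concrete, the sketch below spells a word out by concatenating per-letter pronunciations. The letter table, the phoneme symbols, and the `spell_out` helper are all illustrative assumptions; how the recipe actually consumes the spelling lexicon is not described here.

```python
from typing import Dict, List

# Hypothetical per-letter pronunciations; the phoneme symbols are illustrative only.
LETTER_PRONUNCIATIONS: Dict[str, List[str]] = {
    "a": ["ey"],
    "b": ["b", "iy"],
    "c": ["s", "iy"],
}


def spell_out(word: str, letters: Dict[str, List[str]]) -> List[str]:
    """Concatenate the pronunciations of each letter in the word."""
    phonemes: List[str] = []
    for letter in word.lower():
        # A KeyError here signals a letter the spelling lexicon does not cover.
        phonemes.extend(letters[letter])
    return phonemes


spelled = spell_out("CAB", LETTER_PRONUNCIATIONS)
```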
- abstract get_test_words() -> Optional[List[str]]
Getter for the list of words from the lexicon (get_lexicon()) that should be used to evaluate pronunciation generation. If no such list is specified in the resources directory, no evaluation is carried out.
- Returns
- words: Optional[List[str]]
List of words to be used in evaluation, or None.
- abstract get_train_words() -> List[str]
Getter for the list of words from the lexicon (get_lexicon()) that should be used to train pronunciation generation. If the list is not explicitly specified in the resources directory, all words from the lexicon should be used.
- Returns
- words: List[str]
List of words to be used in training.
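Putting the pieces together, a concrete provider might look something like the following sketch. To stay self-contained it does not import learn_to_pronounce: `ProviderInterface` is a local mirror of part of the documented interface, a plain dictionary stands in for PronunciationDictionary, and the file names `lexicon.txt`, `train_words.txt` and `test_words.txt` (with one whitespace-separated entry per line) are hypothetical.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Dict, List, Optional

# Stand-in for PronunciationDictionary, whose real API is not shown here.
Lexicon = Dict[str, List[List[str]]]


class ProviderInterface(ABC):
    """Local mirror of part of the documented interface, for illustration only."""

    def __init__(self, resources_dir: str):
        self.resources_dir = resources_dir

    @abstractmethod
    def get_lexicon(self, words: Optional[List[str]] = None) -> Lexicon: ...

    @abstractmethod
    def get_train_words(self) -> List[str]: ...

    @abstractmethod
    def get_test_words(self) -> Optional[List[str]]: ...


class TextFileProvider(ProviderInterface):
    """Reads hypothetical lexicon.txt, train_words.txt and test_words.txt files."""

    def _read_lines(self, name: str) -> Optional[List[str]]:
        path = os.path.join(self.resources_dir, name)
        if not os.path.exists(path):
            return None
        with open(path, encoding="utf-8") as handle:
            return [line.strip() for line in handle if line.strip()]

    def get_lexicon(self, words: Optional[List[str]] = None) -> Lexicon:
        wanted = None if words is None else set(words)
        lexicon: Lexicon = {}
        for line in self._read_lines("lexicon.txt") or []:
            word, *phonemes = line.split()
            if wanted is None or word in wanted:
                lexicon.setdefault(word, []).append(phonemes)
        return lexicon

    def get_train_words(self) -> List[str]:
        listed = self._read_lines("train_words.txt")
        # Fall back to every lexicon word when no explicit list exists.
        return listed if listed is not None else sorted(self.get_lexicon())

    def get_test_words(self) -> Optional[List[str]]:
        # None signals that no evaluation should be carried out.
        return self._read_lines("test_words.txt")


with tempfile.TemporaryDirectory() as resources_dir:
    with open(os.path.join(resources_dir, "lexicon.txt"), "w", encoding="utf-8") as f:
        f.write("cat k ae t\ndog d ao g\n")
    provider = TextFileProvider(resources_dir)
    train_words = provider.get_train_words()
    test_words = provider.get_test_words()
    train_lexicon = provider.get_lexicon(train_words)
```

With only a lexicon file present, this sketch trains on every lexicon word and returns None for the test words, matching the fallback behaviour described for get_train_words() and get_test_words().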