To build a pronunciation addon:
get the repo
git clone
build the Docker image that manages all the dependencies
# if "build-fe" is specified, balacoon_frontend
# is built from sources. This requires special access,
# which you likely don't have.
bash docker/ [--build-fe]
get pronunciation resources. Adjust them if needed, but don't forget to share your changes as a contribution. To promote multilinguality, Balacoon uses a unified phoneme set. You can find more information on the decisions made in the post. If you want to build pronunciation generation for a new lexicon, you need to map its phonemes into the Balacoon unified phoneme set. Check the info on the mapping of CMUDict as an example.
# resources are stored as submodules, pick one you need
# from resources dir
git submodule update --init resources/en_us_pronunciation/
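The phoneme mapping mentioned above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `map_lexicon`, the two-column mapping file, and the whitespace-separated `WORD PH1 PH2 ...` lexicon layout are hypothetical and are not taken from the Balacoon resources. The target symbols you put into the mapping file must come from the actual unified phoneme set.

```shell
# map_lexicon MAPPING_FILE
# Hypothetical helper, not part of learn_to_pronounce.
# MAPPING_FILE: one "source_phoneme target_phoneme" pair per line (assumed format).
# Lexicon is read from stdin as "WORD PH1 PH2 ..." lines (assumed format).
map_lexicon() {
    awk '
        # first input (the mapping file): remember source -> target
        NR == FNR { table[$1] = $2; next }
        # second input (stdin): rewrite every phoneme of every entry
        {
            out = $1
            for (i = 2; i <= NF; i++) {
                if (!($i in table)) {
                    # fail loudly instead of silently dropping a phoneme
                    printf "unmapped phoneme %s in word %s\n", $i, $1 > "/dev/stderr"
                    exit 1
                }
                out = out " " table[$i]
            }
            print out
        }
    ' "$1" -
}
```

It could be used as `map_lexicon mapping.txt < source_lexicon.txt > mapped_lexicon.txt`, where all three file names are placeholders. Stopping on the first unmapped phoneme is a deliberate choice here: a lexicon with silently skipped symbols would train a model on wrong pronunciations.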
launch docker and execute addon creation (includes lexicon packing and training of FST-based pronunciation generation). The training takes some time to run. At the end, evaluation on withheld words is executed (if test_words are specified in the resources directory). Accuracy of pronunciation generation heavily depends on the language.
# the script is a simple shortcut to start the container. Adjust it
# if needed
bash docker/
# check that everything works on a toy lexicon.
learn_to_pronounce --locale en_us --work-dir toy_work_dir \
--resources resources/en_us_pronunciation/toy/
# if everything finishes without errors, it is time to build the complete addon.
# check the arguments of learn_to_pronounce to learn more about usage.
learn_to_pronounce --locale en_us --out en_us_pronunciation.addon \
--resources resources/en_us_pronunciation/cmudict
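To make the evaluation on withheld words more concrete, here is a rough sketch of a word-level accuracy check. It is not the evaluation that learn_to_pronounce runs. It assumes, hypothetically, two plain-text files with `word ph1 ph2 ...` lines (reference and generated pronunciations), and the function name `word_accuracy` is made up for this example.

```shell
# word_accuracy REFERENCE_FILE HYPOTHESIS_FILE
# Hypothetical helper: prints the percentage of words whose generated
# pronunciation exactly matches the reference one.
word_accuracy() {
    awk '
        # normalize whitespace so that only the symbols are compared
        { $1 = $1 }
        # first file: store reference pronunciations by word
        # (if a word is listed several times, the last entry wins)
        NR == FNR { ref[$1] = $0; next }
        # second file: count words that also appear in the reference
        ($1 in ref) { total++; if (ref[$1] == $0) correct++ }
        END {
            if (total == 0) {
                print "no overlapping words" > "/dev/stderr"
                exit 1
            }
            printf "%.2f\n", 100 * correct / total
        }
    ' "$1" "$2"
}
```

Exact-match accuracy is a strict measure: a pronunciation with a single wrong phoneme counts as fully wrong, which is one reason why results vary so much between languages.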
learn_to_pronounce contains interactive demos that showcase how to use the obtained artifacts.
# generating pronunciation with trained fst:
demo_fst --fst work_dir/pronunciation.fst
# using the whole addon: looks up the word in the lexicon; if not found,
# generates the pronunciation with the FST-based model.
# additionally, it can spell words letter-by-letter
demo_pronounce --addon en_us_pronunciation.addon [--spelling]