Here you can listen to examples of what Balacoon has to offer.
Check out the interactive balacoon_tts demo on HuggingFace. Poke around and judge the naturalness of the generated speech for yourself.
Neural Text-to-Speech delivers new levels of naturalness and opens up new applications. For instance, we can alter individual speech factors, such as speaker identity, expressivity, or recording conditions. At Balacoon, we investigate these newly emerged techniques and explore their potential.
Voice Conversion lets you change who is speaking in an audio recording. All it needs is a short sample of the target voice. The technique has been known for decades, but only recently has it become possible to apply it to arbitrary input audio, that is, to speakers the system never saw during training.
Give it a try in the HuggingFace demo or download our Android app: