sandreas 3 hours ago

There are other open source models that produce very good quality:

  F5-TTS
  FishTTS (they changed their license to make money)
I also did some experiments with CoquiTTS, but FishTTS was the most promising on German-language samples.

Combined with X-Whisper, it is possible to use epubs together with the narrated audio files to train a model on your favorite narrator's voice, instead of only running inference with stock or generated voices. The output quality is really good; however, the resulting voices cannot be released to the public :-) I'm especially targeting book series with many parts where the publisher has switched narrators or stopped releasing later parts entirely.
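To give a rough idea of the data-prep step, here is a minimal sketch of turning aligned text/audio segments into a training manifest. Everything in it is an assumption for illustration: the `Segment` shape, the `build_manifest` helper, the duration limits, and the pipe-delimited layout are hypothetical and not the actual output or input format of X-Whisper, FishTTS, or any other tool mentioned here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One sentence matched to a stretch of audio (hypothetical shape)."""
    audio_file: str
    start: float  # seconds
    end: float    # seconds
    text: str


def build_manifest(segments, min_sec=1.0, max_sec=15.0):
    """Return 'file|start|end|text' lines for segments worth training on.

    Segments that are empty, too short, or too long are skipped.
    The duration limits are illustrative defaults, not recommendations.
    """
    lines = []
    for seg in segments:
        duration = seg.end - seg.start
        # Collapse whitespace and drop the delimiter character from the text.
        text = " ".join(seg.text.replace("|", " ").split())
        if not text or not (min_sec <= duration <= max_sec):
            continue
        lines.append(f"{seg.audio_file}|{seg.start:.2f}|{seg.end:.2f}|{text}")
    return "\n".join(lines) + ("\n" if lines else "")


if __name__ == "__main__":
    demo = [
        Segment("ch01.mp3", 0.0, 4.5, "It was a dark  and stormy night."),
        Segment("ch01.mp3", 4.5, 4.8, "Oh."),  # too short, skipped
    ]
    print(build_manifest(demo), end="")
```

The filtering is the only real point: very short or very long clips are dropped before training, and whatever aligner you use would have to supply the start/end times.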

Audible has also started releasing some of their lesser-known books with ElevenLabs narration in different languages. The AI is still noticeable, but the quality is pretty impressive.