Show HN: Parkiet – Fine-tune a large TTS model for any language under $100 (github.com)
2 points by ph4evers 9 hours ago

Many open-source text-to-speech (TTS) models are released only for English or Chinese and lack support for other languages. I was curious to see if I could train a state-of-the-art TTS model for Dutch using Google's free TPU Research credits. The results are fantastic and on par with ElevenLabs, with just 10,000 hours of data.

I open-sourced the weights and documented the whole journey, from Torch model conversion and data preparation to the JAX training code and inference pipeline. I spent about $300 in egress costs, but it can be as cheap as $100 to train this model (I ran the data collection pipeline and the Whisper fine-tuning on my 5090 desktop PC).

Hopefully it can serve as a guide for others who are curious to train these models for other languages (without burning through all the credits trying to fix the pipeline).