	▲	gagan2020 2 days ago
		It is not good for text-to-speech (TTS) either. I have been trying it for a few days. First of all, there is no documentation for the 1.5B model. The 0.5B realtime model is a shit model. I was converting text line by line, and it was randomly adding music and couldn't handle special characters like "…". I am really disappointed with this model, to say the least.
	▲	Stagnant 2 days ago
		The 7B-parameter VibeVoice TTS model is still the most impressive local TTS model I've tried. It was pulled by Microsoft a few days after its release due to "abuse potential", but it can be found in various community-maintained Hugging Face repos.
	▲	tjungblut 2 days ago
		Yep, it seems this was trained on a large amount of podcasts with ad jingles or phone call queues with elevator music. I was also pretty disappointed when I ran the TTS last week.