Remix clone Hacker News

	▲	donpark an hour ago
		But I've read somewhere that the KV cache for a speech-to-speech model grows with every turn, which could make on-device full-duplex S2S unusable for anything except quick chats.
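For a rough sense of scale, the cache size of a standard transformer can be estimated from its dimensions: each token stores one key and one value vector per KV head per layer, so the cache grows linearly with the number of tokens kept in context. The sketch below is only illustrative; the layer count, head count, head dimension, and audio token rate are all assumed values, not figures from any specific S2S model.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV cache size for a standard transformer decoder.

    The leading 2 accounts for storing both a key and a value tensor.
    bytes_per_value=2 assumes fp16/bf16 storage.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


# Hypothetical model dimensions, chosen only for illustration.
LAYERS = 32
KV_HEADS = 8
HEAD_DIM = 128

# Assumed audio token rate (tokens per second of speech); real codecs vary.
TOKENS_PER_SECOND = 12.5


def cache_after(seconds):
    """Cache size in bytes after `seconds` of audio held in context."""
    tokens = int(seconds * TOKENS_PER_SECOND)
    return kv_cache_bytes(tokens, LAYERS, KV_HEADS, HEAD_DIM)


if __name__ == "__main__":
    for minutes in (1, 5, 30):
        mib = cache_after(minutes * 60) / 2**20
        print(f"{minutes:>2} min of audio: {mib:.1f} MiB")
```

Under these assumptions each token costs 128 KiB, so the total depends almost entirely on how many tokens per second the model emits and how long the conversation stays in context. A full-duplex model that tracks both the user and model audio streams would keep more tokens per second than this single-stream estimate.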
	▲	tmzt 25 minutes ago
		Gemini Nano is supposedly doing it on-device. It looks like something similar should work on Apple's GPU and ANE.