	navanchauhan 4 hours ago
		Not affiliated with Sesame, but this is the problem the realtime models are trying to solve. NVIDIA's PersonaPlex release [0] uses a duplex architecture. It's based on Moshi [1], which addresses the problem by letting the model listen and generate audio at the same time.

		[0] https://github.com/NVIDIA/personaplex
		[1] https://arxiv.org/abs/2410.00037
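To make "listen and generate at the same time" concrete, here is a toy sketch of a duplex loop. It is not how Moshi or PersonaPlex is implemented; the frame strings, the `SILENCE` marker, the `duplex_loop` name, and the "go quiet when the user talks" policy are all made up for illustration. The only point is that every time step reads one input frame and emits one output frame, rather than waiting for the other side to finish its turn.

```python
from typing import Iterable, Iterator, List, Tuple

SILENCE = "_"  # hypothetical marker for "no speech in this frame"


def duplex_loop(
    mic_frames: Iterable[str], reply: List[str]
) -> Iterator[Tuple[int, str, str]]:
    """Toy duplex loop: each step consumes one input frame and emits one output frame."""
    pending = list(reply)  # frames the model still wants to say
    for t, heard in enumerate(mic_frames):
        if heard != SILENCE:
            # Assumed policy: if the user is talking, go quiet but keep the
            # unsaid frames so the reply can resume afterwards.
            spoken = SILENCE
        elif pending:
            spoken = pending.pop(0)
        else:
            spoken = SILENCE
        yield (t, heard, spoken)


if __name__ == "__main__":
    # The user interrupts at step 2; the model pauses and then resumes.
    for step in duplex_loop(["_", "_", "hi", "_"], ["a", "b", "c"]):
        print(step)
```

A turn-based system would instead buffer the whole user utterance, decide it has ended, and only then start producing output, which is where the awkward pauses and talk-over problems tend to come from.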