Remix clone Hacker News

	▲	lcolucci a day ago
		Thank you! We have an architecture diagram and some more details in the tech report here: https://lemonslice.com/live/technical-report. And yes, exactly. Between each character interaction we need to run speech-to-text, the LLM, text-to-speech, and then our video model. All of it happens in a continuously streaming pipeline.
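
To make the "continuously streaming" part concrete, here is a rough sketch of the idea using chained Python generators. This is not the actual LemonSlice implementation: every function name and the toy transformations (decoding bytes, upper-casing text, labelling frames) are hypothetical stand-ins for the real speech-to-text, LLM, text-to-speech, and video stages. The only point it illustrates is that each stage can start on a chunk as soon as the previous stage emits it, rather than waiting for the whole utterance.

```python
from typing import Iterable, Iterator


def speech_to_text(audio_chunks: Iterable[bytes]) -> Iterator[str]:
    """Hypothetical STT stage: here the 'audio' bytes are simply decoded as text."""
    for chunk in audio_chunks:
        yield chunk.decode("utf-8")


def llm_reply(words: Iterable[str]) -> Iterator[str]:
    """Hypothetical LLM stage: upper-casing stands in for generating a reply."""
    for word in words:
        yield word.upper()


def text_to_speech(tokens: Iterable[str]) -> Iterator[bytes]:
    """Hypothetical TTS stage: encoding text stands in for synthesizing audio."""
    for token in tokens:
        yield token.encode("utf-8")


def video_model(audio: Iterable[bytes]) -> Iterator[str]:
    """Hypothetical video stage: emits one labelled 'frame' per audio chunk."""
    for index, chunk in enumerate(audio):
        yield f"frame{index}:{chunk.decode('utf-8')}"


def run_pipeline(audio_chunks: Iterable[bytes]) -> Iterator[str]:
    """Chain the four stages lazily; nothing runs until a frame is requested."""
    return video_model(text_to_speech(llm_reply(speech_to_text(audio_chunks))))


if __name__ == "__main__":
    for frame in run_pipeline([b"hi", b"there"]):
        print(frame)
```

Because the stages are generators, asking for the first frame pulls exactly one chunk through all four stages, which is the property that keeps latency low in a streaming design. A real system would presumably run the stages concurrently and over the network, which this sketch does not attempt.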