Remix.run Logo
lcolucci a day ago

thank you! We have an architecture diagram and some more details in the tech report here: https://lemonslice.com/live/technical-report

And yes, exactly. In between each character interaction we need to do speech-to-text, LLM, text-to-speech, and then our video model. All of it happens in a continuously streaming pipeline.