| ▲ | WeaselsWin 10 hours ago | |
This full-duplex speech thing has already been used by the big players for quite a while in whatever "conversation mode" their apps offer, right? Those modes always seemed too fast to be going through an STT->LLM->TTS pipeline? | ||
| ▲ | ilaksh 6 hours ago | parent | next [-] | |
There are OpenAI gpt-realtime and Gemini Flash or whatever, which are great, but they do not seem to offer quite the same level of overlapping, realistic full duplex as moshi/personaplex. | ||
| ▲ | Tepix 10 hours ago | parent | prev [-] | |
Yes, OpenAI rolled out their advanced voice mode in September 2024. Since then it has been able to recognize your emotions, tone of voice, etc. | ||