| ▲ | water-drummer 3 hours ago | |
Gemini live api and grok voice api can make tool calls and they're speech to speech models | ||
| ▲ | d4rkp4ttern 2 hours ago | parent [-] | |
Right, turns out Claude and ChatGPT voice can also do web-search. So I guess behind the scenes there is more than a "pure" voice-voice model being used, i.e. there's probably a rudimentary agent loop with tools + tool-exec interposed. | ||