I tried talking to Claude today. What a nightmare. It constantly interrupts you. I don’t mind if Claude wants to spend ten seconds thinking about its reply, but at least let ME finish my thought. Without decent turn-taking, the AI seems impolite and it’s just an icky experience. I hope tech like this gets widely distributed soon because there are so many situations in which I would love to talk with a model. If only it worked.

Taikonerd 28 minutes ago

Agreed. I tried using Gemini's voice interface in their app. It went like this:

===

ME: "OK, so, I have a question about the economics of medicine. Uh..." [pauses to gather thoughts to ask question]

GEMINI: "Sure! Medical economics is the field of..."

===

And it's aggravated by the fact that all the LLMs love to give you page-long responses before it's your turn to talk again!

MrDunham 2 hours ago

I love Anthropic's models but their realtime voice is absolutely terrible. Every time I use it, I end up cursing at it at least once for interrupting me.

My main use case for OpenAI/ChatGPT at this point is realtime voice chats.

OpenAI has done a pretty great job w/ realtime (their realtime API is pretty fantastic out of the box... not perfect, but pretty fantastic and dead simple setup). I can have what feels like a legitimate conversation with AI and it's downright magical feeling.

That said, the output is created by OpenAI models so it's... not my favorite.

I sometimes use ChatGPT realtime to think through/work through a problem/idea, have it create a detailed summary, then upload that summary to Claude to let 4.5 Opus rewrite/audit and come up with a better final output.

mavamaarten 12 hours ago

Agreed. English is not my native language. I do speak it well; it's just that sometimes I need a second to think mid-sentence. None of the live chat models out there handle this well. Claude just starts answering before I've even had the chance to finish a sentence.

	Tostino 4 hours ago
		English is my native language, and I still have this problem all the time with voice models.

sigmoid10 10 hours ago

Anthropic doesn't have any realtime multimodal audio models available; they just use STT and TTS models slapped on top of Claude. So they are currently the worst provider if you actually want to use voice communication.

butlike an hour ago

Am I not allowed to cut you off if you're ramble-y and incoherent?

	BizarroLand an hour ago
		It's rude if you're a human, and entirely unacceptable if you are a computer.