wenc 5 hours ago

Right now Alexa+ and Gemini are objectively better.

The best is ChatGPT voice mode. It understands non-English words and accents amazingly well, and even though the model isn't the full-fledged one, I can have deep conversations with it for an hour without it missing a beat.

CountHackulus 32 minutes ago | parent | next [-]

Alexa+ has been a massive downgrade for me. It's extremely laggy and constantly misunderstands me, whereas the old one never did. "Set a timer for 20 minutes" used to be instant and just work. I did this the other day and it took 10 seconds to respond, then set a timer for 10 minutes.

hamdingers 24 minutes ago | parent [-]

Same here. I can see why LLM-driven voice assistants make sense to product people in the abstract, but introducing non-deterministic behavior into a device I primarily use to help with timekeeping and control lights is nothing but a regression.

stronglikedan 7 minutes ago | parent | prev | next [-]

Alexa+ is terrible compared to Alexa. It's so bad that I've dusted off my v1 echos cuz they're too old to run Alexa+. Complete shit show that is.

I do like Gemini better than Assistant, even though it's not quite there yet. But that's just a matter of time because they actually designed it from the ground up to be a drop in replacement for Assistant.

HereticLocke 3 hours ago | parent | prev | next [-]

I agree, ChatGPT voice mode is pretty impressive. Almost similar to Samantha in 'Her', laughably.

bobthepanda 2 hours ago | parent [-]

Scarlett Johansson is suing OpenAI, in fact

barumrho 4 hours ago | parent | prev | next [-]

Siri doesn't need to have conversations with you. ChatGPT can do that. But, it should be able to do actions you'd do on your phone.

cachius an hour ago | parent | next [-]

Speech to text should work. I regularly have to manually edit the transcribed input. The more specialized the words, the more often it happens. It completely disregards the context of the current input; for example, text entered on Hacker News might involve special technical and IT vocabulary.

skeledrew an hour ago | parent | prev [-]

Pretty straightforward on Android at least to wire up a harness that talks to Tasker[0] or another full automation app.

[0] https://tasker.joaoapps.com/

DaiPlusPlus 4 hours ago | parent | prev | next [-]

"objectively better" is a subjective statement :)

My preference, however, is for the voice-control UX I've had with my Amazon Echo and "classic" Alexa for the past 10 years: I think I can best describe it as a "voice-driven command-line", just like your OS's CLI shell, which makes its interactions predictable, even if it means I need to "know" what commands are valid in a given context. We all need predictability and reliability when it comes to our home-automation integrations.

...but computer interaction with a LLM / transformer-driven / "AI agent" is anything but predictable. When Amazon opted everyone into Alexa+ I agreed to give it a go and see if it really made things better or not - and it did not. I opted-out of Alexa+ and went back to something actually reliable.

thrtythreeforty 4 hours ago | parent | next [-]

Here's a question: I don't understand the gap between these LLM powered voice agents vs CLI coding agents, the latter of which are obviously useful and quite resourceful at getting something done when asked in plain English.

Seems like an agent given 20-30 tool calls like "read_sms", "matter_command", and "send_email" would be able to work out what to do for things like "set the house to 72° and text Laura that I did it."
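
As a rough sketch of what that could look like: the request above might be turned by a model into a short list of structured tool calls, which a plain dispatcher then executes. Everything here is an assumption for illustration. `send_sms`, the argument names, and the hand-written `plan` are hypothetical, and no real model or Matter API is involved.

```python
from typing import Callable

# Hypothetical tools; real ones would talk to a thermostat or an SMS gateway.
def matter_command(device: str, setting: str, value: int) -> str:
    return f"{device}.{setting} set to {value}"

def send_sms(to: str, body: str) -> str:
    return f"SMS to {to}: {body}"

# Registry mapping tool names to callables (only two of the "20-30" shown).
TOOLS: dict[str, Callable[..., str]] = {
    "matter_command": matter_command,
    "send_sms": send_sms,
}

def run_plan(plan: list[dict]) -> list[str]:
    """Execute each proposed tool call in order, rejecting unknown tools."""
    results = []
    for step in plan:
        name = step["tool"]
        if name not in TOOLS:
            raise ValueError(f"unknown tool: {name}")
        results.append(TOOLS[name](**step["args"]))
    return results

# A plan a model might emit for the example request (hand-written, not model output).
plan = [
    {"tool": "matter_command",
     "args": {"device": "thermostat", "setting": "target_temp_f", "value": 72}},
    {"tool": "send_sms",
     "args": {"to": "Laura", "body": "I set the house to 72F."}},
]

if __name__ == "__main__":
    for line in run_plan(plan):
        print(line)
```

The dispatcher itself is deterministic; the open question in this thread is whether the model produces the right plan every time.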

DaiPlusPlus 3 hours ago | parent [-]

> Seems like an agent given 20-30 tool calls like "read_sms", "matter_command", and "send_email" would be able to work out what to do for things like "set the house to 72° and text Laura that I did it."

Incidentally, a major headline in the news this past week was about a coding-agent that wiped its company's entire system, including backups; which the company's staffers were confident was utterly impossible (as it didn't have any access to that system), and yet somehow, it did[1] (the TL;DR is the agent randomly came across an unprotected God-tier admin API-key/token saved to a personal text-file in a filesystem it had read-access to). If an agent can do that with only read-only access to a company's routine/everyday storage area then there's no way I'm giving it the ability to deactivate my house's fire-alarms and security-cameras via Google Home/Matter/Thread/HomeKit/X10/OhFfsNotAnotherCloudBasedAutomationScheme.

[1] https://www.theregister.com/2026/04/27/cursoropus_agent_snuf...

8note an hour ago | parent [-]

If you are really worried about that, the agent already has that access since it'll go find that key anyway.

the HN thread about that case was much more of a "why are you putting your prod keys in random text files" and "the sota in prompt engineering is that putting DONT FUCKING DO THE BAD THING in the prompt makes the agent more desperate to get stuff done"

putting limits at the harness level would do just fine. one LLM call, one tool call per voice message.

you dont have to give it a ton of turns
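
A minimal sketch of that kind of harness-level limit, assuming the model returns a list of proposed tool calls as plain dicts. The tool names, the `fake_llm` stand-in, and the refusal strings are all hypothetical; the point is only that the cap lives in ordinary code outside the model.

```python
from typing import Callable

# Hypothetical tool table; each tool takes keyword arguments and returns a status string.
TOOLS: dict[str, Callable[..., str]] = {
    "set_timer": lambda minutes: f"timer set for {minutes} minutes",
    "lights": lambda state: f"lights {state}",
}

def handle_voice_message(text: str, llm: Callable[[str], list[dict]]) -> str:
    """Handle one utterance with exactly one model call and at most one tool call."""
    calls = llm(text)  # the only model call made for this message
    if len(calls) != 1:
        return "refused: expected exactly one tool call"
    call = calls[0]
    tool = TOOLS.get(call.get("tool"))
    if tool is None:
        return "refused: unknown tool"
    return tool(**call.get("args", {}))

# Stand-in for a model, used only to exercise the harness.
def fake_llm(text: str) -> list[dict]:
    if "timer" in text:
        return [{"tool": "set_timer", "args": {"minutes": 20}}]
    # Simulates a model that tries to do too much in one turn.
    return [
        {"tool": "lights", "args": {"state": "off"}},
        {"tool": "set_timer", "args": {"minutes": 1}},
    ]

if __name__ == "__main__":
    print(handle_voice_message("set a timer for 20 minutes", fake_llm))
    print(handle_voice_message("turn everything off", fake_llm))
```

There is no loop and no second turn, so a confused model can at worst pick the wrong single tool from an allow-list.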

redwall_hp 3 hours ago | parent | prev | next [-]

Siri's one job I care about is doing exactly what I want while I'm driving. I need it to check my text messages, take dictation, start phone calls and deal with music. I don't need to have conversations with it, I need deterministic responses to known commands.

wat10000 3 hours ago | parent | prev | next [-]

"Objectively" has become a generic intensifier. It's literally infuriating.

ShyCodeGardener 4 hours ago | parent | prev [-]

Whenever I see one of these comments, it's always from someone that tried it at the start and then gave up because of a bad experience. And many times there are more people commenting back that this was essentially the 1.0 version and that the current 2.0 version is much better. So as someone that uses none of these products (old voice assistants vs. ai ones) it's really hard to evaluate if any of these anecdotes mean anything.

You could have tried Alexa+ at the start when it was shitty compared to plain Alexa, and maybe it's better now. But equally, none of the people who comment that it is "amazing" in its current iteration qualify their statements by comparing and contrasting the old version with the new version. That makes them seem either unqualified to say how much "better" it is than the old version or, at worst, shills (paid or not). The best take is that they are comparing (e.g.) day-one Alexa+ vs. the current Alexa+ without a comparison to the original Alexa.

... which is to say that it really feels like there are no clear conclusions that could be drawn from all of this.

dudeinhawaii an hour ago | parent | next [-]

I'm not an Alexa user myself but I have watched my wife interact with it for around 5 years now.

The new Alexa powered by an LLM is objectively better than the previous Alexa in a few ways. This much was apparent from day one and has only gotten smoother.

1. It can reliably execute direct or vague-ish commands "play X movie in app Y" or "play x show" and can infer X movie is only available in app Z so use that.

2. Speech recognition seems better (fewer instances of 5x round trips)

3. Conversational with multi-turn --- my wife can have a back and forth clarifying a topic.

4. Seems to understand intent a bit better. (user asked A so they are probably thinking about B)

Those may seem small but they were a tremendous source of annoyance for her -- and thus for me -- "Alexa is not listening, do something!"

DaiPlusPlus an hour ago | parent [-]

> It can reliably execute direct or vague-ish commands "play X movie in app Y" or "play x show" and can infer X movie is only available in app Z so use that.

...how does that work, exactly? (or rather: what's the context here?); there's no possible way for an Alexa+-powered Amazon Echo to control my AppleTV or interface with VLC on my desktop.

swiftcoder an hour ago | parent [-]

Presumably, FireTV?

circuit10 3 hours ago | parent | prev | next [-]

No matter how good the LLM features are, I just want to turn my lights on and off and check the time. A perfect LLM could maybe perform on par with a simple deterministic command system for these tasks, but not better. All an LLM does is introduce the possibility that a command that worked fine yesterday will randomly not work.

Also, one of my first interactions with this Alexa+ thing was “how long is it until 8:45am”, one of only a few commands I use it for to work out how much sleep I’m getting, and it proceeded to ask me what the current time was… I immediately turned it off after that
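
For comparison, the deterministic version of that query is a few lines of date arithmetic. This is only a sketch: `time_until` is a made-up helper name, and it uses naive local datetimes, so it ignores time zones and daylight-saving changes.

```python
from datetime import datetime, time, timedelta

def time_until(target: time, now: datetime) -> timedelta:
    """Return the time from `now` until the next occurrence of `target`."""
    candidate = datetime.combine(now.date(), target)
    if candidate <= now:
        # The target time has already passed today, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate - now

if __name__ == "__main__":
    # The device already knows the current time; no need to ask the user for it.
    print(time_until(time(8, 45), datetime.now()))
```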

DaiPlusPlus 4 hours ago | parent | prev | next [-]

> that tried it at the start and then gave up because of a bad experience

I've had enough bad experiences with products that never got better, or just got worse (Exhibit A: Windows 11). Like most primates, I am capable of learning, and I've learned that once a consumer product/service goes bad there's little hope of a turn-around. I accept that you're telling me that it's gotten better, but of the people I know IRL who also use an Echo, none of them have told me that Alexa+ is worth trying, let alone committing to.

Yes, it's on me for not giving Alexa+ a second chance, but I'm not willing to give Alexa+ a second chance because, as a technology product/service customer, I just don't feel respected by the industry I work for (...lol); if Amazon, Microsoft, Google, et al won't respect me, why should I venture outside my comfort-zone for... what benefit, exactly?

cachius an hour ago | parent [-]

The current photos app on Win 11 has accumulated a whopping one gigabyte of - what actually?

_DeadFred_ 2 hours ago | parent | prev [-]

It's not the early 2000s where just messing around and wasting time on this stuff is cool in itself. Little of that wasted time turned into long-term apps that stuck with me. Maybe a banking app and a trail running app.

I ruined multiple dinners with timers that didn't work (with a time/labor cost).

I had to get out of bed in the freezing cold to turn the lights out. It's easy to hit the lights when I go to bed but annoying having the tool fail and getting back out.

Music stuff didn't work well because I used Youtube Music not Spotify.

Those were my 3 use cases for Google voice, and it failed them all enough I just stopped using it altogether. Who cares if it works today if in another month they just change something and break it again? They've shown it's not a tool to use for tool things, it's a 'gee wow' thing. I don't need to be impressed. I need not burnt food.

alfiedotwtf 3 hours ago | parent | prev | next [-]

This! I talk to ChatGPT every morning, and it will listen and navigate my feeds while I drive, summarise posts, and answer my questions. It just works.

virgil_disgr4ce 4 hours ago | parent | prev [-]

I concur that the ChatGPT voice mode is excellent. I can't even think of anything to knock it for other than for whatever reason it never 'hears' my kids, but that's probably because it's not intended to be used in multi-participant chats?

But for one-on-one, it is a really outstanding experience. Especially since they tamped down the way over-the-top humanisms.