yalogin 2 days ago

They nailed it. Consumers don't care about AI; they care about functionality they can use, and don't much care whether it uses AI or not. It's on the OS and apps to figure out the AI part. This is why, even though people think Apple is far behind in AI, they are doing it at their own pace. Their hardware sales were not hurt in the near term by the lack of flashy AI announcements. They will slowly get there, and they have time. The current froth is all about AI infrastructure, not consumer devices.

jorvi 2 days ago | parent | next [-]

The only thing Apple is behind on in the AI race is LLMs.

They've been vastly ahead of everyone else with things like text OCR, image element recognition / extraction, microphone noise suppression, etc.

iPhones have had these features 2-5 years before Android did.

michaelcampbell 4 hours ago | parent | next [-]

> had these features 2-5 years before Android did.

"first" isn't always more important than "best". Apple has historically been ok with not being first, as long as it was either best or very obviously "much better". It always, well, USED TO focus on best. It has lost its way in that lately.

laweijfmvo a day ago | parent | prev | next [-]

Apple’s AI powered image editor (like removing something from the background) is near unusable. Samsung’s is near magic, Google’s seems great. So there’s a big gap here.

m463 4 hours ago | parent | next [-]

> unusable

Apple is so hit or miss.

I think the image OCR is great and usable. I can take a picture of a phone number and dial it.

But trying to edit a text field is such a nightmare.

(Trying to change "this if good" to "this is good" on an iPhone with your fingers is un-Apple-like in how cumbersome it is.)

jorvi 8 hours ago | parent | prev | next [-]

That is rather funny because I think Google's and Samsung's AI image actions are completely garbage, butchering things to the point where I'd rather do it manually on my desktop or use prompt editing (which to Google's credit Gemini is fantastic at). Whereas Apple's is flawless in discerning everything within a scene or allowing me to extract single items from within a picture. For example say, a backpack in the background.

adastra22 a day ago | parent | prev | next [-]

That is unrelated to and unmentioned in the post you are responding to.

FridgeSeal a day ago | parent | prev [-]

Well, if I ever used a slop image generator, that'd be an issue, but as I don't, it's a bit of a non-event!

giancarlostoro 2 days ago | parent | prev | next [-]

TTS is absolutely horrible on iOS. I have nearly driven into a wall when trying to use it whilst driving and it goofs up what I've said terribly. For the love of all things holy, will someone at Apple finally fix text to speech? It feels like they last touched it in 2016. My phone can run offline LLMs and generate images but it can't understand my words.

galleywest200 2 days ago | parent | next [-]

> I have nearly driven into a wall when trying to use it whilst driving and it goofs up what I've said terribly.

People should not be using their phones while driving anyway. My iPhone disables all notifications, except for Find My notifications, while driving. Calls over Bluetooth speakers are an exception.

wolvoleo 2 days ago | parent | prev [-]

It sounds like you mean STT not TTS there?

giancarlostoro a day ago | parent [-]

You're right, in my rage I typoed. It's really frustrating; even friends will text me and their text makes no sense, and two minutes later: "STUPID VOICE TO TEXT". I have a few friends who drive trucks, so they need to be able to use their voice to communicate.

delecti a day ago | parent | next [-]

Better speech transcription is cool, but that feels kinda contrived. Phone calls exist, so do voice messages sent via texting apps, and professional drivers can also just wait a bit to send messages if they really must be text; they're on the job, but if it's really that urgent they can pull over.

jimbokun a day ago | parent [-]

They can also use paper maps instead of GPS.

wolvoleo a day ago | parent | prev [-]

I have to say that OpenAI's Whisper model is excellent. If Apple could leverage that somehow, I think it would really improve things. I run it locally on an old PC with a 3060 card. That way I can run Whisper large, which is still speedy on a GPU, especially with faster-whisper. An added bonus is the language autodetection, which is great because I speak three languages regularly.

I think there are even better models now, but Whisper still works fine for me, and there's a big ecosystem around it.

nomel a day ago | parent [-]

I wonder what the wattage difference is between the iPhone STT and Whisper? How many seconds would the iPhone battery last?
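The question above can be roughed out with a quick back-of-envelope calculation. All figures below are assumptions for illustration (a ~13 Wh phone battery, an RTX 3060's board power, a guessed on-device STT draw), not measurements of either system:

```python
# Back-of-envelope: how long an iPhone-sized battery could sustain
# different power draws. All numbers are rough assumptions.
BATTERY_WH = 13.0   # assumed iPhone battery capacity, ~13 Wh
GPU_WATTS = 170.0   # assumed RTX 3060 board power under load
NPU_WATTS = 2.0     # assumed on-device STT power draw

gpu_seconds = BATTERY_WH / GPU_WATTS * 3600  # runtime at GPU-class power
npu_hours = BATTERY_WH / NPU_WATTS           # runtime at NPU-class power

print(f"At {GPU_WATTS:.0f} W (desktop GPU): ~{gpu_seconds:.0f} s of battery")
print(f"At {NPU_WATTS:.0f} W (on-device STT): ~{npu_hours:.1f} h of battery")
```

Even with generous assumptions, desktop-GPU-class inference would drain a phone battery in minutes, which is one reason on-device speech models are so much smaller than Whisper large.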

fragmede 2 days ago | parent | prev [-]

Kind of a big "only" though. Siri is still shit, and it's been 15 years since its initial release.

0x38B 2 days ago | parent | next [-]

When I'm driving and tell Siri, "Call <family member name>", sometimes instead of calling, it says, "To who?", and I can't get it to call no matter what I do.

asdff a day ago | parent | prev [-]

Amazing how it's been 15 years and it still can't discern 15 from 50 when you talk to it.

altern8 16 hours ago | parent | prev | next [-]

> did not get impacted by lack of flashy AI announcements

To be fair, they did announce flashy AI features. They just didn't deliver them after people bought the products.

I've been reading about possible class action lawsuits and even government intervention for false advertising.

nerdjon 2 days ago | parent | prev | next [-]

All of the reporting about Apple being behind on AI is driving me insane and I hope that what Dell is doing is finally going to be the reversal of this pattern.

The only thing Apple is really behind on is shoving the word (word?) "AI" in your face at every moment, when ML has been quietly running in many parts of their platforms since well before ChatGPT.

Sure we can argue about Siri all day long and some of that is warranted but even the more advanced voice assistants are still largely used for the basics.

I am just hoping that this bubble pops or the marketing turns around before Apple feels "forced" into a Copilot or Recall-like disaster.

LLM tech isn't going away and it shouldn't; it has its valid use cases. But we will be much better off when it finally recedes into the background like ML always has.

yalogin 2 days ago | parent [-]

Right! Also, I don’t think Siri is that important to the overall user experience of the ecosystem. Sure, it’s one of the most visible use cases, but how many people really care about it? I usually don’t want to talk out loud to do tasks; it’s helpful in some specific scenarios, but it’s not the primary use case. The text counterpart, understanding user context on the phone, is more important even in the context of LLMs, and that’s what plays into the success of their stack going forward.

lurking_swe 2 days ago | parent | next [-]

are you really asking why someone would like a much better siri?

- truck drivers that are driving for hours.

- commuters driving to work

- ANYONE with a homepod at home that likes to do things hands free (cooking, dishes, etc).

- ANYONE with airpods in their ears that is not in an awkward social setting (bicycle, walking alone on the sidewalk, on a trail, etc)

every one of these interaction modes benefits from a smart siri.

That’s just the tip of the iceberg. Why can’t I have a siri that can intelligently do multi-step actions for me? “siri please add milk and eggs to my Target order. Also let my wife know that i’ll pick up the order on my way home from work. Lastly, we’re hosting some friends for dinner this weekend. I’m thinking Italian. Can you suggest 5 recipes i might like? [siri sends me the recipes ASYNC after a web search]”

All of this is TECHNICALLY possible. There’s no reason Apple couldn’t build out, or work with, various retailers to create useful MCP-like integrations for siri. Just omit dangerous or destructive actions and require the user to manually confirm or perform those. Having an LLM add or remove items in my cart is not dangerous.

Importantly, siri should be able to do some tasks for me in the background. On my Mac, I’m able to launch Cursor and have it work in agent mode to implement some small feature in my project while I do something else on my computer. Why must I stare at my phone while siri “thinks” and replies with something stupid lol. Similarly, why can’t my phone draft a reply to an email ASYNC and let me review it later at my leisure? Everything about siri is so synchronous. It sucks.

It’s just soooo sooo bad when you consider how good it could be. I think we’re just conditioned to expect it to suck. It doesn’t need to.

FridgeSeal a day ago | parent | next [-]

> siri please add milk and eggs to my Target order.

Woah woah woah, surely you’re not suggesting that you, a user, should have some agency over how you interact with a store?

No, no, you’re not getting off that easy. They’ll want you to use Terry, the Target-AI, through the target app.

nerdjon 2 days ago | parent | prev [-]

I doubt anyone is actually suggesting that Siri should not be better, but I think the issues with it are very much overblown: it does what I ask it to do the vast majority of the time, since most of what I actually want to ask it are basic things.

I have several HomePods, and they do what I ask. This includes being the hub of all of my home automation.

Yes, there are areas it can improve, but I think the important question is how much use those things would actually get, versus making for a cool announcement and a fun party trick that's then never used again.

We have also seen the failures that come from treating an LLM as a magic box that can just do things for you, so while these things are "technically" possible, they are far from reliable.

SoftTalker a day ago | parent | prev [-]

I've never used Siri. Never even tried it. It's disabled on my phone as much as I've been able to work out how to do.

jay_kyburz a day ago | parent [-]

We have a HomePod; we use it a lot for simple things like timers when cooking, or playing a particular kind of music. It's simple and dumb, but it has become part of our lives. It's just a hands-free way of doing simple things we might otherwise do on the phone.

We are looking forward to being able to ask Siri to pipe some speech through to an AI.

bluGill 2 days ago | parent | prev | next [-]

Even customers who care about AI (or perhaps should...) have other concerns. With the RAM shortage coming, many customers may choose to do without AI features to save money, even though they would want them at a lower price.

tecoholic 2 days ago | parent | prev [-]

Nailed it? Maybe close. They still have a keyboard button dedicated to Copilot, and that thing can't easily be reconfigured.

angulardragon03 a day ago | parent | next [-]

Required for Windows certification nowadays, IIRC.

xgkickt a day ago | parent | prev [-]

Can PowerToys remap it?

khr 11 hours ago | parent | next [-]

Yes, on my Thinkpad I could remap it with Powertoys. It looks like the sibling comments have had issues though.

For me, the Copilot key outputs the chord "Win (Left) + Shift (Left) + F23". I remapped it to "Ctrl (Right)" and it's functioning as it should.

projektfu a day ago | parent | prev | next [-]

I have one laptop with a Copilot key in my business (I didn't even realize that when I bought it). It takes the place of another key, I think the menu key. But it outputs a specific chord (Ctrl+Shift+F23), so it can't be mapped to anything useful like a modifier key. You can, however, reassign the meaning of Ctrl+Shift+F23.

SturgeonsLaw a day ago | parent | prev | next [-]

Yep. I installed Claude as a PWA and used PowerToys to remap it to a command that launches it.

nottorp a day ago | parent | prev | next [-]

Can it be pulled out?

scblock a day ago | parent | prev [-]

You can sort of remap it on Windows, but it's somewhat limited in my experience. It shows up as a keyboard chord rather than a simple key press; I think it's LWin+LShift+F23. I ended up simply disabling it entirely on my gaming laptop. I've been meaning to see if it's easier to make it useful on KDE Plasma, but haven't yet (though I did remap the HP Omen button to pull down Yakuake instead).