New Apple Study Shows LLMs Can Tell What You're Doing from Audio and Motion Data(9to5mac.com)
50 points by andrewrn 5 hours ago | 19 comments
chasing0entropy 5 hours ago | parent | next [-]

If you're interested in this concept, it's not new: the alarm has been sounded since the Android Facebook app required motion sensor permissions in Android 4.

https://par.nsf.gov/servlets/purl/10028982

https://arxiv.org/pdf/2109.13834.pdf

thewebguyd 3 hours ago | parent | next [-]

> alarm has been sounded since the Android Facebook app required motion sensor permissions in Android 4.

Serves as a useful reminder that even if someone doesn't care that these companies collect this data now, the companies are storing it, sometimes indefinitely, and as technology advances they will be able to make more use of it than they could at the time you agreed to share it.

It's like all the ransomware gangs hoarding the encrypted data they stole, waiting for a quantum computing breakthrough to be able to decrypt it.

Not sure what to do about it, if anything, but the average person is severely under-equipped and undereducated to deal with this and protect themselves from the levels of surveillance that are soon to come.

hexbin010 an hour ago | parent | prev [-]

I tried denying the sensor permission to most apps and my battery tanked. My guess is there are a few that sit in a busy loop trying to get the data, with no handling for the permission being denied, because it's granted on 99.99999% of devices.
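
Purely as an illustration of that guess, a hypothetical Kotlin sketch of the failure mode; readSensor(), hasSensorPermission(), and process() are made-up stand-ins, not real Android APIs:

    fun readSensor(): FloatArray? = null           // stub: returns null while the permission is denied
    fun hasSensorPermission(): Boolean = false     // stub
    fun process(sample: FloatArray) { /* ... */ }  // stub

    // Anti-pattern: retry a denied read in a tight loop, so the CPU never idles.
    fun pollForever() {
        while (true) {
            val sample = readSensor() ?: continue  // hot loop whenever the permission is missing
            process(sample)
        }
    }

    // Graceful alternative: check the permission once and back off instead of spinning.
    fun pollPolitely() {
        if (!hasSensorPermission()) return         // degrade gracefully
        while (true) {
            readSensor()?.let { process(it) }
            Thread.sleep(1_000)                    // sample at a sane rate
        }
    }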

rckt 3 hours ago | parent | prev | next [-]

Why do you need an LLM to interpret patterns?

drdaeman 2 hours ago | parent | next [-]

> The researchers ran the audio and motion data through smaller models that generated text captions and class predictions, then fed those outputs into different LLMs (Gemini-2.5-pro and Qwen-32B) to see how well they could identify the activity.

Maybe I'm not understanding it, but as I read it, the LLMs weren't really important: all they did was further interpret the outputs of a front-end audio-to-text classifier model.
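
To make that concrete, a rough Kotlin sketch of the two-stage setup as I read it; the LLM names come from the quoted article, but captionAudio(), classifyMotion(), and askLlm() are hypothetical stubs, not the paper's actual interfaces:

    fun captionAudio(waveform: FloatArray): String =
        "sound of running water and clinking dishes"  // stage 1a: small audio model -> text caption

    fun classifyMotion(accel: FloatArray): String =
        "repetitive arm movement while standing"      // stage 1b: small motion model -> class prediction

    fun askLlm(prompt: String): String =
        "washing dishes"                              // stage 2: Gemini-2.5-pro / Qwen-32B fuse the text

    fun identifyActivity(waveform: FloatArray, accel: FloatArray): String {
        val prompt = """
            Audio caption: ${captionAudio(waveform)}
            Motion summary: ${classifyMotion(accel)}
            What is the person most likely doing?
        """.trimIndent()
        return askLlm(prompt)
    }

    fun main() {
        println(identifyActivity(FloatArray(16000), FloatArray(300)))
    }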

Lerc 2 hours ago | parent | prev | next [-]

The same reason you need transistors to make computers.

You don't need them, but they are one way to do it that people know how to implement.

Identifying patterns is fairly amenable to analytic approaches; interpreting them, less so.

bigyabai 2 hours ago | parent | prev [-]

To ensure you drank your Verification Can of Mountain Dew, of course.

skavi 3 hours ago | parent | prev | next [-]

Maybe the 2026 Apple Watch will be able to auto-detect running as reliably as my 2015 Samsung Gear S2. My 2022 Series 8 is certainly not there yet.

frizlab 27 minutes ago | parent [-]

That’s weird, I have perfect running detection on an old(er) (Apple) watch. Detection does start late (but is retroactive, so it’s not an issue).

disambiguation 2 hours ago | parent | prev | next [-]

https://abc7.com/post/student-handcuffed-doritos-bag-mistake...

palmotea 4 hours ago | parent | prev | next [-]

AI will finally allow us to bring 1984's Telescreens into existence, at scale.

godelski 3 hours ago | parent [-]

Doesn't the smartphone already far surpass the Telescreen's capabilities and presence? It does more, and we carry it in our pockets.

Do people not realize we're beyond 1984? In 1984 the tech wasn't always listening; rather, it had the capacity to. Much of it was about how not knowing meant you'd act as if you were being watched, just in case. It was making reference to totalitarian states where you don't know if you can freely talk to your neighbor or if they'd turn you in, where people end up creating a doublespeak.

pramsey an hour ago | parent [-]

In 1984 the idea was that there were not enough people to listen to everyone, all the time, but the mere possibility was enough. Of course, for us with AI, things are considerably worse. Also, telescreens were mandatory. We are not there with cell phones in a de jure sense, but we certainly are in a de facto sense. And if enough people carry phones, it doesn't matter that a few stragglers don't; they will get caught in the net unless they live as hermits, in which case who cares about them. All the pieces are in place; there is no reason we cannot have a global North Korea.

andrewrn 5 hours ago | parent | prev | next [-]

Something to note here that annoys me about the title is that the LLMs aren't taking in the raw data (LLMs are for text, after all). The raw data is fed through audio and motion models that produce natural language descriptions, which are then fed to the LLM.

Unrelated: yeah, this article is a little creepy, but damn is it interesting technically.

TZubiri 36 minutes ago | parent | prev | next [-]

The tinfoil interpretation that LLMs can spy on you is shortsighted and a bit paranoid; it would require LLM providers to actually run a prompt asking what you are doing.

However, any system with a mic, like your cellphone listening for a "Hey Siri" prompt, or your fridge, could theoretically be coupled with an LLM on an ad hoc basis to get a fuller picture of what's going on.

Pretty cool: if an attacker or a government with a warrant can get an audio stream, they can get some clues, although of course not probative evidence.

gizajob 3 hours ago | parent | prev [-]

Time to ditch the Apple Watch then

macintux 3 hours ago | parent [-]

One more positive interpretation of Apple's research interests here is that devices like the Watch can better differentiate between "the wearer just fell and we should call 911" and "the wearer is playing with their kids".

b00ty4breakfast 2 hours ago | parent | next [-]

nuclear power generation is pretty beneficial, but that doesn't justify the existence of nuclear weapons.

bigyabai 2 hours ago | parent | prev [-]

"Can" does not necessarily mean "will", especially for Apple. I wouldn't be surprised if you're describing a feature slated for release in 2037.