Johnbot 3 hours ago

A lot of geolocation data on the market is anonymized, tied to medium-lived unique IDs that can't be mapped directly to other identifiers. The problem is that if you have precise locations, or enough samples that you can apply statistics to find precise locations, in many cases you can de-anonymize the IDs. You can purchase address and resident listings from a number of different data vendors, and by checking where the device returns to at night you can figure out its home address. Then, if you find information on the residents (work locations, schools, etc.), you can check whether the device goes where each resident of that address is likely to go, and you now have a pretty good idea of exactly who the device belongs to.
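
To make the "where the device returns to at night" step concrete, here is a minimal sketch on made-up pings. The function name `infer_home_cell`, the night window (22:00 to 06:00), and the 3-decimal grid (roughly 100 m) are all assumptions for illustration, not anything a specific vendor is known to use.

```python
from collections import Counter
from datetime import datetime

def infer_home_cell(pings, night_start=22, night_end=6, precision=3):
    """Return the most frequent rounded (lat, lon) cell among night-time pings.

    pings: iterable of (timestamp, lat, lon) tuples for ONE pseudonymous ID.
    The night window and grid precision are illustrative assumptions.
    """
    cells = Counter()
    for ts, lat, lon in pings:
        # Keep only pings that fall inside the assumed night window.
        if ts.hour >= night_start or ts.hour < night_end:
            cells[(round(lat, precision), round(lon, precision))] += 1
    if not cells:
        return None
    return cells.most_common(1)[0][0]

# Synthetic pings for a single made-up device ID.
pings = [
    (datetime(2024, 1, 1, 23, 0), 40.71234, -74.00561),
    (datetime(2024, 1, 2, 2, 0), 40.71229, -74.00558),
    (datetime(2024, 1, 2, 10, 0), 40.75811, -73.98552),
    (datetime(2024, 1, 2, 23, 30), 40.71231, -74.00563),
    (datetime(2024, 1, 3, 14, 0), 40.75809, -73.98549),
]
home = infer_home_cell(pings)
print(home)
```

Even with jittered coordinates, the three night-time pings collapse into one grid cell, which is the cell you would then look up in an address listing.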

rockskon 3 hours ago | parent | next [-]

There is no such thing as anonymized location data when you have the location of something where and when they sleep and work.

It's a rhetorical fiction the ad industry tells itself.

Terr_ an hour ago | parent | next [-]

Right, there's probably no other phone in the world that typically stops for hours within 1000 feet of my bed and typically stops on Monday-Friday within 1000 feet of my work-desk.
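
A toy illustration of that uniqueness argument, assuming a tiny synthetic population where each device has already been reduced to a coarse home cell and work cell. The labels (`H1`, `W1`, ...) and the helper `unique_fraction` are hypothetical.

```python
from collections import Counter

def unique_fraction(keys):
    """Fraction of entries whose key appears exactly once in the list."""
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)

# Made-up devices mapped to (home cell, work cell).
devices = {
    "a": ("H1", "W1"),
    "b": ("H1", "W2"),
    "c": ("H2", "W1"),
    "d": ("H3", "W3"),
    "e": ("H3", "W3"),
}

homes_only = [home for home, _ in devices.values()]
home_work = list(devices.values())

print(unique_fraction(homes_only))  # share of devices unique by home cell alone
print(unique_fraction(home_work))   # share unique once the work cell is added
```

In this toy data only one device in five is unique by home cell, but three in five are unique by the home/work pair; only the two devices that share both cells stay ambiguous.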

Forgeties79 2 hours ago | parent | prev | next [-]

And with LLMs now it's easier than ever to piece the parts together. Companies were doing it before we even knew what LLMs were capable of.

Edit: It's a rhetorical fiction the ad industry tells us.

teraflop 2 hours ago | parent | prev | next [-]

We should have learned this lesson 20 years ago when researchers were able to deanonymize a lot of the Netflix Prize dataset, which contained nothing except movie ratings and their associated dates.

https://arxiv.org/abs/cs/0610105

If movie ratings are vulnerable to pattern-matching from noisy external sources, then it should be obvious that location data is enormously more vulnerable.
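
A rough sketch of what pattern-matching against noisy outside knowledge can look like. This is NOT the algorithm from the linked paper, just a simplified stand-in that counts approximate matches; the tolerances, record IDs, and data are all made up.

```python
def match_score(aux, record, rating_tol=1, day_tol=14):
    """Count auxiliary (rating, day) observations that approximately match a record."""
    score = 0
    for movie, (rating, day) in aux.items():
        if movie in record:
            rec_rating, rec_day = record[movie]
            if abs(rec_rating - rating) <= rating_tol and abs(rec_day - day) <= day_tol:
                score += 1
    return score

def best_match(aux, dataset):
    """Return the pseudonymous record ID with the highest match score."""
    return max(dataset, key=lambda rid: match_score(aux, dataset[rid]))

# "Anonymized" dataset: pseudonym -> {movie: (rating, day number)}.
dataset = {
    "user_17": {"A": (5, 100), "B": (2, 130), "C": (4, 200)},
    "user_42": {"A": (3, 400), "D": (5, 10)},
    "user_99": {"B": (2, 128), "C": (1, 50)},
}

# Noisy outside knowledge about one person (e.g. public reviews).
aux = {"A": (4, 103), "B": (2, 125), "C": (5, 210)}

print(best_match(aux, dataset))
```

None of the auxiliary observations match exactly, yet one record still stands out. Swap movies for grid cells and days for timestamps and the same idea applies to location traces.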

vovanidze 3 hours ago | parent | prev | next [-]

exactly. calling it 'anonymized' is pure security theater once you have enough data points to map out someone's daily routine.

waiting for legislation or EULAs to fix this is a lost cause since adtech always finds a loophole. the fix has to be architectural: moving toward stateless proxies that strip device identifiers at the edge before they even hit upstream servers. if the payload never touches a persistent db there is literally nothing to de-anonymize. stateless infra is the only sane way forward.
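
A minimal sketch of the "strip identifiers at the edge" idea, assuming a JSON-like payload. The field names in `IDENTIFIER_FIELDS` and the function `strip_identifiers` are hypothetical, and a real proxy would need to cover whatever identifiers its clients actually send.

```python
# Hypothetical identifier fields; a real deployment would maintain its own list.
IDENTIFIER_FIELDS = frozenset({"device_id", "advertising_id", "ip_address", "session_id"})

def strip_identifiers(payload):
    """Return a copy of the payload without identifier fields.

    Pure function: nothing is logged or stored, so the dropped values
    exist only for the lifetime of the request.
    """
    return {k: v for k, v in payload.items() if k not in IDENTIFIER_FIELDS}

incoming = {
    "device_id": "abc-123",
    "advertising_id": "f00d",
    "lat": 40.7,
    "lon": -74.0,
    "query": "weather",
}
forwarded = strip_identifiers(incoming)
print(forwarded)
```

Note that this only removes explicit identifiers. If precise coordinates are still forwarded and stored upstream, the re-identification concerns raised elsewhere in the thread may still apply.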

uxhacker 9 minutes ago | parent | next [-]

How is this legal under the GDPR? There are clear examples in the citizenlab document of a user being tracked inside the EU from outside.

Is there not also a requirement for clean consent? I.e., a weather app can't track your precise location?

microtonal 2 hours ago | parent | prev [-]

To be honest, I feel like this is where iOS and Android are failing us. Why is every app allowed to embed a bunch of trackers? Only blocking cross-app tracking on user request as iOS does is not enough (and data of different apps/websites can be correlated externally).

chimeracoder 6 minutes ago | parent | next [-]

> To be honest, I feel like this is where iOS and Android are failing us. Why is every app allowed to embed a bunch of trackers? Only blocking cross-app tracking on user request as iOS does is not enough (and data of different apps/websites can be correlated externally).

Even if Google and Apple both want to commit to fighting this, it becomes a game of whack-a-mole, because there are all sorts of different ways to track users that the platforms can't control.

As an easy example: every time you share an Instagram post/video/reel, they generate a unique link that is tracked back to you so they can track your social graph by seeing which users end up viewing that link. (TikTok does the same thing, although they at least make it more obvious by showing that in the UI with "____ shared this video with you").
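
A sketch of how per-share links could build a social graph. This is an assumption about the general mechanism, not a description of how Instagram or TikTok actually implement it; `ShareTracker`, the `?s=` parameter, and `example.com` are all hypothetical.

```python
import secrets

class ShareTracker:
    """Toy model: each share gets a unique token that maps back to the sharer."""

    def __init__(self):
        self.token_owner = {}  # token -> user who created the link
        self.edges = set()     # (sharer, viewer) pairs observed

    def create_link(self, sharer, post_id):
        # A fresh random token per share, so links are never reused.
        token = secrets.token_urlsafe(8)
        self.token_owner[token] = sharer
        return f"https://example.com/p/{post_id}?s={token}"

    def record_view(self, link, viewer):
        # The token in the URL reveals who sent the link to this viewer.
        token = link.rsplit("?s=", 1)[1]
        sharer = self.token_owner[token]
        self.edges.add((sharer, viewer))
        return sharer

tracker = ShareTracker()
link = tracker.create_link("alice", "42")
tracker.record_view(link, "bob")
print(tracker.edges)
```

Nothing here depends on an OS-level advertising ID, which is why platform-level tracking controls can't easily block it.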

rolph 2 hours ago | parent | prev | next [-]

i'm not sure about allowed. perhaps required may be closer.

why would someone include tech that makes people think twice about using the app, unless it is required if you want to "sell" in a particular venue?

if you're developing geolocation based apps, location tracking is a core function.

a calendar absolutely does not require location tracking beyond which side of the prime meridian you are on.

nickburns an hour ago | parent [-]

> if you're developing geolocation based apps, location tracking is a core function.

But the subsequent sale of that data is not—is the discussion here.

rolph 7 minutes ago | parent [-]

[delayed]

CPLX 2 hours ago | parent | prev [-]

Because we don’t enforce antitrust law in this country and the people that make those decisions profit from the ads.

sroussey 3 hours ago | parent | prev | next [-]

Companies exist that de-anonymize other data brokers' data. That lets the other data brokers claim they have anonymized data while end users get everything.

ImPostingOnHN 2 hours ago | parent [-]

you could probably run an anonymization company at the same time you run a de-anonymization company

gessha 38 minutes ago | parent [-]

Best of both worlds - legal and profitable /s

jandrewrogers 2 hours ago | parent | prev | next [-]

Location and identity are inextricably linked. You can't destroy identity without also destroying location and location is critical for myriad purposes.

The analytic reconstruction of identity from location is far more sophisticated than the scenarios people imagine. You don't need to know where they live to figure out who they are. Every human leaves a fingerprint in space-time.

nickburns 2 hours ago | parent | next [-]

> and location is critical for myriad purposes.

It's not though.

Critical for myriad elective purposes? Sure.

jandrewrogers 2 hours ago | parent [-]

Only if you consider the entire concept of logistics in civilization as "elective".

xphos an hour ago | parent | next [-]

Seems hyperbolic. We had logistics that functioned extremely well before we had customer location data for sale on third-party sites.

philipallstar 11 minutes ago | parent [-]

If you re-read the comment, they didn't say that selling it was intrinsic.

nickburns 2 hours ago | parent | prev | next [-]

I don't follow what you mean by 'logistics in civilization' as that's pretty vague and amorphous.

Could you be more specific with maybe a single example of where my physical geographic location is electronically critical for a purpose that isn't elective/optional/avoidable?

(And I'm not just trying to be obtuse. I think you're touching on at least part of the 'heart' of both this conversation and that of digital ID verification.)

quickthrowman an hour ago | parent | prev [-]

How does tracking the movements of individual humans aid shipping and logistics, other than providing traffic data to freight companies? How did we manage to have global supply chains prior to GPS being invented?

Edit: I assume I am missing a crucial part of logistics that you’re familiar with, genuinely curious.

ninalanyon 2 hours ago | parent | prev | next [-]

In what sense can the latitude and longitude of my house be called anonymous data?

kube-system 2 hours ago | parent [-]

Ultimately, a map is anonymous data containing the lat/lon of everyone's house.

Alone, these points are not deanonymizing; it's when there's other data associated with them that they become so.

1121redblackgo 3 hours ago | parent | prev | next [-]

Yep. With side channels, or thinking one order above the laws, it's trivial to get around said laws. We need better laws.

malfist 2 hours ago | parent | prev [-]

> A lot of geolocation data on the market is anonymized

A lot isn't good enough.