OtherShrezzing 2 days ago

> The easiest way of solving math problems with an LLM is to make sure that very similar problems are included in the training set. Many of the AI achievements would probably look a lot less miraculous if one could check the training data

I'm fairly certain this phenomenon is responsible for LLM capabilities on GeoGuessr-type games. They have unreasonably good performance: for example, they can identify obscure locations from featureless/foggy pictures of a bench. GeoGuessr's entire dataset, including GPS metadata, is definitely included in all of the frontier model training datasets, so it should be unsurprising that they have excellent performance in that domain.

ACCount36 a day ago

People tried VLMs on "closed set" GeoGuessr-type tasks, i.e. non-Street View photos in a similar style, not published anywhere.

They still kicked ass.

It seems like those AIs just have an awful lot of location familiarity. They've seen enough tagged photos to be able to pick up on the patterns, and generalize that to kicking ass at GeoGuessr.

YetAnotherNick 2 days ago

> GeoGuessr's entire dataset

No, it is not included. However, there must be quite a lot of pictures on the internet for most cities. GeoGuessr's data is the same as Google's Street View data, which probably contains billions of 360-degree photos.

suddenlybananas 2 days ago

Why do you say it's not included? Why wouldn't they include it?

sebzim4500 a day ago

If every photo in Street View were included in the training data of a multimodal LLM, it would be like 99.9999% of the training data/resource costs.

It just isn't plausible that anyone has actually done that. I'm sure some people include a small sample of them, though.

bluefirebrand a day ago

Why would every photo in Street View be required in order to have GeoGuessr's dataset in the training data?

bee_rider a day ago

I’m pretty sure they are saying that GeoGuessr just pulls directly from Google Street View. There isn’t a separate GeoGuessr dataset; it just pulls from Google’s API (at least that’s what Wikipedia says).

bluefirebrand a day ago

I suspect that GeoGuessr's dataset is a subset of Google Street View, but maybe it really is just pulling everything directly.

bee_rider a day ago

My guess would be that they pull directly from Street View, maybe with some extra filtering for interesting locations.

Why bother to create a copy, if it can be avoided, right?

clbrmbr 21 hours ago

Yet.

This is a good rebuttal when someone quips that we “are about to run out of data”. There’s oh so much more, just not in the form of books and blogs.

ivape 2 days ago

I just saw a video on Reddit where a woman still managed to take a selfie while being literally face to face with a black bear. There’s definitely way too much video training data out there for everything.

lutusp a day ago

> I just saw a video on Reddit where a woman still managed to take a selfie while being literally face to face with a black bear.

This is not uncommon. Bears aren't always tearing people apart; that's a movie trope with little connection to reality. Black bears in particular are smart and social enough to befriend their food sources.

But a hungry bear, or a bear with cubs, that's a different story. Even then bears may surprise you. Once in Alaska, a mama bear got me to babysit her cubs while she went fishing -- link: https://arachnoid.com/alaska2018/bears.html .