m_kos 3 days ago

GitHub of the person who prepared the data. I am curious how much compute was needed for NY. I would love to do it for my metro but I suspect it is way beyond my budget.

https://github.com/yz3440

(The commenters below are right. It is the Maps API, not compute, that I should worry about. Using the free tier, it would have taken the author years to download all tiles. I wish I had their budget!)

LeifCarrotson 3 days ago | parent | next [-]

I would wager the compute for the OCR is cheap. Just get a beefy local desktop PC; if it runs overnight or even takes a week, that's fine.

It's the Google Maps API costs that will sink your project if you can't get them waived as art:

https://mapsplatform.google.com/pricing/

Not sure how many panoramas there are in New York or your metro, but if it's over the free tier you're talking thousands of dollars.

daemonologist 3 days ago | parent | prev | next [-]

The linked article mentions that they ingested 8 million panos - even if they're scraping the dynamic viewer that's $30k just in street view API fees (the static image API would probably be at least double that due to the low per-call resolution).

OCR I'd expect to be comparatively cheap, if you weren't in a hurry - a consumer GPU running PaddlePaddle server can do about 4 MP per second. If you spent a few grand on hardware that might work out to 3-6 months of processing, depending on the resolution per pano and size of your model.
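As a rough back-of-envelope check on that estimate, here is a small Python sketch. The 4 MP/s throughput and the 8 million pano count come from this thread; the per-pano resolutions (4 MP and 8 MP) and the single-GPU setup are assumptions chosen only for illustration, not figures from the project.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ocr_days(panos: int, mp_per_pano: float,
             mp_per_second: float = 4.0, gpus: int = 1) -> float:
    """Estimate wall-clock days needed to OCR a set of panoramas.

    mp_per_second is the per-GPU throughput quoted in the thread;
    mp_per_pano and gpus are hypothetical inputs.
    """
    total_megapixels = panos * mp_per_pano
    seconds = total_megapixels / (mp_per_second * gpus)
    return seconds / SECONDS_PER_DAY


# Assumed resolutions: 4 MP and 8 MP per pano (illustrative only).
for mp in (4.0, 8.0):
    print(f"{mp:.0f} MP/pano: {ocr_days(8_000_000, mp):.0f} days")
```

Under those assumptions the result lands at roughly 93 and 185 days, which is in line with the 3-6 month range above. Higher-resolution panos or a slower model would scale the number up proportionally.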

swores 3 days ago | parent [-]

Their write-up (linked at top of page below main link, and in a comment) says:

> "media artist Yufeng Zhao fed millions of publicly-available panoramas from Google Street View into a computer program that transcribes text within the images (anyone can access these Street View images; you don’t even need a Google account!)."

Maybe they used multiple IPs / devices and didn't want to mention doing something technically naughty to get around Google's free limits, or maybe they somehow didn't hit a limit doing it as a single user? Either way, it doesn't sound like they had to pay if they only mention not needing an account.

(Or maybe they just thought people didn't need to know that they had to pay, and that readers would just want the free access to look up a few images, rather than a whole city's worth?)

Antrikshy 3 days ago | parent [-]

Any possibility these are user-submitted panoramas, and maybe they don't charge for those?

ks2048 3 days ago | parent | prev | next [-]

It says 8 million images. So, 13.2 images/second for one week.
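The arithmetic behind that figure, as a minimal Python sketch. The 8 million image count and the one-week window are from the comment above; the function name is just illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_rate(total_images: int, days: float) -> float:
    """Sustained images per second needed to fetch total_images in `days`."""
    return total_images / (days * SECONDS_PER_DAY)


rate = required_rate(8_000_000, 7)
print(f"{rate:.1f} images/second")  # prints "13.2 images/second"
```

Changing the `days` argument shows how the required rate drops if the download is spread over a longer period, e.g. `required_rate(8_000_000, 30)` for a month.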

I'm wondering more about the data - did they use Google's API or work with Google to use the data?

puppymaster 3 days ago | parent | prev [-]

I just hashed out the details with Claude. Apparently it would cost me ~8k USD to retrieve all Taipei street images from the Google Maps API at 3 m density. Expensive, but not impossible.