random3 11 hours ago
I built a Flash crawler to index all Flash while at Adobe. I think it started with the Alexa top 1M and crawled from there. This was around 2008-2010, I think, so we had to do a lot of custom stuff, but we basically crawled, then ran a headless Firefox with a custom headless Flash player that dumped a ton of data, so we also analyzed every Flash file at runtime and indexed all of that. We built a dedicated cluster in a colocation center in Bucharest to handle all of this. We had issues with max floor weights and whatnot. Then we had to upgrade the RAM on the cluster. No remote hands. Every operation was a trip to a really cold place. We used a lot of early-stage stuff like Nutch, Hadoop, HBase, etc. Everything was then processed and dumped to an SQL database with a nice UI on top. It took a few weeks to set it up, then we passed it to a team of interns that built the SQL database and UI on top. They learned a ton of stuff. Some are now in the Bay Area. The tool uncovered a ton of security issues. It was fun building it. I wonder if Adobe kept the data. It could be useful and/or a good donation for the Computer History Museum.
adithyassekhar 10 hours ago
Thanks for sharing. It's stories like these, which I've read since childhood, that got me into this. Those little adventures into remote places to work on some computers. This was my version of Indiana Jones. But everyone's in an AWS world right now.
mmooss 8 hours ago
Very interesting. What was the objective?