dekhn 5 hours ago

That doc predates 2012 significantly.

From what I've been able to glean, it was basically created in the first few years Jeff worked at Google, on indexing and serving for the original search engine. For example, the comparison of cache, RAM, and disk determined whether data was stored in RAM (the index, used for retrieval) or on disk (the documents, typically not used in retrieval, but used in scoring). Similarly, the comparison of California-Netherlands round-trip time: I believe Google's first international data center was in NL, and they needed to decide between copying over the entire index in bulk and serving backend queries in the US with frontends in NL.

The numbers were always going out of date; for example, the arrival of flash drives changed disk latency significantly. I remember Jeff came to me one day and said he'd invented a compression algorithm for genomic data "so it can be served from flash" (he thought it would be wasteful to use precious flash space on uncompressed genomic data).