Aurornis 8 hours ago

A meta-note on the title since it looks like it’s confusing a lot of commenters: The title is a play on Jeff Dean’s famous “Latency Numbers Every Programmer Should Know” from 2012. It isn’t meant to be interpreted literally. There’s a common theme in CS papers and writing to write titles that play upon themes from past papers. Another common example is the “_____ considered harmful” titles.

shanemhansen 7 hours ago | parent | next [-]

Going to write a real banger of a paper called "latency numbers considered harmful is all you need" and watch my academic cred go through the roof.

AnonymousPlanet 4 hours ago | parent [-]

" ... with an Application to the Entscheidungsproblem"

Kwpolska 7 hours ago | parent | prev | next [-]

This title only works if the numbers are actually useful. Those are not, and there are far too many numbers for this to make sense.

Aurornis 7 hours ago | parent [-]

The title wasn't meant to be taken literally, as in you're supposed to memorize all of these numbers. It was meant as an in-joke reference to the original writing to signal that this document was going to contain timing values for different operations.

I completely understand why it's frustrating or confusing by itself, though.

willseth 7 hours ago | parent | prev | next [-]

Good callout on the paper reference, but this author gives every indication that he’s dead serious in the first paragraph. I don’t think commenters are confused.

dekhn 5 hours ago | parent | prev [-]

That doc predates 2012 significantly.

From what I've been able to glean, it was basically created in the first few years Jeff worked at Google, on indexing and serving for the original search engine. For example, the comparison of cache, RAM, and disk determined whether data was stored in RAM (the index, used for retrieval) or on disk (the documents, typically not used in retrieval, but used in scoring). Similarly, the comparison of California-Netherlands time: I believe Google's first international data center was in NL, and they needed to make decisions about copying over the entire index in bulk versus serving backend queries in the US with frontends in NL.

The numbers were always going out of date; for example, the arrival of flash drives changed disk latency significantly. I remember Jeff came to me one day and said he'd invented a compression algorithm for genomic data "so it can be served from flash" (he thought it would be wasteful to use precious flash space on uncompressed genomic data).