ralegh 5 days ago

This is fine assuming the popular request types don’t change, but arguably, if both new versions of matching are sufficiently fast, I would prefer Ken’s in the long term, as the other could become slow again if the distribution of request types changes.

sfilmeyer 5 days ago | parent | next [-]

As a counterpoint, what fraction of the future engineers who will touch the project are likely to be able to competently edit the finite automata based version without introducing bugs and what fraction will be able to competently edit the if statement that checks the particular policy?

mikepurvis 5 days ago | parent [-]

A further question mark is whether any of this has sufficient instrumentation to be able to notice and act on a change if and when it occurs.

andrepd 5 days ago | parent | prev | next [-]

Nonsense. The pre-check can literally be one line (if common_case {fast_path()} else {slow_path()}), and thus enabling or disabling it is dead simple and obvious if the problem changes in the future.

Lines of thinking like that are part of the reason most modern software is so sloooow :)
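As a minimal sketch of that one-line pre-check (all names here are hypothetical stand-ins, not the actual system's code):

```python
COMMON_CASE = "GET /index.html"
FAST_PATH_ENABLED = True  # flip this one flag if the request mix ever changes

def fast_path(request: str) -> str:
    # Precomputed answer for the overwhelmingly common request.
    return "allow"

def slow_path(request: str) -> str:
    # Stand-in for the general matcher (think: Ken's automata version).
    return "allow" if request.startswith("GET ") else "deny"

def match(request: str) -> str:
    if FAST_PATH_ENABLED and request == COMMON_CASE:
        return fast_path(request)
    return slow_path(request)
```

Disabling the optimization really is a one-token change, and the general path still handles everything.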

Rendello 5 days ago | parent | next [-]

This situation where two paths produce the same output but one is optimized is the easiest case in property-based testing, as the property is just:

  normal(x) == optimized(x)
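A stdlib-only sketch of that property (a library like Hypothesis would be more idiomatic; the two implementations here are illustrative stand-ins):

```python
import random

def normal(haystack: bytes, needle: bytes) -> int:
    """Brute force: obviously correct, possibly slow."""
    if not needle:
        return 0
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i:i + len(needle)] == needle:
            return i
    return -1

def optimized(haystack: bytes, needle: bytes) -> int:
    """Stand-in for the clever version (here just bytes.find)."""
    return haystack.find(needle)

def check_property(trials: int = 2000) -> None:
    # Small alphabet so matches and near-matches are common.
    rng = random.Random(0)
    for _ in range(trials):
        haystack = bytes(rng.randrange(4) for _ in range(rng.randrange(30)))
        needle = bytes(rng.randrange(4) for _ in range(rng.randrange(5)))
        assert normal(haystack, needle) == optimized(haystack, needle)
```

Random inputs shake out the edge cases (empty needle, needle longer than haystack) that hand-written examples tend to miss.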
Stratoscope 5 days ago | parent | next [-]

I have sometimes done just this. First I write the simplest possible brute force code, something that any competent programmer can look at and say, "Yes, this may be slow, but it is straightforward and obvious that it will handle all cases."

Then I write the optimized code and use a test like yours to compare the results between the simple code and the optimized code.

One time I needed to write a function to search for a specific pattern in a binary file and change it. So I wrote the brute force code as a first step, the same code that anyone would probably write as a simple solution. It worked the first time, and a couple of people reviewed the code and said "yep, even if it's slow, it is correct."

But this code took more than a second to run!

Of course I thought about optimizing it with Boyer-Moore or the like. Then I went, "Hold on to your horses. This isn't something like a web page load where one second matters. It's part of a build process that only runs a few times a day and already takes several minutes to run. One extra second is nothing!"

In the wise words of Kenny Rogers in The Gambler:

  You got to know when to hold 'em,
  know when to fold 'em
  Know when to walk away
  and know when to run
mark-r 4 days ago | parent | next [-]

Knuth's famous quote about premature optimization captures this perfectly.

"Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

scott_w 4 days ago | parent | prev | next [-]

Also true for deciding whether to write code at all! About 15 years ago I was working with a junior who'd spent 3 hours trying to automate some data editing until I said "mate, you can just edit it all by hand in about an hour!"

taeric 4 days ago | parent | next [-]

I had a similar one where a billing mistake had happened in a system and the other devs were trying to find out how to determine all of the bad values. I asked if they had simply looked at the top N that could fit on their screen sorted by value. The incorrect values were the obvious outliers. Only about 4 of them. They had been blocked on identifying them for about an hour.

VBprogrammer 4 days ago | parent | prev | next [-]

Funnily enough, with LLMs this trade-off may well have flipped. For simple tasks like "given a string like X, format it like Y", they work amazingly well.

scott_w 4 days ago | parent [-]

Absolutely. Even back then, trying to script it would sometimes be fine. I've had plenty of situations where I'd try to write a script, give it 30 minutes then say "fuck it, I'm hand-fixing it." The same probably applies to LLMs, you just need to set a cut-off point before it stops being worthwhile.

lettuceconstant 4 days ago | parent | prev [-]

Sure, but I have to obtain my dopamine somehow.

throwaway-kg8d5 4 days ago | parent | prev | next [-]

Very true. I would have gone with

  Every coder knows
  the secret to optimizin’
  is knowing what to throw away
  and knowing what to keep
4 days ago | parent | prev [-]
[deleted]
andrepd 4 days ago | parent | prev [-]

Absolutely! Property testing is a great but underutilised technique. I use it a lot whenever I'm in a language that makes it easy (and has a mature library for it), like Haskell, OCaml, or Rust.

Rendello 4 days ago | parent [-]

The simple cases are great demos, and then the complex cases melt your brain! But still worth it a lot of the time.

vlovich123 5 days ago | parent | prev | next [-]

Hyperoptimizing for the fast path today and ignoring that hardware and usage patterns change is the reason modern software is so slooow :)

A more robust strategy would at least be to check whether the rule is the same as the previous one (or use a small hash table), so that the system is self-healing.

Ken’s solution is at least robust, and on that basis I would prefer it: it’s just as fast, and it doesn’t have any weird tail latencies, since requests outside your cached distribution are as fast as the ones inside it.
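A rough sketch of that self-healing idea: instead of hard-coding the common case, remember the last rule that matched, a 1-entry cache (names are illustrative, not from the original system):

```python
class OneEntryCache:
    """Remember the last (request, result) pair; fall back to the
    general matcher on a miss. The cache adapts by itself if the
    request distribution shifts."""

    def __init__(self, general_match):
        self._general_match = general_match
        self._last_request = None
        self._last_result = None

    def match(self, request):
        if request == self._last_request:
            return self._last_result          # fast path: repeated request
        result = self._general_match(request)  # slow, general path
        self._last_request, self._last_result = request, result
        return result
```

Unlike a hard-coded check, nothing needs to be dug up and rewritten when the popular request type changes.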

Jean-Papoulos 4 days ago | parent | next [-]

You were shown an example of exactly why this thinking is incorrect but you still insist...

Also, it's trivial to keep Ken's implementation as the slow path. If request patterns change, dig up the new fast path and put the old one in Ken's slow path code. Most of the performance will still come from the initial `if`.

vlovich123 4 days ago | parent [-]

It’s ungenerous to assume I would be against the if statement + Ken’s. But Ken’s approach is critically important, and the “if” statement should just be a 1-entry cache instead of being hard-coded. Getting this stuff right in a future-proofed, durable way is actually quite hard even when you notice the opportunity.

necovek 5 days ago | parent | prev [-]

Nobody is hyperoptimizing the fast path today.

Ken's solution was stated to have been slower than the alternative optimization.

akie 4 days ago | parent [-]

Ken's solution optimized the general case, basically everything that doesn't match the if-statement.

ants_everywhere 5 days ago | parent | prev [-]

You can even track the request statistics live and disable the fast path if the distribution of requests changes significantly.
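One way that could look, as a sketch (the window size and threshold are made up, and the three callbacks are hypothetical):

```python
from collections import deque

class AdaptiveFastPath:
    """Track how often the fast path actually hits over a sliding
    window, and stop taking it when the hit rate drops."""

    def __init__(self, is_common, fast, slow, window=1000, min_hit_rate=0.5):
        self.is_common, self.fast, self.slow = is_common, fast, slow
        self.hits = deque(maxlen=window)   # recent True/False hit record
        self.min_hit_rate = min_hit_rate

    def handle(self, request):
        common = self.is_common(request)
        self.hits.append(common)
        hit_rate = sum(self.hits) / len(self.hits)
        # Only take the fast path while it's still paying off.
        if common and hit_rate >= self.min_hit_rate:
            return self.fast(request)
        return self.slow(request)
```

Both paths must return the same answer, so disabling the fast path only costs speed, never correctness.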

scott_w 4 days ago | parent | prev [-]

I think you missed the point. Ken's version wasn't removed, it was simply prepended with something like:

  if value in most_common_range:
    take_fast_path()
  else:
    use_kens_solution()