Remix.run Logo
jrockway 5 days ago

The goal of tests is not to prevent you from changing the behavior of your application. The goal is to preserve important behaviors.

If you can't tell if a test is there to preserve existing happenstance behavior, or if it's there to preserve an important behavior, you're slowed way down. Every red test when you add a new feature is a blocker. If the tests are red because you broke something important, great. You saved weeks! If the tests are red because the test was testing something that doesn't matter, not so great. Your afternoon was wasted on a distraction. You can't know in advance whether something is a distraction, so this type of test is a real productivity landmine.

Here's a concrete, if contrived, example. You have a test that starts your app up in a local webserver, and requests /foo, expecting to get the contents of /foo/index.html. One day, you upgrade your web framework, and it has decided to return a 302 Moved redirect to /foo/index.html, so that URLs are always canonical now. Your test fails with "incorrect status code; got 302, want 200". So now what? Do you not apply the version upgrade? Do you rewrite the test to check for a 302 instead of a 200? Do you adjust the test HTTP client to follow redirects silently? The problem here is that you checked for something you didn't care about, the HTTP status, instead of only checking for what you cared about, that "GET /foo" gets you some text you're looking for. In a world where you let the HTTP client follow redirects, like human-piloted HTTP clients, and only checked for what you cared about, you wouldn't have had to debug this to apply the web framework security update. But since you tightened down the screws constraining your application as tightly as possible, you're here debugging this instead of doing something fun.

(The fun doubles when you have to run every test for every commit before merging, and this one failure happened 45 minutes in. Goodbye, the rest of your day!)

HappMacDonald 4 days ago | parent [-]

This example smells a lot like "overfit" in AI training as well.