What if writing tests was a joyful experience? (2023)(blog.janestreet.com)
43 points by ryanhn 6 hours ago | 15 comments
mlmonkey 2 hours ago | parent | next [-]

Recently, I have given up on writing unit tests, instead prompting an LLM to write them for me. I just sit back and keep prompting it until it gets it right. Sometimes it goes a little haywire in our monorepo, but I don't have to accept its changes.

It feels ... strangely empowering.

mayoff 2 hours ago | parent | prev | next [-]

If you’re a Swift programmer, the swift-snapshot-testing package is a great implementation of these ideas.

https://github.com/pointfreeco/swift-snapshot-testing

tomhow 2 hours ago | parent | prev | next [-]

Discussed at the time:

What if writing tests was a joyful experience? - https://news.ycombinator.com/item?id=34350749 - Jan 2023 (122 comments)

deathanatos 3 hours ago | parent | prev | next [-]

> You start writing assert fibonacci(15) == ... and already you’re forced to think. What does fibonacci(15) equal? If you already know, terrific—but what are you meant to do if you don’t?

Um …duh? Get out a calculator. Consult a reference, etc. Otherwise compute the result, and ensure you've done that correctly, ideally as independent of the code under test as possible. A lot of even mathematical stuff has "test vectors"; e.g., the SHA algorithms.
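
A minimal sketch of the "consult a reference" approach described above: the expected values are the well-known terms of the Fibonacci sequence, written down independently of the code under test. The `fibonacci` implementation and the `known_values` table are hypothetical stand-ins, not code from the article.

```ocaml
(* Hypothetical code under test: an iterative Fibonacci with F(0) = 0. *)
let fibonacci n =
  let rec go a b i = if i = 0 then a else go b (a + b) (i - 1) in
  go 0 1 n

(* Reference values from the standard sequence, written down by hand
   rather than produced by running the function above. *)
let known_values = [ (0, 0); (1, 1); (2, 1); (10, 55); (15, 610); (20, 6765) ]

let () =
  List.iter
    (fun (n, expected) ->
      let actual = fibonacci n in
      if actual <> expected then
        failwith
          (Printf.sprintf "fibonacci %d = %d, expected %d" n actual expected))
    known_values;
  print_endline "all reference values match"
```

The table plays the role that published test vectors play for something like SHA: the check only means something because the expected side did not come from the implementation being tested.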

> Here’s how you’d do it with an expect test:

  printf "%d" (fibonacci 15);
  [%expect {||}]

> The %expect block starts out blank precisely because you don’t know what to expect. You let the computer figure it out for you. In our setup, you don’t just get a build failure telling you that you want 610 instead of a blank string. You get a diff showing you the exact change you’d need to make to your file to make this test pass; and with a keybinding you can “accept” that diff. The Emacs buffer you’re in will literally be overwritten in place with the new contents:

…you're kidding me. This is "fix the current state of the function — whether correct or not — as the expected output."

Yeah… no kidding that's easier.

We gloss over errors — "some things just looked incorrect" — well, but how do you know that any differently than fib(10)?
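
The objection above is easier to weigh with the mechanism spelled out. What follows is a toy stand-in for an expect-test runner, assuming nothing about how ppx_expect is actually implemented: `run_expect` and `outcome` are hypothetical names. It captures what the test body prints, compares it with the recorded expectation, and on a mismatch proposes the captured output as the new expectation.

```ocaml
(* Hypothetical code under test. *)
let fibonacci n =
  let rec go a b i = if i = 0 then a else go b (a + b) (i - 1) in
  go 0 1 n

type outcome =
  | Pass
  | Corrected of string (* output proposed as the new expectation *)

(* Run [f] against a fresh buffer and compare what it wrote with [expected].
   Toy illustration only: a real runner would rewrite the source file. *)
let run_expect ~expected f =
  let buf = Buffer.create 16 in
  f buf;
  let actual = Buffer.contents buf in
  if String.equal actual expected then Pass else Corrected actual

let () =
  (* The expectation starts out blank, as in the quoted example. *)
  match
    run_expect ~expected:"" (fun buf ->
        Printf.bprintf buf "%d" (fibonacci 15))
  with
  | Pass -> print_endline "ok"
  | Corrected actual ->
      Printf.printf "expectation should become: {|%s|}\n" actual
```

Note that the runner proposes whatever the function currently prints, right or wrong. In this sketch nothing checks that 610 is the correct answer; that judgment is left to whoever reviews and accepts the correction.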

Storment33 2 hours ago | parent | next [-]

It is called snapshot testing, very valid technique. Maybe not best suited to a mathematical function like they have here, but I have found it useful for stuff like compilers asserting on the AST, where it would be a pain to write out and assert on the output and may also change shape.
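
A rough sketch of the AST case, assuming a tiny expression type invented for illustration (`expr`, `show`, `parsed`, and `snapshot` are all hypothetical names). The AST is rendered to text and compared with a recorded string, so a change in the tree's shape shows up as a textual difference instead of a hand-written structural assertion.

```ocaml
(* Hypothetical AST for a tiny arithmetic language. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

(* Render an AST as an S-expression so it can be compared as plain text. *)
let rec show = function
  | Num n -> string_of_int n
  | Add (l, r) -> Printf.sprintf "(add %s %s)" (show l) (show r)
  | Mul (l, r) -> Printf.sprintf "(mul %s %s)" (show l) (show r)

(* Stand-in for what a parser might return for "1 + 2 * 3". *)
let parsed = Add (Num 1, Mul (Num 2, Num 3))

(* The recorded snapshot: reviewed once by a human, then kept with the test. *)
let snapshot = "(add 1 (mul 2 3))"

let () =
  let actual = show parsed in
  if String.equal actual snapshot then print_endline "snapshot matches"
  else Printf.printf "snapshot differs:\n- %s\n+ %s\n" snapshot actual
```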

xenophonf 6 minutes ago | parent [-]

TIL. That looks like a nice way to add tests to legacy code without having to re-create what TDD would have produced had the developers started that way.

nippoo 2 hours ago | parent | prev | next [-]

A lot of tests are designed as regression prevention. You know the system is working as designed, but what if somebody comes along and changes the Fibonacci function to compute much more efficiently (and, in the process, makes some arithmetic errors)?

mikrl an hour ago | parent | prev [-]

I think “test the function does what it does” is not necessarily the intent here, it’s being able to write tests that fill themselves in and assuming you’ll double check afterwards.

That said, I don’t see how it’s much different to TDD (write the test to fail, write the code to pass the test) aside from automating adding the expected test output.

So I guess it’s TDD that centres the code, not the test…

breatheoften 3 hours ago | parent | prev | next [-]

I really like this style of testing -- code that can be tested this way is also the most fun kind of code to work with and the most likely to behave predictably.

I love determinism and plain old data.

Joel_Mckay 2 hours ago | parent [-]

Could look at high-level constraint modelling languages:

https://www.minizinc.org/

It often bypasses the need to get bogged down in probabilistic markdown syntax =3

https://www.youtube.com/watch?v=X6WHBO_Qc-Q

TacticalCoder 2 hours ago | parent | prev | next [-]

Amazing to see Jane Street uses Emacs. And property-based testing too.

> you don’t just get a build failure telling you that you want 610 instead of a blank string

So I had to scratch my head a bit because I was thinking: "Wait, the whole point is that you don't know whether what you're testing is correct or not, so how can you rely on that as input to your tests!?".

But even though I don't yet understand everything they do, I do see at least one big case where it makes lots of sense. And it happens to be a case where a lot of people see the benefits of test: before refactoring.

> What does fibonacci(15) equal? If you already know, terrific—but what are you meant to do if you don’t?

Yeah, a common one is to reuse a function in the same language which you believe is correct (you probably haven't proven it to be correct). Another typical one is to reuse a similar function from another language (once again, it has probably not been proven correct). But if the two implementations differ, you know you have an issue.
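
A minimal sketch of that "compare two implementations" idea, assuming two hypothetical functions: `fib_naive`, written straight from the definition, and `fib_fast`, a fast-doubling version standing in for the optimized code. Agreement proves neither correct, but any disagreement pinpoints an input worth investigating.

```ocaml
(* Slow reference, straight from the definition. *)
let rec fib_naive n =
  if n < 2 then n else fib_naive (n - 1) + fib_naive (n - 2)

(* Faster candidate using the fast-doubling identities:
   F(2k)   = F(k) * (2 * F(k+1) - F(k))
   F(2k+1) = F(k)^2 + F(k+1)^2 *)
let fib_fast n =
  (* [go n] returns the pair (F(n), F(n+1)). *)
  let rec go n =
    if n = 0 then (0, 1)
    else
      let a, b = go (n / 2) in
      let c = a * ((2 * b) - a) in
      let d = (a * a) + (b * b) in
      if n mod 2 = 0 then (c, d) else (d, c + d)
  in
  fst (go n)

let () =
  for n = 0 to 25 do
    let expected = fib_naive n and actual = fib_fast n in
    if expected <> actual then
      failwith
        (Printf.sprintf "mismatch at n=%d: naive=%d fast=%d" n expected actual)
  done;
  print_endline "implementations agree for n = 0..25"
```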

> let d = create_marketdata_processor () in
> (* Do some preprocessing to define the symbol with id=1 as "APPL" *)

Typo. It's AAPL, not APPL. It's correctly used as AAPL later on.

FWIW, writing tests had better become a joyful experience, for we're going to need a lot of these with all our AI-generated code.

zem an hour ago | parent [-]

> And it happens to be a case where a lot of people see the benefits of test: before refactoring.

it's also very nice if you have a test-last working style, that is, develop the code first using some sort of ad hoc testing method, then when you're convinced it's working you add tests both as a final check that the output is what you expect across a lot of different corner cases, and to prevent regressions as you continue development.

o_nate 6 hours ago | parent | prev | next [-]

This is a cool idea. I wish something like this existed for C#.

PretzelPirate 3 hours ago | parent [-]

An agentic coding tool like GitHub Copilot will do this for you.

3vidence an hour ago | parent | prev [-]

In my experience, the lack of joy in tests (or the difficulty with them) almost always comes from the test environment being different enough from the real environment that you end up stretching your code to fit into the test env instead of actually testing what you are interested in.

This doesn't apply to very simple functions, but tests on simple functions are the least interesting/valuable.