g947o 4 hours ago

I think Rust is great for agents, for a reason that is rarely mentioned: unit tests are in the same file. This means that agents just "know" they should update the tests along with the source.

With other languages, whether it's TypeScript/Go/Python, even if you explicitly ask agents to write/run tests, after a while they just forget to do so unless skipping the tests causes build failures. You have to constantly remind them as the session goes on. Never happens with Rust in my experience.
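For readers unfamiliar with the convention being described, here is a minimal sketch of a Rust source file with its unit tests in the same file. The function `clamp_percent` and the test names are hypothetical, chosen only for illustration; they do not come from the thread.

```rust
/// Clamp a raw percentage into the range 0..=100.
/// (Hypothetical example function, not from the thread.)
pub fn clamp_percent(raw: i32) -> u8 {
    // After clamping, the value always fits in a u8.
    raw.clamp(0, 100) as u8
}

// The test module sits next to the code it covers and is only
// compiled when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn clamps_out_of_range_values() {
        assert_eq!(clamp_percent(-5), 0);
        assert_eq!(clamp_percent(150), 100);
    }

    #[test]
    fn keeps_in_range_values() {
        assert_eq!(clamp_percent(42), 42);
    }
}
```

Because the tests live a few lines below the function, anything that edits `clamp_percent` has the tests in view as well, which is presumably the effect the comment is pointing at.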

jimbokun 26 minutes ago | parent | next [-]

Even LLMs know they should write tests but hate doing it.

0x3f 4 hours ago | parent | prev | next [-]

You can add a callback (hook) to e.g. Claude to guarantee it runs cargo check and cargo test.

unshavedyak 4 hours ago | parent [-]

Fwiw i used to do this (and with lints) - it was the only way to make Claude consistent in the early days when i first started using it (~August 2025).

For many months now though, Claude has been nearly consistent about calling both test and check/clippy. Perhaps this is due to my global memory file, not sure to be honest.

What i do know is that i never use those hooks; i have them disabled atm. Why? Because the benefit is almost nonexistent, as i mentioned, and the cost is at times quite high. It means i cannot work on a project piecemeal, aka "only focus on this file, it will not compile and that's okay"; instead it forces claude to make complete edits, which may be harder to review. Worst of all, i have seen it get into a loop and be unable to exit. Eg a test fails and claude says "that failure is not due to my changes" or w/e, and it just does that.. forever, on loop. Burns 100% of the daily tokens pretty quick if unmonitored.

Fwiw i've not looked to see if there's an alternate way to write hooks. It might be worth having the hook only suggest, rather than forcing claude. Alternatively, maybe i could spawn a subagent to review if stopping claude makes sense.. hmm.

0x3f 8 minutes ago | parent [-]

I find this doesn't work automatically for me because the projects I'm on have a lot of conditional-compilation feature flags, and it doesn't quite understand how to run cargo check against them properly unless I tell it.

Maybe for your case you could create a /maybe-check command, and run that in the hook? Then specify the conditions under which a check/test is needed in there.

wakawaka28 2 hours ago | parent | prev [-]

Unit tests in the same file waste context and make the whole thing hard to navigate for humans and machines alike.

jimbokun 26 minutes ago | parent | next [-]

It’s about the best possible documentation.

dnautics 2 hours ago | parent | prev | next [-]

nah, the agents jump around files anyways.

J_Shelby_J 2 hours ago | parent | prev [-]

I’ve been writing as few unit tests as possible and using debug asserts instead.
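As a rough illustration of the debug-assert approach (a sketch only, not the commenter's actual code), invariants can be checked inline with `debug_assert!`, which is compiled out of release builds by default. The function `mean` and its "non-empty input" invariant are assumptions made up for this example.

```rust
/// Returns the arithmetic mean of `samples`.
/// Assumed invariant for this sketch: callers never pass an empty slice.
pub fn mean(samples: &[f64]) -> f64 {
    // Checked in debug builds; removed when debug assertions are disabled.
    debug_assert!(!samples.is_empty(), "mean() requires at least one sample");

    let sum: f64 = samples.iter().sum();
    let result = sum / samples.len() as f64;

    // Postcondition: a non-empty slice of finite inputs should give a finite mean.
    debug_assert!(result.is_finite(), "mean() produced a non-finite value");
    result
}
```

The trade-off is that these checks only fire on code paths that actually run in a debug build, so they complement tests rather than replace the coverage tests provide.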

0x3f 7 minutes ago | parent [-]

Normally I would put as many invariants in the types as possible, then tests cover the rest. I'm curious how you do this/what you use it for though. Would be cool if you had any examples.
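A minimal sketch of what "putting invariants in the types" can look like, for comparison with the debug-assert style. The `Percent` newtype and the `remaining` function are hypothetical names invented for this example.

```rust
/// A percentage guaranteed to be in 0..=100.
/// The inner field is private, so the only way to build one is `Percent::new`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Percent(u8);

impl Percent {
    /// Returns `None` when `value` is out of range, so an invalid
    /// `Percent` can never exist.
    pub fn new(value: u8) -> Option<Self> {
        if value <= 100 {
            Some(Self(value))
        } else {
            None
        }
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

/// No range check or assert is needed here: the type already
/// guarantees `done` is at most 100, so the subtraction cannot underflow.
pub fn remaining(done: Percent) -> Percent {
    Percent(100 - done.0)
}
```

With this shape, the range check happens once at construction, and the remaining tests only need to cover behavior the type system cannot express.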