vlovich123 a day ago

Either your code shouldn’t fail or the apostrophe isn’t a valid case.

In the former case, Hypothesis and other similar frameworks are deterministic: they will replay a failing test on request, or remember failing tests in a file and rerun them in the future to catch regressions.

In the latter case, you just tell the framework not to generate such values, or at least to skip those test cases (not generating them at all is better for testing performance).
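
For example, a rough sketch with Hypothesis (the strategies and test names are just illustrative; the real bodies would call the code under test):

    import string
    from hypothesis import assume, given, strategies as st

    # Don't generate apostrophes at all: restrict the alphabet up front.
    @given(st.text(alphabet=string.ascii_letters + " -"))
    def test_without_apostrophes(s):
        ...  # exercise the code under test here

    # Or generate freely and skip the unwanted cases with assume(); this
    # discards some generated inputs, which is why constraining the
    # strategy is the faster option.
    @given(st.text())
    def test_skipping_apostrophes(s):
        assume("'" not in s)
        ...  # exercise the code under test here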

reverius42 a day ago | parent [-]

I think what they meant is, "won't Hypothesis sometimes fail to generate input with an apostrophe, thus giving you false confidence that your code can handle apostrophes?"

I think the answer to this is that, in practice, it will not fail to generate such input. My understanding is that it's pretty good at mutating input to cover a large amount of surface area with as few examples as possible.

eru a day ago | parent [-]

Hypothesis is pretty good, but it's not magic. There are only so many corner cases it can cover in the 200 (or so) cases per test it runs by default.

But by default you also start with a new random seed every time you run the tests, so you build up more confidence in the older tests and older code over time, even if you haven't done anything specifically to address this problem.

Also, even with Hypothesis you can and should still write specific tests, or even just specific generators, to cover in more detail the classes of corner cases you are worried about.
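
A rough sketch of what that can look like (the strategy and the case count here are made up for illustration):

    from hypothesis import given, settings, strategies as st

    # A generator focused on the corner cases we're worried about,
    # plus a much higher case count than the default for this property.
    awkward_names = st.text(alphabet="'`-. ", min_size=1)

    @settings(max_examples=5000)
    @given(awkward_names)
    def test_handles_awkward_names(name):
        ...  # exercise the code under test here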

thunky a day ago | parent [-]

> But by default you also start with a new random seed every time you run the tests, so you build up more confidence in the older tests and older code

Is it common practice to use the same seed and run a ton of tests until you're satisfied it has tested the code thoroughly?

Because I think I would prefer that. With non-deterministic tests I would always wonder if it's going to fail randomly after the code is already in production.

eru 10 hours ago | parent | next [-]

You can think of property-based tests as defining a vast 'universe' of tests. Like 'for all strings S and T, we should have to_upper(S) + to_upper(T) == to_upper(S+T)' or something like that. That defines an infinite set of individual test cases: each choice of S and T gives you a different test case.

Running all of these tests would take too long. So instead we take a finite sample from our universe and run only those cases.
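
As a sketch, that property written as a Hypothesis test, using Python's str.upper in place of to_upper:

    from hypothesis import given, strategies as st

    # One property defines an infinite universe of cases;
    # Hypothesis samples a finite number of (s, t) pairs from it.
    @given(st.text(), st.text())
    def test_upper_distributes_over_concatenation(s, t):
        assert (s + t).upper() == s.upper() + t.upper()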

> With non-deterministic tests I would always wonder if it's going to fail randomly after the code is already in production.

You could always take the same sample, of course. But that means you only ever explore a very small fraction of that universe. So it's more likely you miss something. Remember: closing your eyes doesn't make the tiger go away.

If there are important cases you want to have checked every time, you can use the @example decorator in Hypothesis, or you can just write a traditional example based test.
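
For instance (round_trip here is a hypothetical stand-in for whatever you're actually testing):

    from hypothesis import example, given, strategies as st

    # The @example inputs are tried on every run, regardless of the random
    # seed, on top of whatever the random sampling produces.
    @given(st.text())
    @example("O'Brien")
    @example("")
    def test_round_trip(s):
        assert round_trip(s) == s  # round_trip is a hypothetical function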

kobebrookskC3 10 hours ago | parent | prev | next [-]

> With non-deterministic tests I would always wonder if it's going to fail randomly after the code is already in production.

If you didn't use property-based testing, what are the odds you would've thought of the case?

bluGill a day ago | parent | prev | next [-]

The more the test runs, the less likely it is that there's an uncovered case left, so your confidence grows. Remember too that anything found before release is something a customer won't find.

vlovich123 14 hours ago | parent | prev | next [-]

You should save the seeds so you can reproduce the issue. But you should let the seed float so that you test as many cases as possible over time.
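
With Hypothesis that can look roughly like this (the seed value is just an example copied from a hypothetical CI log; the pytest plugin's --hypothesis-seed option can pin a whole run the same way):

    from hypothesis import given, seed, strategies as st

    # Temporarily pin the seed from a failing CI run to reproduce it locally;
    # drop the decorator again afterwards so normal runs keep exploring
    # fresh samples.
    @seed(20240117)
    @given(st.text())
    def test_reproduce_ci_failure(s):
        ...  # exercise the code under test here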

nyrikki 21 hours ago | parent | prev [-]

Saving the seed in the build artifacts/logs has saved a lot of time for me, even with tools like Faker.
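
A minimal sketch of that idea with Faker (the logging setup and seed range are arbitrary):

    import logging
    import random
    from faker import Faker

    # Pick a fresh seed each run, but record it in the build log so a
    # failing run can be replayed with exactly the same generated data.
    run_seed = random.randrange(2**32)
    logging.info("Faker seed for this run: %d", run_seed)
    Faker.seed(run_seed)
    fake = Faker()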

eru 10 hours ago | parent [-]

Yes, you should note the seed you use for a run in the logs, but you should use a new seed each run (unless trying to reproduce a known bug) so you cover more of the search space.