lanstin 6 days ago
hmmmm. I do like integration tests, but I often tell people the art of modern software is to make reliable systems on top of unreliable components. And the integration tests should 100% include times when the network flakes out and drops 1/2 of replies and corrupts msgs and the like.
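To make that concrete, here is a rough sketch of a test double that drops and corrupts replies, plus a client loop that survives it. Everything here is made up for illustration: `FlakyTransport`, `reliable_request`, the CRC32 framing, and the scripted fault schedule are assumptions, not any particular library's API.

```python
import itertools
import zlib
from typing import Iterable, Optional


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 so corruption is detectable."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def unframe(message: bytes) -> Optional[bytes]:
    """Return the payload, or None if the checksum does not match."""
    if len(message) < 4:
        return None
    checksum, payload = message[:4], message[4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None
    return payload


class FlakyTransport:
    """Hypothetical fake network that applies a scripted fault to each reply."""

    def __init__(self, faults: Iterable[str]) -> None:
        self._faults = itertools.cycle(faults)  # repeat the schedule forever
        self.requests = 0

    def request(self, payload: bytes) -> Optional[bytes]:
        self.requests += 1
        reply = frame(b"echo:" + payload)
        fault = next(self._faults)
        if fault == "drop":
            return None  # the reply never arrives
        if fault == "corrupt":
            # Flip every bit of the last byte so the checksum no longer matches.
            return reply[:-1] + bytes([reply[-1] ^ 0xFF])
        return reply


def reliable_request(transport: FlakyTransport, payload: bytes,
                     max_attempts: int = 5) -> bytes:
    """Resend until a reply arrives with a valid checksum, or give up."""
    for _ in range(max_attempts):
        raw = transport.request(payload)
        if raw is None:
            continue  # dropped reply: try again
        result = unframe(raw)
        if result is not None:
            return result  # intact reply
    raise TimeoutError("no intact reply received")


# Half of the replies are dropped and one is corrupted before a clean one.
transport = FlakyTransport(["drop", "corrupt", "drop", "ok"])
answer = reliable_request(transport, b"ping")
```

A scripted schedule instead of a random one keeps the test deterministic; a seeded random schedule would be the obvious next step if you want wider coverage.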
sethammons 5 days ago
Minor nit: I wouldn't call those failing-system tests integration tests. Unit tests are for validating error paths, and they can leverage mocks or fakes. Need 3 retries with exponential backoff? Use unit tests and fakes. Integration tests should use real components. Typically, integration tests cover the happy path and unit tests cover the error paths.

Making real components fail and having tests validate failure handling in a more complete environment jumps from integration testing to resilience or chaos testing. The ability to accurately validate backoffs and retries may diminish there, but intermediate or ending state can still be validated with artifact monitoring via sinks.

There is also unit-integration testing, which fakes out as little as possible but still fakes out some edges. The difference is that the failures are introduced via fakes rather than by managing actual system components. If you connect to a real db in unit-integration tests, you typically wouldn't kill the db or use Comcast to slow the network artificially. That would be reserved for the next layer in the test pyramid.
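As a sketch of the "3 retries with exponential backoff" case at the unit level: the fake below fails a fixed number of times, and the sleep function is injected so the test records delays instead of waiting. `FakeService` and `call_with_retries` are hypothetical names for illustration, and the 1-second base delay is an arbitrary assumption.

```python
from typing import Callable, List


class FakeService:
    """Hypothetical fake: fails a set number of times, then succeeds."""

    def __init__(self, failures_before_success: int) -> None:
        self.failures_left = failures_before_success
        self.calls = 0

    def fetch(self) -> str:
        self.calls += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ConnectionError("injected failure")
        return "ok"


def call_with_retries(op: Callable[[], str],
                      sleep: Callable[[float], None],
                      retries: int = 3,
                      base_delay: float = 1.0) -> str:
    """Call op; on ConnectionError retry up to `retries` times,
    doubling the delay before each retry."""
    for attempt in range(retries + 1):
        try:
            return op()
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def test_succeeds_after_three_retries() -> None:
    service = FakeService(failures_before_success=3)
    delays: List[float] = []
    # delays.append stands in for time.sleep, so the test runs instantly.
    result = call_with_retries(service.fetch, sleep=delays.append)
    assert result == "ok"
    assert service.calls == 4            # 1 initial call + 3 retries
    assert delays == [1.0, 2.0, 4.0]     # exponential backoff


def test_gives_up_after_three_retries() -> None:
    service = FakeService(failures_before_success=4)
    delays: List[float] = []
    try:
        call_with_retries(service.fetch, sleep=delays.append)
    except ConnectionError:
        pass
    else:
        raise AssertionError("expected ConnectionError")
    assert service.calls == 4


test_succeeds_after_three_retries()
test_gives_up_after_three_retries()
```

Injecting the sleep function is what lets the test check the exact backoff schedule, which is the precision that gets harder to keep once real components are involved.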
mgh95 6 days ago
> I do like integration tests, but I often tell people the art of modern software is to make reliable systems on top of unreliable components.

There is a dramatic difference between unreliable in the sense of S3 or other services and unreliable as in "we get different sets of logical outputs when we provide the same input to an LLM". In the first case, you can prepare for the logical outcomes: network failures, durability loss, etc. In the latter, unless you know the total space of outputs for an LLM, you cannot prepare. In the operational sense, LLMs are not a system component, they are a system builder. And a rather poor one, at that.

> And the integration tests should 100% include times when the network flakes out and drops 1/2 of replies and corrupts msgs and the like.

Yeah, it's not that hard to include that in modern testing.