andsoitis 6 days ago

> If I … generate fairly exhaustive unit tests of a trivial function

… then you are not a senior software engineer

triyambakam 6 days ago | parent [-]

Neither are you if that's your understanding of a senior engineer

mgh95 6 days ago | parent | next [-]

I think the parent commenter's point was that it is nearly trivial to generate variations on unit tests in most (if not all) unit test frameworks. For example:

Java: https://docs.parasoft.com/display/JTEST20232/Creating+a+Para...

C# (nunit, but xunit has this too): https://docs.nunit.org/articles/nunit/technical-notes/usage/...

Python: https://docs.pytest.org/en/stable/example/parametrize.html

cpp: https://google.github.io/googletest/advanced.html

Believing that the ability of LLMs to generate parameterizations is helpful to a degree that cannot be trivially achieved in most mainstream programming languages/test frameworks may be an indicator that an individual has not achieved substantial depth of experience.
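
For example, a minimal pytest sketch of the kind of parameterization the links above point at (parse_price is made up for illustration):

    # pytest: one test body, many cases
    import pytest

    def parse_price(text: str) -> float:
        """Toy function under test: parse a price string like '$1,234.50'."""
        return float(text.strip().lstrip("$").replace(",", ""))

    @pytest.mark.parametrize(
        "text, expected",
        [
            ("$0.01", 0.01),
            ("1,234.50", 1234.50),
            ("  $99  ", 99.0),
        ],
    )
    def test_parse_price(text, expected):
        assert parse_price(text) == expected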

com2kid 6 days ago | parent | next [-]

The useful part is generating the mocks. The various auto-mocking frameworks are so hit or miss that I end up having to write mocks by hand, which is time-consuming and boring. LLMs help dramatically and save literally hours of tedious, error-prone work.
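
To give a sense of the boilerplate in question, here is a hand-rolled mock for a made-up payment client (all names are hypothetical); multiply this by every external dependency:

    from unittest.mock import MagicMock

    def make_payment_client_mock():
        # Every method and return shape spelled out by hand.
        client = MagicMock(name="PaymentClient")
        client.charge.return_value = {"status": "ok", "id": "ch_123"}
        client.refund.return_value = {"status": "ok"}
        client.get_balance.return_value = 0
        return client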

mgh95 6 days ago | parent [-]

Why mock at all? Spend the time making integration tests fast. There is little reason a database, queue, etc. can't be set up on a per-test-group basis and made fast. Reliable software is built upon (mostly) reliable foundations.
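
One way to keep that fast, sketched with pytest fixtures (this assumes the testcontainers, SQLAlchemy, and psycopg2 packages; adjust to your stack):

    import pytest
    from sqlalchemy import create_engine
    from testcontainers.postgres import PostgresContainer

    @pytest.fixture(scope="session")
    def pg_url():
        # One real Postgres for the whole test run, torn down at the end.
        with PostgresContainer("postgres:16") as pg:
            yield pg.get_connection_url()

    @pytest.fixture()
    def db(pg_url):
        # Per-test connection; roll back instead of restarting the database.
        engine = create_engine(pg_url)
        with engine.connect() as conn:
            tx = conn.begin()
            yield conn
            tx.rollback()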

com2kid 5 days ago | parent | next [-]

Because if part of my tests involve calling an OpenAI endpoint, I don't want to pay .01 cent every time I run my tests.

Because my tests shouldn't fail when a 3rd party dependency is down.

Because I want to be able to fake failure conditions from my dependencies.

Because unit tests have value and mocks make unit tests fast and useful.

Even my integration tests have some mocks in them, especially for any services that have usage-based pricing.

But in general I'm going to mock out things that I want to simulate failure states for, and since I'm paranoid, I generally want to simulate failure states for everything.
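
As a rough sketch of what that looks like (the summarize wrapper is hypothetical; the call shape mirrors an OpenAI-style client, but it could be any dependency):

    from unittest.mock import MagicMock

    def summarize(client, text, retries=1):
        """Hypothetical code under test: one retry on a transient error."""
        for attempt in range(retries + 1):
            try:
                resp = client.chat.completions.create(
                    model="gpt-4o-mini",
                    messages=[{"role": "user", "content": text}],
                )
                return resp.choices[0].message.content
            except TimeoutError:
                if attempt == retries:
                    raise

    def test_summarize_survives_transient_outage():
        client = MagicMock()
        # First call fails like an outage, second call succeeds.
        client.chat.completions.create.side_effect = [
            TimeoutError("upstream timed out"),
            MagicMock(choices=[MagicMock(message=MagicMock(content="ok"))]),
        ]
        assert summarize(client, "some text") == "ok"
        assert client.chat.completions.create.call_count == 2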

End to End tests are where everything is real.

mgh95 5 days ago | parent [-]

> Because if part of my tests involve calling an OpenAI endpoint, I don't want to pay .01 cent every time I run my tests.

This is a good time to think to yourself: do I need these dependencies? Can I replace them with something that doesn't expose vendor risk?

These are very real questions that large enterprises grapple with. In general (but not always), orgs that view technology as the product (or the product under test) will view the costs of either testing or bringing technology in-house as acceptable, and cost centers will not.

> But in general I'm going to mock out things that I want to simulate failure states for, and since I'm paranoid, I generally want to simulate failure states for everything.

This can be achieved with an instrumented version of the service itself.
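
For instance, run the real dependency with fault injection switched on, or front it with a thin fault-injecting wrapper so tests exercise real code paths while failures stay scriptable. A hypothetical client-side sketch of that idea:

    import random

    class FaultInjectingClient:
        """Wraps a real client and injects failures at a configurable rate."""

        def __init__(self, real_client, failure_rate=0.0, rng=None):
            self._client = real_client
            self._failure_rate = failure_rate
            self._rng = rng or random.Random(0)  # seeded for reproducible tests

        def __getattr__(self, name):
            real_attr = getattr(self._client, name)
            if not callable(real_attr):
                return real_attr

            def wrapper(*args, **kwargs):
                if self._rng.random() < self._failure_rate:
                    raise ConnectionError(f"injected fault in {name}")
                return real_attr(*args, **kwargs)

            return wrapper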

com2kid 5 days ago | parent [-]

> This is a good time to think to yourself: do I need these dependencies? Can I replace them with something that doesn't expose vendor risk?

Given that my current projects all revolve solely around using LLMs to do things, yes I need them.

The entire purpose of the code is to call into LLMs and do something useful with the output. That said, I need to gracefully handle failures, OpenAI giving me back trash results (forgetting fields even though they are marked required in the schema, etc.), or just the occasional service outage.
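
A rough sketch of the kind of guard that implies (field names are made up; the caller can retry or fall back when it raises):

    import json

    REQUIRED_FIELDS = {"title", "summary", "tags"}

    class BadLLMOutput(Exception):
        pass

    def parse_llm_json(raw: str) -> dict:
        """Validate a structured-output response before trusting it."""
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as e:
            raise BadLLMOutput(f"not valid JSON: {e}") from e
        missing = REQUIRED_FIELDS - data.keys()
        if missing:
            # "Required" in the schema doesn't guarantee the model complied.
            raise BadLLMOutput(f"missing required fields: {sorted(missing)}")
        return data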

Also integration tests only make sense once I have an entire system to integrate. Unit tests let me know that the file I just wrote works.

cornel_io 6 days ago | parent | prev | next [-]

There are thousands of projects out there that use mocks for various reasons, some good, some bad, some ugly. But it doesn't matter: most engineers on those projects do not have the option to go another direction, they have to push forward.

mgh95 6 days ago | parent [-]

In this context, why not refactor and have your LLM of choice write and optimize the integration tests for you? If the crux of the argument for LLMs is that they can produce software of sufficient quality at dramatically reduced cost, why not have them rewrite the tests?

lanstin 6 days ago | parent | prev [-]

hmmmm. I do like integration tests, but I often tell people the art of modern software is to make reliable systems on top of unreliable components. And the integration tests should 100% include times when the network flakes out and drops 1/2 of replies and corrupts msgs and the like.

sethammons 5 days ago | parent | next [-]

Minor nit: I wouldn't call those failing-systems tests integration tests.

Unit tests are for validation of error paths. Unit tests can leverage mocks or fakes. Need 3 retries with exponential backoff? Use unit tests and fakes. Integration tests should use real components. Typically, integration tests are happy path and unit tests are error paths.
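
A minimal sketch of that retries-with-backoff case using a fake (names are made up; the sleep is injected so the test runs instantly):

    def fetch_with_retry(fetch, retries=3, base_delay=1.0, sleep=None):
        sleep = sleep or (lambda s: None)
        for attempt in range(retries + 1):
            try:
                return fetch()
            except ConnectionError:
                if attempt == retries:
                    raise
                sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s

    def test_retries_with_exponential_backoff():
        calls = {"n": 0}

        def flaky_fetch():
            calls["n"] += 1
            if calls["n"] < 4:
                raise ConnectionError("flaky")
            return "ok"

        delays = []
        assert fetch_with_retry(flaky_fetch, sleep=delays.append) == "ok"
        assert delays == [1.0, 2.0, 4.0]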

Making real components fail and having tests validate failure handling in a more complete environment jumps from integration testing to resilience or chaos testing. Being able to accurately validate backoffs and retries may diminish, but validating intermediate or ending state can be done with artifact monitoring via sinks.

There is unit-integration testing which fakes out as little as possible but still fakes out some edges. The difference being that the failures are introduced via fake vs managing actual system components. If you connect to a real db on unit-integration tests, you typically wouldn't kill the db or use Comcast to slow the network artificially. That would be reserved for the next layer in the test pyramid.

mgh95 6 days ago | parent | prev [-]

> I do like integration tests, but I often tell people the art of modern software is to make reliable systems on top of unreliable components.

There is a dramatic difference between unreliable in the sense of S3 or other services and unreliable as in "we get different sets of logical outputs when we provide the same input to an LLM". In the former, you can prepare for the logical outcomes -- network failures, durability loss, etc. In the latter, unless you know the total space of outputs for an LLM, you cannot prepare. In the operational sense, LLMs are not a system component, they are a system builder. And a rather poor one, at that.

> And the integration tests should 100% include times when the network flakes out and drops 1/2 of replies and corrupts msgs and the like.

Yeah, it's not that hard to include that in modern testing.

imtringued 5 days ago | parent | prev | next [-]

99% of the work in testing is coming up with test scenarios and test cases. 95% of the code is just setting up input and output data, 4% is calling the code you want to test, and the final assert is often just a single line of code.

I'm not sure what depth of experience has to do with any of this, since it is busywork that costs a lot of time. A form with 120 fields is a form with 120 fields. There is no way around writing the several dozen test cases you're going to need, each filling out almost all of the fields, even the ones that are not relevant to the test itself; otherwise you're not really testing your application.
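
To make that setup burden concrete, a sketch with made-up field names; even with a factory helper, every test still leans on the fully filled-out form:

    def valid_order_form(**overrides) -> dict:
        """Factory: one fully valid form, with per-test overrides."""
        form = {
            "first_name": "Ada",
            "last_name": "Lovelace",
            "email": "ada@example.com",
            "country": "GB",
            "quantity": 1,
            # ... plus the other ~115 fields in the real thing
        }
        form.update(overrides)
        return form

    def validate_order(form: dict) -> list[str]:
        """Stand-in for the real validator under test."""
        return ["quantity must be positive"] if form["quantity"] <= 0 else []

    def test_rejects_zero_quantity():
        assert validate_order(valid_order_form(quantity=0)) == ["quantity must be positive"]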

VectorLock 6 days ago | parent | prev [-]

Parameterized tests are good, but I think he might be talking about exercising all the corner cases in the logic of your function, which to my knowledge almost no language can auto-generate tests for, but which LLMs can sorta-ish figure out.

mgh95 6 days ago | parent [-]

We are talking about basic computing for CRUD apps. When you start needing to rely upon "sorta-ish" to describe the efficacy of a tool for such a straightforward and deterministic use case, it may be an indicator you need to rethink your approach.

VectorLock 6 days ago | parent [-]

If you want to discount a tool that may save you an immense amount of time because you might have to help it along the last few feet, that's up to you.

If you can share a tool that can analyze a function and create a test for all corner cases in a popular language, I'm sure some people would be interested in that.

mgh95 6 days ago | parent [-]

You should look up IntelliTest and the ReSharper test generator. Products exist for this.

imtringued 5 days ago | parent [-]

Maybe you should have brought that up earlier instead of acting smug and burying the lede? It's also pretty telling that you didn't elaborate this further and kept your comment short.

What you should have said is that some parameterized test generators do automated white-box testing, where they look at your code like a fuzzer does and try to find the test cases automatically. Your first link is literally just setting up an array with test cases, which basically means you'd have to use an LLM to quickly produce the test cases anyway, which makes parameterized testing sound exceedingly pathetic.

https://learn.microsoft.com/en-us/visualstudio/test/generate...

>IntelliTest explores your .NET code to generate test data and a suite of unit tests. For every statement in the code, a test input is generated that will execute that statement. A case analysis is performed for every conditional branch in the code. For example, if statements, assertions, and all operations that can throw exceptions are analyzed. This analysis is used to generate test data for a parameterized unit test for each of your methods, creating unit tests with high code coverage. Think of it as smart fuzz testing that trims down the inputs and test cases to what executes all your logic branches and checks for exceptions.

mgh95 5 days ago | parent [-]

> Maybe you should have brought that up earlier instead of acting smug and burying the lede? It's also pretty telling that you didn't elaborate this further and kept your comment short.

I thought people were generally competent within the areas they discuss and are aware of the tooling within their preferred ecosystem. I apologize if that is not the case.

VectorLock 5 days ago | parent [-]

ReSharper, huh...

"With AI Assistant, you can generate unit tests for C# methods."

https://www.jetbrains.com/help/resharper/Generate_tests.html

goosejuice 6 days ago | parent | prev [-]

We're not a licensed profession with universally defined roles. It's whatever the speaker wants it to be given how wildly it varies.