Remix.run Logo
materielle 5 days ago

I think a problem with AI productivity metrics is that a lot of the productivity is made up.

Most enterprise code involves layers of interfaces. So implementing any feature requires updating 5 layers and mocking + unit testing at each layer.

When people say “AI helps me generate tests”, I find that this is what they are usually referring to. Generating hundreds of lines of mock and fake data boilerplate in a few minutes, that would otherwise take an entire day to do manually.

Of course, the AI didn’t make them more productive. The entire point of automated testing is to ensure software correctness without having to test everything manually each time.

The style of unit testing above is basically pointless. Because it doesn’t actually accomplish the goal. All the unit tests could pass and the only thing you’ve tested is that your canned mock responses and asserts are in-sync in the unit testing file.

A problem with how LLMs are used is that they help churn through useless bureaucratic BS faster. But the problem is that there’s no ceiling to bureaucracy. I have strong faith that organizations can generate pointless tasks faster than LLMs can automate them away.

Of course, this isn’t a problem with LLMs themselves, but rather an organization context in which I see them frequently being used.

refulgentis 5 days ago | parent [-]

I think it's appropriate to be skeptical with new tools, and being appropriately, respectfully, prosocially, skeptical, point out failure modes. Kudos.

Something that crosses my mind is if AI generating tests necessitates that it only generates tests with fakes and stubs that exercise no actual logic, the expertise required to notice that, and if it is correctable.

Yesterday, I was working on some OAuth flow stuff. Without replayed responses, I'm not quite sure how I'd test it without writing my own server, and I'm not sure how I'd develop the expertise to do that without, effectively, just returning the responses I expected.

It reminds me that if I eschewed tests with fakes and stubs as untrustworthy in toto, I'd be throwing the baby with the bathwater.