827a 7 days ago

So, to give a concrete example that helped me recently: We have a frontend web application that was having some issues with a specific feature. This feature makes a complex chain of maybe a dozen API requests when a resource is created, conditional on certain things, and a similar process happens when editing this resource. But there was a difference in behavior between the create and edit routes, where a user expected the behavior to be the same.

This is crusty, horrible, old, complex code. Nothing is in one place. The entire editing experience was copy-pasted from the create resource experience (not even reusable components; literally copy-pasted). I'm the principal on the team and understand this code better than anyone, and even my understanding was basically just "yeah, I think these ten or so things should happen in both cases, because that's how the last guy explained it to me and it vibes with how I've seen it behave when I use it".

I asked Cursor (Opus Max) something along the lines of: Compare and contrast how the application behaves when creating this resource versus updating it. Focus on the API calls it's making. It responded in short order with a great summary, and without really being prompted to generate this insight, it ended the message by saying: It looks like editing this resource doesn't make the API call to send a notification to affected users, even though the text on the page suggests that it should, and it does when creating the resource.
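To make the shape of that comparison concrete, here's a rough sketch, assuming you've already listed the API calls each flow makes. The endpoint names and the `missing_calls` helper are made up for illustration; they aren't from our codebase, and this isn't a claim about how Cursor arrived at its answer.

```python
def missing_calls(create_calls, edit_calls):
    """Return the calls made on create but not on edit, in create order."""
    edit_set = set(edit_calls)
    return [call for call in create_calls if call not in edit_set]


# Hypothetical endpoint names, for illustration only.
create_flow = [
    "POST /resources",
    "POST /resources/{id}/permissions",
    "POST /notifications",
]
edit_flow = [
    "PUT /resources/{id}",
    "POST /resources/{id}/permissions",
]

only_on_create = missing_calls(create_flow, edit_flow)
print(only_on_create)
```

Some entries in the result are expected (create and edit use different write endpoints); the interesting ones are side effects like the notification call, and deciding whether those gaps are bugs is still a human question.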

I suspect I could have just said "fix it" and it could have handled it. But, as with anything, as you say: It's more complicated than that. Because while we imply we want the app to do this, it's a human's job (not the AI's) to read into what's happening here: The user was confused because they expected the app to do this, but do they actually want the app to do this? Or were they just confused because text on the page (which was probably just copy-pasted from the create resource flow) implied that it would?

So instead I say: Summarize this finding into a couple of sentences I can send to the affected customer to get his take on it. Well, that was bread and butter even for AIs three years ago, so off it goes. The answer came back: the current behavior is correct; we just need to update the language to manage expectations better. AI could also do that, but it's faster for me to just click the hyperlink in Claude's output, which jumps right to the file, and make the update myself.

Opus Max is expensive. According to Cursor's dashboard, this back-and-forth cost ~$1.50. But let's say it would have taken me just an hour to arrive at the same insight it did (in a fifth of the time): that's easily over $100 of my time. That's a net win for the business, and it's a net win for me because I now understand the code better than I did before, and I was able to focus my time on the components of the problem that humans are good at.