jitl a day ago

The worst is when I have to baby-sit someone else’s AI. It’s so frustrating to get tagged to review a PR, open it up, and find 400 lines of obviously incorrect slop. Some try to excuse it by marking the PR [vibe], but like what the hell, at least review your own goddamn AI code before asking me to look at it. Usually I want to insta-reject just for the disrespect to my time.

stillsut 16 hours ago | parent | next [-]

I've got some receipts for what I think is good vibe coding...

I save every prompt and associated ai-generated diff in a markdown file for a steganography package I'm working on.

Check out this document: https://github.com/sutt/innocuous/blob/master/docs/dev-summa...

In particular, under v0.1.0 see the `decode-branch.md` prompt and its associated generated diff, which implements memoization for backtracking while performing decoding.

It's a tight PR that fits the existing codebase and works well. You just need a motivating example you can reproduce, which helps you quickly determine whether a proposed solution is working. I usually generate 2-3 solutions initially and then filter them quickly with a test case. And as you can see from the prompt, it's far from well formatted or comprehensive, just a "slap dash" listing of potentially relevant information, similar to what would be discussed at an informal whiteboard session.
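The generate-then-filter workflow described above can be sketched in a few lines. This is an illustrative sketch, not code from the innocuous repo; the names `pick_solution`, `candidate_patches`, and `reproduces_fix` are all made up here:

```python
def pick_solution(candidate_patches, reproduces_fix):
    """Generate-then-filter: given 2-3 AI-generated candidate diffs,
    return the label of the first one that makes the motivating
    example pass.

    candidate_patches maps a label to a patch (contents unused here);
    reproduces_fix(label) runs the reproduction test case against that
    candidate and returns True on success. Both are hypothetical hooks
    for whatever your actual build/test harness looks like.
    """
    for label in candidate_patches:
        if reproduces_fix(label):
            return label
    return None  # none of the candidates fixed the motivating example
```

In practice `reproduces_fix` would check out the candidate branch and run the reproducing test; the point is just that a single cheap, reproducible example lets you discard bad generations before any human review happens.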

dawnerd a day ago | parent | prev | next [-]

I insta-reject. It’s ridiculous how much worse it’s made some devs. Are they even testing their work before tossing it over the wall?

scrame a day ago | parent [-]

were they before?

dawnerd a day ago | parent | next [-]

Yep. Claude/copilot definitely has made people lazy.

mrheosuper a day ago | parent | prev [-]

I would have a talk with whoever creates a PR without testing anything.

whynotminot a day ago | parent | prev | next [-]

Hold up, people are starting to mark PRs with [vibe] as in “I don’t stand behind this, good luck.” ??

I do not care if engineers on my team are using AI. In fact, I think they should be. This is the new world and we need to get used to it.

But it’s still your work, your responsibility. You still have to own it. No cop outs.

jitl 42 minutes ago | parent [-]

I think the [vibe] is more like, “Watch out, this PR is more sus than usual,” but usually it’s too sus.

dimator a day ago | parent | prev | next [-]

Recently in that exact situation.

Junior dev vibe codes some slop to solve a problem that was barely a problem. The "solution" to that not-a-problem was 2500 lines of slop. Unused variables, spaghetti logic, unit tests that were clearly write-once, read-never.

The slop at that point is the meta-problem; the real problem becomes me trying to convince them (through multiple rounds of code review) that this was not maintainable code, and not a tenable development model.

All those rounds of review take TIME and mental effort. At some point, when the code review takes more effort than the code owner contributed, the value proposition is shot.

Generate the code however you want (llm, bit twiddling, whatever), but the effort and care must be there. If you can't use an llm skillfully and vouch for the output, it's gonna be a no from me dawg

lawn a day ago | parent | prev | next [-]

We need to normalize rejecting crap PRs. If we don't, things will only get worse.

virgil_disgr4ce a day ago | parent | prev | next [-]

lol if I saw a teammate create a PR with "[vibe]" in the title I would: 1) Gape in disbelief 2) Shout obscenities as loud as I can 3) Take a deep breath 4) Reject the PR and seriously consider firing them.

Yeah "consider firing" is a bit of an extreme kneejerk reaction, but I just feel like we HAVE to create consequences for this nonsense.

xdennis a day ago | parent | prev | next [-]

> and find 400 lines of obviously incorrect slop

I call that a good day. I've seen people push 2000 line PRs. The worst was 5000 lines. FML.

jitl 18 hours ago | parent [-]

I don’t mind a 2000 line PR if it’s quality code written by a human.

CuriouslyC a day ago | parent | prev [-]

Have your own agent do first pass code reviews, they catch that stuff every time.

goldenCeasar a day ago | parent | next [-]

And then the PR owner's agent can fix it and then after some number of iterations you get a PR for a new and mysterious system.

CuriouslyC a day ago | parent | next [-]

It's only new and mysterious if you don't have a tight spec and guardrails.

angxiaobi a day ago | parent [-]

There seems to be a gap between the theory you're advocating (which I actually agree with) and the practical execution in your own public projects, which you talk about heavily.

I haven't been able to get any of your recently published projects (Scribe, volknut, a few others) running successfully on Linux, whether by following your documentation, using the tagged releases, or reverse engineering the actual functionality and CLI arguments from your provided source code. It's been a waste of my time.

It's difficult to believe you when your own documentation is entirely wrong or 404'd.

I'd genuinely love to see your projects in action, since the concepts sound promising, but you need to back it up with shipping properly.

CuriouslyC 17 hours ago | parent [-]

There's a reason I haven't dropped release pages yet. These projects are under heavy development, and I push to GitHub mostly to checkpoint rather than to release. I'm sorry if you feel like I've misled you. I've tried to be clear that what I'm sharing now is to give visibility into what I'm doing, and that it's not ready yet. I'm committed to delivering great software, and when I tell you it's ready, you can rest assured that it will work.

Understand that I'm one man working on 20 projects simultaneously, with 5+ under active development at any one moment, so release stabilization and cadence will take a little bit to lock in.

theshrike79 a day ago | parent | prev [-]

AI PR tarpit, anyone? :)

hamdingers a day ago | parent | prev | next [-]

> they catch that stuff every time

This is why AI code review continues to be mostly useless in my experience. If you instruct an LLM to make PR comments it will come up with some, every time, even when none are warranted.

CuriouslyC a day ago | parent [-]

So give them a schema, have them assign each comment a relevance score, then filter out anything below a threshold. Works better than trying to get them not to reply.
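The score-and-filter approach can be sketched as below. This is a hypothetical example, assuming you've already prompted the reviewer model to emit JSON in a schema you chose; the field names (`comments`, `relevance`, etc.) and the threshold are invented for illustration:

```python
import json

RELEVANCE_THRESHOLD = 0.7  # illustrative cutoff; tune per repo


def filter_review_comments(raw_model_output, threshold=RELEVANCE_THRESHOLD):
    """Parse the reviewer model's structured output and drop
    low-relevance comments instead of asking the model to stay silent.

    Assumes the model was instructed to return JSON shaped like:
      {"comments": [{"path": ..., "line": ..., "body": ...,
                     "relevance": <0.0-1.0>}]}
    That schema is made up for this sketch.
    """
    comments = json.loads(raw_model_output)["comments"]
    return [c for c in comments if c["relevance"] >= threshold]
```

The design point is that LLMs will always produce *something* when asked for review comments, so it's more reliable to let them over-generate with self-reported scores and filter deterministically on your side than to prompt them into abstaining.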

hamdingers a day ago | parent [-]

Seems like sound advice, unfortunate that the developers of the review tools I've had brushes with did not take it.

jitl a day ago | parent | prev [-]

I try to avoid participating in the bullshit economy as much as possible