OtherShrezzing 3 days ago

I’m unclear how you’re hitting $1k/mo in personal usage. GitHub Copilot charges $0.04 per task with a frontier model in agent mode - and it’s considered expensive. That’s roughly 830 coding tasks per day for $1k/mo, or around one per minute across a 16-hour day.

I’m not sure a single human could audit & review the output of $1k/mo in tokens from frontier models at the current market rate. I’m not sure they could even audit half that.
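
A rough back-of-envelope check of that arithmetic (a sketch only, assuming a 30-day month and the 16-hour day from the comment above):

    # Sanity check of the per-task math (assumptions: 30-day month, 16-hour day).
    cost_per_task = 0.04      # USD per agent-mode task, as quoted above
    monthly_spend = 1000.00   # USD per month

    tasks_per_month = monthly_spend / cost_per_task   # 25,000
    tasks_per_day = tasks_per_month / 30              # ~833
    tasks_per_minute = tasks_per_day / (16 * 60)      # ~0.87

    print(f"{tasks_per_month:.0f}/month, {tasks_per_day:.0f}/day, {tasks_per_minute:.2f}/minute")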

Wowfunhappy 2 days ago | parent | next [-]

You don't audit and review all $1k worth of tokens!

The AI might write ten versions. Versions 1-9 don't compile, but it automatically makes changes and gets further each time. Version 10 actually builds and seems to pass your test suite. That is the version you review!

—and you might not review the whole thing! 20 lines in, you realize the AI has taken a stupid approach that will obviously break, so you stop reading and tell the AI it messed up. This triggers another ~5 rounds of producing code before something compiles, which you can then review, hopefully in full this time if it did a good job.
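
Roughly, the loop being described looks like the sketch below. It is purely illustrative: generate_patch, apply_patch, build, and run_tests are hypothetical stand-ins, not any real tool's API.

    # Illustrative retry loop: keep iterating until the code builds and the tests
    # pass, and only hand that final version to a human for review.
    MAX_ATTEMPTS = 10

    def agent_iterate(task, generate_patch, apply_patch, build, run_tests):
        feedback = None
        for _ in range(MAX_ATTEMPTS):
            patch = generate_patch(task, feedback)   # model proposes a new version
            apply_patch(patch)
            ok, build_log = build()
            if not ok:
                feedback = build_log                 # compiler errors feed the next attempt
                continue
            ok, test_log = run_tests()
            if not ok:
                feedback = test_log                  # failing tests feed the next attempt
                continue
            return patch                             # this is the version a human reviews
        raise RuntimeError(f"no passing version after {MAX_ATTEMPTS} attempts")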

beefnugs 2 days ago | parent [-]

Thank you for this honesty. I mean, if I try something 5 times and it's a total failure every time, I would never in a million years think to try it another 5. (Maybe I will give it another try when home hardware capable of running a coding model is anywhere near affordable.)

I guess I see why the salesmen don't mention this... but it seems really important for everyone to know?

Wowfunhappy a day ago | parent [-]

The scenario I described is still what I would consider "two shot" though—between attempts 1–10 I did not need to intervene.

But it's true that I'm always surprised when people talk about using Claude on the beach or whatever. I love Claude Code, but I have to test and test and test again for each incremental feature.

elcritch 2 days ago | parent | prev | next [-]

I can easily hit the daily usage limits on Claude Code or OpenAI Codex by asking for more complex tasks to be done, which often take relatively little time to review.

A lot of tokens get used up quickly as those tools query the codebase and documentation, try changes, run commands, re-run commands to call tools correctly, fix errors, etc.

F7F7F7 2 days ago | parent | prev | next [-]

Audit and review? Sounds like a vibe killer.

7thpower 2 days ago | parent | prev [-]

Do people actually use GitHub Copilot?

At any rate, I could easily go through that much with Opus because it's expensive, and often I'm loading the context window to do discovery; this may include not only parts of a codebase but also large schemas, along with samples of inputs and outputs.

When I’m done with that, I spend a bunch of turns defining exactly what I want.

Now that MCP tools work well, there is also a ton of back and forth that happens there (this is time efficient, not cost efficient). It all adds up.

I have Claude Code on the Max plan, which helps, but one of the reasons it's so cheap is all the truncation it does, so I use a different tool that lets me feed in exactly the parts of a codebase I want, which can get incredibly expensive.

This is all before the expenses associated with testing and evals.

I'm currently consulting. A lot of the code is ultimately written by me, and everything gets validated by me (if the LLM tells me how something works, I don't just take its word for it; I go look myself), but a lot of the work happens before any code is actually written.

My ability (usually clarity of mind and patience) to review an LLM's output is still a gating factor, but the costs can add up quickly.

adithyassekhar 2 days ago | parent | next [-]

> Do people actually use GitHub Copilot?

I use it all the time. I'm not into Claude Code-style agentic coding. More of the "change the relevant lines and let me review" type.

I work in web dev. In VS Code I can easily select a line of code that's wrong, which I know how to fix but am honestly too tired to type out, press Ctrl+I, and tell it to fix it. I know the fix, so I can easily review it.

GPT-4.1 agent mode is unlimited in the Pro tier. It's half the cost of Claude, Gemini, and ChatGPT. The VS Code integration alone is worth it.

Now, that is not the kind of "AI does everything" coding these companies are marketing and want you to do; I treat it almost like an assistant. For me, it's perfect.

7thpower 2 days ago | parent [-]

That's good to know; I haven't tried it in a few years.

ModernMech 2 days ago | parent | prev [-]

I trust Copilot way more than any agentic coder. The first time I used Claude, it went through my working codebase and tried to tell me it was broken in all these places it wasn't. It suggested all these wrong changes that, if applied, would have ruined my life. Given that first impression, it's going to take a lot to convince me agentic coding is a worthwhile tool. So I prefer Copilot because it's a much more conservative approach to adding AI to my workflow.