BatteryMountain a day ago

Some fuel for the fire: the last two months mine has become way better, one-shotting tasks frequently. I do spend a lot of time in planning mode to flesh out proper plans. I don't know what others are doing that they are so sceptical, but from my perspective, once I figured it out, it really is a massive productivity boost with minimal quality issues. I work on a brownfield project with about 1M LoC, fairly messy, mostly C# (so strong typing & strict compiler is a massive boon).

My work flow: Planning mode (iterations), execute plan, audit changes & prove to me the code is correct, debug runs + log ingestion to further prove it, human test, human review, commit, deploy. Iterate a couple of times if needed. I typically do around three of these in parallel to not overload my brain. I have done 6 in the past but then it hits me really hard (context switch whiplash) and I start making mistakes and missing things the tool does wrong.

To the ones saying it is not working well for them, why don't you show and tell? I cannot believe our experiences are so fundamentally different, I don't have some secret sauce but it did take a couple of months to figure out how to best manipulate the tool to get what I want out of it. Maybe these people just need to open their minds and let go of the arrogance & resistance to new tools.

wtetzner 19 hours ago | parent | next [-]

> My work flow: Planning mode (iterations), execute plan, audit changes & prove to me the code is correct, debug runs + log ingestion to further prove it, human test, human review, commit, deploy. Iterate a couple of times if needed.

I'm genuinely curious if this is actually more productive than a non-AI workflow, or if it just feels more productive because you're not writing the code.

steveklabnik 10 hours ago | parent | next [-]

One reason why it can be more productive is that it can be asynchronous. I can have Claude churning away on something while I do something else on a different branch. Even if the AI takes as long as a human to do the task, we gain a parallelism that's not possible with just one person.

nosianu 16 hours ago | parent | prev [-]

Here is a short example from my daily life: a D96A INVOIC EDI message containing multiple invoices, transformed into an Excel file.

I used the ChatGPT web interface for this one-off task.

Input: A D96A INVOIC text message. Here is what those look like, a short example, the one I had was much larger with multiple invoices and tens of thousands of items: https://developer.kramp.com/edi-edifact-d96a-invoic

The result is not code but a transformed file. This exact scenario can easily be turned into code, though, by changing the request from "do this" to "provide a [Python|whatever] script to do this". Internally the AI produces code, runs it, and gives you the result. You actually make it do less work if you just ask for the script and tell it not to run it.

What follows is everything I said. I had to ask for some corrections because it made a few mistakes interpreting the codes.

> (message uploaded as file)

> Analyze this D.96A message

> This message contains more than one invoice, you only parsed the first one

(it finds all 27 now)

> The invoice amount is in segment "MOA+77". See https://www.publikationen.gs1-germany.de/Complete/ae_schuhe/... for a list of MOA codes (German - this is a German company invoice).

> Invoice 19 is a "credit note", code BGM+381. See https://www.gs1.org/sites/default/files/docs/eancom/ean02s4/... for a list of BGM codes, column "Description" in the row under "C002 DOCUMENT/MESSAGE NAME"

> Generate Excel report

> No. Go back and generate a detailed Excel report with all details including the line items, with each invoice in a separate sheet.

> Create a variant: All 27 invoices in one sheet, with an additional column for the invoice or credit note number

> Add a second sheet with a table with summary data for each invoice, including all MOA codes for each invoice as a separate column

The result was an Excel file with one invoice per worksheet, and the metadata in an additional sheet.
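For context, here is a minimal sketch of the kind of parsing the AI generates internally for this, assuming the usual EDIFACT conventions (segments terminated by `'`, elements separated by `+`, composites by `:`). The qualifiers come from the thread above (BGM+380 invoice / BGM+381 credit note, MOA+77 invoice amount); the sample message and field layout are simplified, not a full D96A parser:

```python
# Minimal EDIFACT segment scan (sketch, not a complete D96A INVOIC parser).
SAMPLE = (
    "UNH+1+INVOIC:D:96A:UN'"
    "BGM+380+INV-001'"    # 380 = commercial invoice
    "MOA+77:1500.00'"     # 77 = invoice amount (per the GS1 MOA code list)
    "UNH+2+INVOIC:D:96A:UN'"
    "BGM+381+CRN-002'"    # 381 = credit note
    "MOA+77:-200.00'"
)

def parse_documents(message: str):
    """Group segments into documents, keyed off each BGM segment."""
    docs = []
    for seg in filter(None, (s.strip() for s in message.split("'"))):
        parts = seg.split("+")
        if parts[0] == "BGM":
            docs.append({"type": parts[1], "number": parts[2], "amount": None})
        elif parts[0] == "MOA" and docs:
            qualifier, value = parts[1].split(":")
            if qualifier == "77":  # invoice amount
                docs[-1]["amount"] = float(value)
    return docs

print(parse_documents(SAMPLE))
```

A real message has many more segment types (NAD, LIN, QTY, ...), which is exactly the research work being delegated to the AI here.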

Similarly, by doing just what I wrote above, but telling the AI at the start not to do anything itself and instead give me a Python script, plus similar instructions, I got a several-hundred-line Python script that processed my collected DESADV EDI messages in XML format ("Process a folder of DESADV XML files and generate an Excel report.")

If I had had to actually write that code myself, it would have taken me all day and maybe more, mostly because I would have had to research a lot of things first. I'm not exactly parsing EDI messages in various formats every day, after all. For this I did write a pretty lengthy and very detailed request, though: 44 long lines of text, detailing exactly which items at which path I wanted from the XML, and how to name and type them in the resulting Excel file.

ChatGPT Query: https://pastebin.com/1uyzgicx

Result (Python script): https://pastebin.com/rTNJ1p0c
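The shape of such a folder-processing script can be sketched in a few lines. Everything below is hypothetical: the element names (`DespatchNumber`, `LineItem`, `ArticleNumber`, `Quantity`) stand in for the real paths from the 44-line request, and CSV stands in for the Excel output so the sketch needs only the standard library:

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

FIELDS = ["file", "despatch", "article", "quantity"]

def desadv_rows(folder: Path):
    """Yield one row per line item from each DESADV XML file in the folder.
    Element names here are made up; a real mapping needs the actual
    paths from the message spec."""
    for path in sorted(folder.glob("*.xml")):
        root = ET.parse(path).getroot()
        despatch = root.findtext("DespatchNumber", default="")
        for item in root.iter("LineItem"):
            yield {
                "file": path.name,
                "despatch": despatch,
                "article": item.findtext("ArticleNumber", default=""),
                "quantity": item.findtext("Quantity", default="0"),
            }

def write_report(folder: Path, out):
    """Write all rows to a CSV stream (stand-in for the Excel sheet)."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for row in desadv_rows(folder):
        writer.writerow(row)
```

The generated script in the pastebin above is of course far more detailed; the point is that the skeleton is mechanical, and the tedious part the AI saves you is filling in dozens of field mappings.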

noufalibrahim 19 hours ago | parent | prev | next [-]

As a die-hard old schooler, I agree. I wasn't particularly impressed by Copilot, though it did show a few interesting tricks.

Aider was something I liked and used quite heavily (with Sonnet). Claude Code has genuinely been useful. I've coded up things which I'm sure I could do myself if I had the time "on the side" and used them in "production". These were mostly personal tools, but I do use them on a daily basis and they are useful. The last big piece of work was refactoring a 4000-line program, which I wrote piece by piece over several weeks, into something with proper packages and structure. There were one or two hiccups but I have it working. Took a day and approximately $25.

spreiti 19 hours ago | parent | prev | next [-]

I have basically the same workflow. Planning mode has been the game changer for me. One thing I always wonder is how do people work in parallel? Do you work in different modules? Or maybe you split it between frontend and backend? Would love to hear your experience.

christophilus 4 hours ago | parent [-]

I plan out N features at a time, then have it create N git worktrees and spawn N subagents. It does a decent job. I find doing proper reviews on each worktree kind of annoying, though, so I tend to pull them in one at a time and do a review, code, test, feedback loop until it’s good, commit it, pull in the next worktree and repeat the process.
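That "N worktrees, N agents" setup can be sketched in shell. The sketch below builds a throwaway repo so it is self-contained; the directory and branch names are made up, and in a real project you would run only the `git worktree add` lines from your existing checkout:

```shell
# Demo repo so the sketch is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# One worktree per planned feature, each on its own branch;
# an agent can then run unattended inside each directory.
git worktree add -q -b feature-1 "$repo/wt-feature-1"
git worktree add -q -b feature-2 "$repo/wt-feature-2"

git worktree list   # main checkout plus the two feature worktrees
```

Because each worktree is a full independent checkout sharing one object store, pulling them back in one at a time (review, test, commit, remove the worktree) maps directly onto the loop described above.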

9rx 21 hours ago | parent | prev | next [-]

> why don't you show and tell?

How do you suggest? At a high level, the biggest problem is the high latency and the context switches. It is easy enough to get the AI to do one thing well. But because it takes so long, the only way to derive any real benefit is to have many agents doing many things at the same time. I have not yet figured out how to effectively switch my attention between them. But I wouldn't have any idea how to turn that into a show and tell.

hdjrudni 21 hours ago | parent [-]

I don't know how y'all are letting the AIs run off with these long tasks at all.

The couple of times I even tried that, the AI produced something that looked OK at first and kinda sorta ran, but it quickly became spaghetti I didn't understand. You have to keep such a short leash on it, carefully review every single line of code, and thoroughly understand everything it did. Why would I want to let it run for hours and then spend hours more debugging it or cleaning it up?

I use AI for small tasks: to finish my half-written code, to translate code from one language to another, or to brainstorm different ways of approaching a problem when I have some idea but feel there's a better way to do it.

Or I let it take a crack when I have some concrete failing test or build. Feeding that into an LLM loop is one of my favorite things, because it can just keep trying until it passes, and even if it comes up with something suboptimal, you at least have something that compiles that you can tidy up a bit.
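That loop is mechanical enough to sketch. Here is a minimal version where `propose_fix` is a placeholder for whatever LLM or agent call you plug in (hypothetical; no particular tool's CLI is assumed), and the failing command's output is fed back to it each round:

```python
import subprocess

def fix_loop(test_cmd, propose_fix, max_rounds=5):
    """Re-run a failing command, feeding its output to a fix step,
    until it passes or we give up. `propose_fix` stands in for the
    LLM call and is expected to edit the code under test."""
    for attempt in range(1, max_rounds + 1):
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt  # build/test now passes
        propose_fix(result.stdout + result.stderr)
    raise RuntimeError(f"still failing after {max_rounds} rounds")
```

The suboptimal-but-compiling outcome mentioned above is exactly what the success condition here checks for: a zero exit code, nothing more.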

Sometimes I'll have two sessions going but they're like 5-10 minute tasks. Long enough that I don't want to twiddle my thumbs for that long but small enough that I can rein it in.

wickedsight 20 hours ago | parent [-]

I find it interesting that you're all writing 'the AI' as if it's a singular thing. There's a myriad of ways to code with a myriad of AIs, and none of them are identical. I use Qwen 3 32B with Cline in VS Code for work, since I can't use cloud-based AI. For personal projects, I use Codex in the cloud. I can let Codex perform some pretty complicated tasks and get something usable; I can ask Qwen something basic and it ends up in a loop, delivering nothing useful.

Then there's the different tasks people might ask from it. Building a fully novel idea vs. CRUD for a family planner might have different outcomes.

It would be useful if we could have more specific discussions here, where we specify the tools and the tasks it either does or does not work for.

DANmode 19 hours ago | parent | prev | next [-]

This.

If you’re not treating these tools like rockstar junior developers, then you’re “holding it wrong”.

wtetzner 19 hours ago | parent [-]

The problem I have with this take is that I'm very skeptical that guiding several junior developers would be more productive than just doing the work myself.

With real junior developers you get the benefit of helping develop them into senior developers, but you really don't get that with AI.

DANmode 19 hours ago | parent [-]

So then do your thing, while it’s doing scaffolding of your next thing.

Also: are you sure?

There are as many of them as you're talented enough to asynchronously instruct, and you can tell them the boundaries within which to work (or not), in order to avoid too little or too much being done for you to review and approve effectively.
