Automating Myself Out of Development

noelwelsh 9 hours ago | parent | next [-]

I wish people would describe in more detail the tasks they use LLMs to code. My experience is that simple components in an existing architecture are fine, but anything requiring architectural considerations quickly becomes a mess. On my projects (e.g. a UI framework), running multiple agents in parallel would just increase the speed at which they can stuff up the project.

germanptr 7 hours ago | parent | next [-]

I get this question a lot, and I found it hard to answer briefly, so I ended up writing a longer post about how I work:

https://www.trigosec.com/insights/mob-programming-for-one/

The short version is that I don’t let AI agents work unsupervised on my code. I treat them like participants in a mob programming session instead of autonomous developers. Different agents get different roles (implementer, reviewer, architect, security reviewer, etc.), and I stay involved throughout the process.

I also agree with your point about architecture. Generating isolated components is relatively easy; preserving and evolving the architectural boundaries across a larger codebase is much harder.

We’re still missing a good way to express and measure architectural quality. Until then, architecture-heavy work requires much closer supervision than implementation-heavy work.

	Swizec an hour ago | parent | next [-]
		> We’re still missing a good way to express and measure architectural quality
		Architectural complexity[1]! There are several really good papers on this. Unfortunately it never caught on and we don’t have great automated tools to spit out a number. Also, the majority of people just don’t care enough.
		Research in this field kinda died out when we invented microservices and started treating those as a silver bullet to The Architecture Problem (it’s not [2]).
		[1] https://swizec.com/blog/why-taming-architectural-complexity-...
		[2] https://youtu.be/y8OnoxKotPQ
	vslira 22 minutes ago | parent | prev [-]
		> The short version is that I don’t let AI agents work unsupervised on my code. I treat them like participants in a mob programming session instead of autonomous developers.
		I wonder if OS maintainers would have a leg up in defining workflows to better leverage this. Of course, OS contributors are autonomous developers, but maybe a trick or two might transfer across.

amelius an hour ago | parent | prev | next [-]

I personally limit LLMs to single files only at the moment. Self-contained components.

Using LLMs in a larger scope can sometimes work, but it has the real risk of turning a project into a mess after which you will have to undo the work and lose a lot of time.

Also, using LLMs this way with less clear boundaries will make reading and maintaining the code more cumbersome.

pjmlp 2 hours ago | parent | prev | next [-]

Me, when not trying to meet management expectations: only as smarter code completion, for formatting code, basic code analysis, and help with copy-pasting code examples between languages.

Me, when meeting management expectations: agent orchestration tools like Boomi and Workato calling into tools, doing with AI what a few years ago would have been done with BPEL.

davidcann 7 hours ago | parent | prev | next [-]

I built this with 94% written by coding agents: https://buildermark.dev/

The complete log of all prompts and commits is here: https://demo.buildermark.dev/projects/u020uhEFtuWwPei6z6nbN

MonstraG 6 hours ago | parent [-]

It seems that pages 2-5 on

https://demo.buildermark.dev/projects/u020uhEFtuWwPei6z6nbN/...

still show content of page 1

	davidcann 6 hours ago | parent [-]
		Thanks for the report. I messed up the CDN settings. It looks fixed now.

properbrew 9 hours ago | parent | prev | next [-]

I used LLMs to develop Whistle Enterprise (https://whistle-enterprise.com) from the ground up.

It's taken _a lot_ of time and effort, but this is an example of what can be developed using LLMs alone.

You have to have dedication and a goal to reach, but you can absolutely build anything if you're building with the right foundations in mind.

leguy 28 minutes ago | parent | next [-]

Neat. I saw the "no bot joins the call". Is it obvious to others in the virtual meeting that you are using this tool?

ryanackley 6 hours ago | parent | prev [-]

I think the relevant question isn’t what can be built but how much effort it takes compared to doing it the old-fashioned way.

What do you think the productivity gain was from using an LLM? This question assumes you’re already an experienced developer.

	motoroco 2 hours ago | parent | next [-]
		There’s no free lunch; it still takes time and effort. And expertise, if you need it to be robust.
		In terms of velocity, let me offer some numbers. In 6 months I generated >150k lines of code and merged 10k PRs to ship and iterate on https://plotalong.app
		I follow best practices and isolate agents to continuously deployed dev environments, semi-manually review PRs and gate the release process between multiple protected envs. The project is getting close to 500 end-to-end tests in Playwright.
		That’s just working nights and weekends. Before AI, it took my team at the office 4 years to produce this much work. There are some qualitative differences but the speed and results are real.
	andai 5 hours ago | parent | prev [-]
		n=1, but a friend of mine spent the last few months working on an experimental piece of music software with Claude. What he built is amazing and far beyond my abilities (I have been programming for 20 years). He doesn't know any programming.
		In fact, it's far beyond what I would even attempt, because I've just spent two decades building up a data bank of how hard things are supposed to be. He doesn't know it's supposed to be hard, so he just does it.

TheBigSalad 3 hours ago | parent | prev | next [-]

You have to make those architectural decisions and feed them to the agents. Be very specific. That's been my experience.

pipes 2 hours ago | parent | prev | next [-]

I found that this guy's stuff has really helped me:

https://youtu.be/-QFHIoCo-Ko?is=FYYdukWluYX3vdQL

Worth a watch.

nullbio 9 hours ago | parent | prev | next [-]

It's great for people who are just maintaining something. Less so for someone building something from scratch, in the earlier phases.

Npovview 8 hours ago | parent | prev [-]

There are hour-long YouTube videos where people explain the process using a complex toy project. Search for them.

Supermancho an hour ago | parent | prev | next [-]

Why are you using Claude over OpenAI? ChatGPT 5.5 is better. I (and any employer) would rather you use Docker than some random EC2, even if you can't reach it with your phone... which sounds silly. The integration with GitHub Issues/JIRA is full of handwavy "if you do it just right" vibes, as it always is.

I find it hard to take these Claude adverts seriously.

gnunicorn 6 hours ago | parent | prev | next [-]

Interestingly, despite it being much more detailed and a lot more process and procedure than what I currently do (which is more akin to the version 0 described, but in parallel), we end up at the same final problem: reviews and quality assurance.

I sign off on the code I merge, partly because of company policy but also just to be sure it is actually decent. But reviewing has become the real draining bottleneck: even with stacked PRs, a total of 5-6k lines is not a 5-minute job. Even if I brainstormed and set the plan, that's really the part that doesn't scale right now for me in this. But the author is very shy about that: either the changes aren't that big in the end or they trust the process enough to review in a more casual manner. Being equally untrusting, I can't do that ...

philbo 3 hours ago | parent | next [-]

For decades, engineers understood that large code reviews are harder than small ones. Out of both politeness and a desire to receive better code reviews, we learned to break our large changes into smaller chunks. Some engineers took things even further and replaced code reviews with pair programming. But then LLMs showed up and everyone seems to have forgotten those lessons.

They can still be applied now using coding agents, if you're willing to push back against the default setup and change your mode of thinking a little bit. Of course it doesn't help that an entire industry is dedicated to persuading us that maximizing token spend is the only way to get shit done.

I appreciate this probably seems like an extremist take, but I wrote some more about it here in case there's anybody out there who identifies with it:

https://philbooth.me/blog/agentic-coding-and-mental-models

	firegodjr an hour ago | parent | next [-]
		I think that's reasonable. My only gripe is that small sets of changes are often faster to make by hand than to wait on LLM reasoning for, so I've found it amounts to very little speedup.
	aocallaghan17 2 hours ago | parent | prev [-]
		Agree with this completely. This push for more autonomy is, I think, completely the wrong direction for how to use LLMs. I want less code to maintain, not more code that I don't even fully understand. I think research and very supervised coding with lots of guardrails is the way to actually gain productivity from these tools.

strogonoff 6 hours ago | parent | prev | next [-]

Proper review should take longer than writing it yourself, because you need to know the correct solution, understand the proposed solution, and evaluate the difference between the two. When designing it yourself, you just need to know the correct solution and write it, and with modern high-level languages and IDEs with autocomplete, writing it is hardly a bottleneck.

minihat 4 hours ago | parent [-]

It is harder to solve a sudoku than verify a solution's correctness. I find similar benefits occasionally when coding with LLMs.

	layer8 an hour ago | parent | next [-]
		I disagree under the following circumstances, which in my experience is the common case:
		You don’t know from the outset all relevant considerations that go into implementing something. Coding yourself is an exploration process of those considerations. Being shown a finished solution doesn’t let you see and understand all the considerations and the possible options that you’d have contemplated when implementing it yourself. When reviewing, you still have to do that exploratory thinking to weigh the possible options. And the fact that you have to do that exploration purely mentally rather than in a process of working with code arguably makes it harder (similar to contemplating alternative solutions to a Sudoku purely mentally, actually).
		There rarely is a single correct way of implementing some requirement or feature. It’s a trade-off between compromises, not binary correct or incorrect like a Sudoku puzzle.
		The insights that the exploration gives you may even lead you to implement something significantly different from what you originally set out to.
	skydhash 2 hours ago | parent | prev [-]
		Sudoku’s constraints are known and easy to build a harness for. Software has a more malleable structure. A harness is hard to build and the test cases for the constraints can be numerous.

nisabek 5 hours ago | parent | prev [-]

If I'm attentive during spec/plan creation I sort of build this "expectation" of what the actual PR will look like, a mental model of it. Then it's somewhat easier to review. But the mental load is brutal tbh, and I'm still not sure if it's "worth it".

pydry 5 hours ago | parent | prev | next [-]

>Automating myself out of development

>I want to start by saying that I’m neither an AI-fanatic

Kind of like saying you are a fanatic before saying you aren't.

I don't think there's too much here (e.g. "spec driven development") that I haven't seen elsewhere.

yieldcrv 9 hours ago | parent | prev | next [-]

I don't know if I’m overly critical, but there’s gotta be a middle ground between totally AI-pilled people that otherwise have no talents and control-freak veteran developers who can't let go.

My current process also uses GitHub Projects in a normal scrum-style way, with many tickets written or fleshed out and their state managed by the LLM, and it doubles as the memory system.

Completely leapfrogging all these other open and closed source concoctions and being more effective

But it's effective enough that I don’t need OP’s final form state of still approving everything.

Auto-mode is fine. Worktrees are built into Claude Code now. I just tell it to classify tickets as sequential or parallelizable and spawn subagents to tackle all of the tickets in the todo list.

They all get their own context window; it's pretty perfect now.

In the meantime I work in a couple of tabs of Claude Design for different flows of any client-side app. My philosophy has been that devs could pick up graphic and UI/UX design easily; it's just still a full-time job to make variations of layouts and portray their states.

UI/UX is not a full time job anymore.

And I use Claude chat to flesh out aspects of the overall idea

I think you may be overcomplicating your workflow in the concluding state.

Overall I agree that planning and intention now take up most of the time, before a 10-subagent precision strike is initiated.

thi2 5 hours ago | parent | next [-]

There are tons of people, those are just not as vocal.

nisabek 5 hours ago | parent | prev | next [-]

Could be (the overcomplicating part); I'm just not yet comfortable losing the mental model of the final application. At least not in all types of tickets. Are you not seeing that?..

	yieldcrv 3 hours ago | parent [-]
		I focus on one side project at a time, alongside work applications. Both are giving me skillsets to excel in the other domain.
		I watch the subagents, push back on some choices, look at commits and glance at pull requests.

ai_fry_ur_brain 4 hours ago | parent | prev [-]

All these people saying UI/UX is dead, then I see their designs and they're absolutely the worst (but they're always swearing by how incredible it is).

Sorry, access to an LLM (even if it could center a div reliably and make responsive designs, which it can't) does not give you taste or intuition, or make you good at building user interfaces. You people/sloppers have no idea the amount of sweat that gets poured into great UX.

It's insulting when you people say these things, and I'm not even a designer or frontend dev.

I actually think UI/UX designers and devs will be the last to fall. I will want beautiful products that were built by beautiful minds; that's how you will set yourself apart from the slop. And fortunately it will be even easier when 80% of everything is half-assed UI cranked out by LLM design tools. The contrast is already glaring.

	yieldcrv 3 hours ago | parent [-]
		I’ve seen that slop, but Claude Design has barely been out for a month. And it’s fulfilled my needs better than v0, Lovable, Playwright via LLM or just iterating in the coding LLM.
		I’ve worked with graphic designers my whole career and have also contracted design agencies to do style guides and collaborate on branding and layouts. I’ve gotten the output that I’m looking for with Claude Design.
		Eventually you’ll see examples, but it's not in my purview to publicly link any of my projects as being vibe coded.

brcmthrowaway 8 hours ago | parent | prev | next [-]

More Yegge tier psychosis.

general1465 5 hours ago | parent | prev [-]

I am completely calm regarding AI and development.

First, nobody sane wants to give their domain IP to OpenAI/Anthropic. That's why local AI will eventually prevail and flourish: people who actually have some IP will have no problem buying a 10k+ EUR machine to run some pretty good models on. However, if your main job is just doing CRUD stuff, then you are screwed.

Secondly, hallucination really is the Achilles' heel of every LLM. Sure, you can recreate an application which exists in thousands of variations on the internet, but the moment you try to go deeper into domain knowledge you will start struggling more and more.

Try to make a CAN driver for the ESP32: easy, it is probably going to work. Try to make a CAN driver for the STM32F7xx: now the AI will start having problems, but it will probably be able to produce something that works after a lot of debugging. Now let's make a CAN driver for the MPC5555: the AI will start writing fairy tales about registers which do not exist. All of the processors above have reference manuals and sometimes example git repositories available on the open internet.

	abletonlive 40 minutes ago | parent [-]
		> All of the processors above have reference manuals and sometimes example git repositories available on the open internet.
		Okay? Then give it those reference manuals and git repositories? I haven't heard of something LLMs can't get around and figure out.