| ▲ | AstroBen a day ago |
| It's an impossible thing to disprove. Anything you say can be countered by the "secret workflow" they've figured out. If you're not seeing a huge speedup, well, you're just using it wrong! The burden of proof is 100% on anyone claiming the productivity gains. |
|
| ▲ | anonzzzies a day ago | parent | next [-] |
| I go to meetups and enjoy myself so much; 80% of people are showing how to install 800000000 MCPs on their 92gb macbook pros, new RAG memory, n8n agent flows, super special prompting techniques, secret sauces, killer .md files, special vscode setups and after that they still are not productive vs just vanilla claude code in a git repo. You get people saying 'look I only have to ask xyz... and it does it! magic'; then you just type 'do xyz' in vanilla CC and it does exactly the same thing, often faster. |
| |
| ▲ | mikestorrent a day ago | parent | next [-] | | This was always the case. People obsessing over keyboards, window managers, emacs setups... always optimizing around the edges of the problem, but this is all taking an incredible amount of their time versus working on real problems. | | |
| ▲ | sheepscreek 21 hours ago | parent | next [-] | | Yes, the thing they realize much later in life is that perhaps they enjoyed the act of gardening (curating your tools, workflows, etc) much more than farming (being downright focused and productive on the task at hand). Sadly gardening doesn’t pay the bills! | | |
| ▲ | anonzzzies 20 hours ago | parent | next [-] | | yep, and I have the same thing, but then I am not going to tell everyone it is super productive for the actual task of farming. I say that I farm as a hobby (which I do) and talk about my tools and my meager yields (which won't make any money if sold). I am not going to say that my workflows are so productive while my neighbour, a professional farmer, just has old crap and simply starts and works from 5 am to 9 pm, making a living off his land. | |
| ▲ | hdjrudni 21 hours ago | parent | prev | next [-] | | If I only spend $1000 on hydroponics and 3 weeks tending to a vertical garden I can grow a $1 head of lettuce FOR FREE! | | |
| ▲ | ben_w 12 hours ago | parent | next [-] | | I tried growing lettuce in some cut up plastic bottles at university in soil from the nearby footpath, I think even with the cheap student approach I spent more on the (single pack of) seeds than a fully grown lettuce costs, and only managed about four individual leaves that were only about 1cm by 5cm. | |
| ▲ | DANmode 19 hours ago | parent | prev [-] | | What if I haven’t spent anything, and I’m making money with lettuce I grew in the woods? (or, in Anthropic/sama’s backyards) |
| |
| ▲ | fc417fc802 14 hours ago | parent | prev [-] | | I like farming but a lot of the tools are annoying to use so I find myself tinkering with them (gardening in your analogy I guess). It's not that I prefer tinkering in the shop to farming. More that I just have very little patience for tools that don't conform to the ways in which I think about the world. |
| |
| ▲ | songodongo 15 hours ago | parent | prev | next [-] | | Same thing happens in music production. If only I had this guitar, or that synth, or these plugins… | | |
| ▲ | multjoy 14 hours ago | parent [-] | | Gear Acquisition Syndrome is a very different problem. Even if you haven't cured the issue the new synth was meant to fix, at least you have a new synth. |
| |
| ▲ | sehansen 13 hours ago | parent | prev | next [-] | | It's the four hobbies all over again: https://brooker.co.za/blog/2023/04/20/hobbies.html | |
| ▲ | whoiskevin 14 hours ago | parent | prev | next [-] | | A better keyboard is a hill I will die on. | | |
| ▲ | mikestorrent 8 hours ago | parent | next [-] | | I have a fantastic keyboard, but I'm not taking pictures of it, changing the keycaps, posting about it. It's a tool, not a fetish; that's how I differentiate these things. | |
| ▲ | butlike 13 hours ago | parent | prev [-] | | It's a keyboard attached to an article of clothing you put your head into so the keys drape over your shoulders. You then type, but also end up giving yourself a shoulder massage! |
| |
| ▲ | atakan_gurkan 20 hours ago | parent | prev [-] | | Yes, this happens quite often. So often that I wonder if it is among the symptoms of some psychiatric or neurological disorder. | | |
| ▲ | Bridged7756 13 hours ago | parent [-] | | It's just boredom probably. Obsessing over productivity tools is relatively more productive than say, something completely unrelated to your work. |
|
| |
| ▲ | abakker a day ago | parent | prev | next [-] | | That ties in perfectly with my experience. Direct prompts, with limited setup and limited context, seem to work as well as or better than complex custom GPTs. There are not just diminishing but inverted returns to complexity in GPTs | | |
| ▲ | serf a day ago | parent [-] | | limited prompts work well for limited programs, or already well defined and cemented source bases. once scope creeps up you need the guardrails of a carefully crafted prompt (and pre-prompts, tool hooks, AGENTS files, the whole gamut) -- otherwise it turns into cat wrangling rapidly. | | |
| ▲ | anonzzzies a day ago | parent [-] | | Not in our (30+ year old software company) experience, and we have large code bases with a lot of scope creep; more than ever, as we can deliver a lot more for a lot less (a lot more revenue/profit too). |
|
| |
| ▲ | PunchyHamster 14 hours ago | parent | prev [-] | | No, no, you misunderstand, that's still a massive productivity improvement compared to them being on their own with their own incompetence and refusal to learn how to code properly | | |
|
|
| ▲ | paodealho a day ago | parent | prev | next [-] |
| This gets comical when there are people, on this site of all places, telling you that using curse words or "screaming" with ALL CAPS in your agents.md file makes the bot follow orders with greater precision. And these people have "engineer" on their resumes... |
| |
| ▲ | electroglyph a day ago | parent | next [-] | | there's actually quite a bit of research in this field, here's a couple: "ExpertPrompting: Instructing Large Language Models to be Distinguished Experts" https://arxiv.org/abs/2305.14688 "Persona is a Double-edged Sword: Mitigating the Negative Impact of Role-playing Prompts in Zero-shot Reasoning Tasks" https://arxiv.org/abs/2408.08631 | | |
| ▲ | AdieuToLogic a day ago | parent [-] | | Those papers are really interesting, thanks for sharing them! Do you happen to know of any research papers which explore constraint programming techniques wrt LLM prompts? For example: Create a chicken noodle soup recipe.
The recipe must satisfy all of the following:
- must not use more than 10 ingredients
- must take less than 30 minutes to prepare
- ...
| | |
| ▲ | aix1 19 hours ago | parent | next [-] | | This is an area I'm very interested in. Do you have a particular application in mind? (I'm guessing the recipe example is just to illustrate the general principle.) | | |
| ▲ | AdieuToLogic 13 minutes ago | parent [-] | | > This is an area I'm very interested in. Do you have a particular application in mind? (I'm guessing the recipe example is just to illustrate the general principle.) You are right in identifying the recipe example as being illustrative and intentionally simple. A more realistic example of using constraint programming techniques with LLMs is: # Role
You are an expert Unix shell programmer who comments their code and organizes their code using shell programming best practices.
# Task
Create a bash shell script which reads from standard input text in Markdown format and prints all embedded hyperlink URLs.
The script requirements are:
- MUST exclude all inline code elements
- MUST exclude all fenced code blocks
- MUST print all hyperlink URLs
- MUST NOT print hyperlink labels
- MUST NOT use Perl compatible regular expressions
- MUST NOT use double quotes within comments
- MUST NOT use single quotes within comments
In this exploration, the list of "MUST/MUST NOT" constraints was iteratively discovered (4 iterations), and at least the last three are reusable when the task involves generating shell scripts. This approach originates in attempting to limit LLM token-generation variance by minimizing the expressivity of English vocabulary and sentence structure, so that document generation has a higher probability of being repeatable. The epiphany I experienced was that by interacting with LLMs as a "black box" whose results can only be influenced, and not anthropomorphizing them, the natural way to do so is to leverage their NLP capabilities to produce restrictions (search-tree pruning) on a declarative query (the initial search space). |
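For reference, a minimal sketch of a script satisfying those constraints (illustrative, not the actual script from the thread; the function name `extract_urls` is invented here). It uses only POSIX awk, so no Perl-compatible regular expressions are involved, and it assumes links in the inline `[label](url)` form:

```shell
# Minimal sketch meeting the constraints above (illustrative, not the
# commenter output). POSIX awk only, so no PCRE is used.

extract_urls() {
  awk '
    /^(```|~~~)/ { fence = !fence; next }  # toggle fenced code blocks
    fence { next }                         # skip lines inside fences
    {
      line = $0
      gsub(/`[^`]*`/, "", line)            # drop inline code spans
      while (match(line, /\[[^]]*\]\([^)]*\)/)) {
        pair = substr(line, RSTART, RLENGTH)
        sub(/^\[[^]]*\]\(/, "", pair)      # strip the label part
        sub(/\)$/, "", pair)               # strip the closing paren
        print pair                         # URL only, never the label
        line = substr(line, RSTART + RLENGTH)
      }
    }
  '
}

# Demo: one real link, one in inline code, one in a fenced block.
# Prints https://example.com/docs only.
extract_urls <<'EOF'
See [the docs](https://example.com/docs) for details.
Ignore `[inline](https://example.com/inline)` here.
~~~
[fenced](https://example.com/fenced)
~~~
EOF
```

The fence toggle handles both backtick and tilde fences; inline code spans are stripped before link extraction so links inside them are never matched.
<imports>
</imports>
<test>
result=$(extract_urls <<'EOF'
A [link](https://x.test/one) plus `[inline](https://x.test/two)` here.
~~~
[fenced](https://x.test/three)
~~~
EOF
)
[ "$result" = "https://x.test/one" ] || exit 1
</test>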
| |
| ▲ | Aeolun 10 hours ago | parent | prev | next [-] | | Anything involving numbers, or conditions like ‘less than 30 minutes’ is going to be really hard. | |
| ▲ | cess11 19 hours ago | parent | prev | next [-] | | I suspect LLM-like technologies will only rarely back out of contradictory or otherwise unsatisfiable constraints, so it might require intermediate steps where LLMs formalise the problem in some SAT, SMT or Prolog tool and report back about it. | |
| ▲ | llmslave2 a day ago | parent | prev [-] | | I've seen some interesting work going the other way, having LLMs generate constraint solvers (or whatever the term is) in prolog and then feeding input to that. I can't remember the link but could be worthwhile searching for that. |
|
| |
| ▲ | hdra a day ago | parent | prev | next [-] | | I've been trying to stop the coding assistants from making git commits on their own and nothing has been working. | | |
| ▲ | zmmmmm a day ago | parent | next [-] | | hah - i'm the opposite, I want everything done by the AI to be a discrete, clear commit so there is no human/AI entanglement. If you want to squash it later that's fine but you should have a record of what the AI did. This is Aider's default mode and it's one reason I keep using it. | | | |
| ▲ | Aurornis 11 hours ago | parent | prev | next [-] | | Which coding assistant are you using? I'm a mild user at best, but I've never once seen the various tools I've used try to make a git commit on their own. I'm curious which tool you're using that's doing that. | | |
| ▲ | jason_oster 4 hours ago | parent [-] | | Same here. Using Codex with GPT-5.2 and it has not once tried to make any git commits. I've only used it about 100 times over the last few months, though. |
| |
| ▲ | algorias a day ago | parent | prev | next [-] | | run them in a VM that doesn't have git installed. Sandboxing these things is a good idea anyways. | | |
| ▲ | godelski a day ago | parent | next [-] | | > Sandboxing these things is a good idea anyways.
Honestly, one thing I don't understand is why agents aren't organized with unique user or group permissions. Like if we're going to be lazy and not make a container for them then why the fuck are we not doing basic security things like permission handling. Like we want to act like these programs are identical to a person on a system but at the same time we're not treating them like we would another person on the system? Give me a fucking claude user and/or group. If I want to remove `git` or `rm` from that user, great! Also makes giving directory access a lot easier. Don't have to just trust that the program isn't going to go fuck with some other directory | | |
| ▲ | inopinatus 21 hours ago | parent | next [-] | | The agents are being prompted to vibe-code themselves by a post-Docker generation raised on node and systemd. So of course they emit an ad-hoc, informally-specified, bug-ridden, slow reimplementation of things the OS was already capable of. | |
| ▲ | apetresc a day ago | parent | prev | next [-] | | What's stopping you from `su claude`? | | |
| ▲ | godelski a day ago | parent [-] | | I think there's some misunderstanding... What's literally stopping me is su: user claude does not exist or the user entry does not contain all the required fields
Clearly you're not asking that... But if your question is more "what's stopping you from creating a user named claude, installing claude to that user account, and writing a program so that user godelski can message user claude and watch all of user claude's actions, and all that jazz" then... well... technically nothing. But if that's your question, then I don't understand what you thought my comment said. |
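For illustration, a rough sketch of what that setup could look like on Linux (an assumed configuration, not a feature of any existing agent; `claude-code` stands in for whatever agent binary you launch, and the paths are hypothetical):

```shell
# Create a dedicated unprivileged account for the agent.
sudo useradd --create-home claude

# ACLs: read/traverse everywhere in the project, write only under src/,
# and no access at all to the .git directory.
sudo setfacl -R -m u:claude:rX  ~/project
sudo setfacl -R -m u:claude:rwX ~/project/src
sudo setfacl -R -m u:claude:--- ~/project/.git

# Launch the agent as that user; the named-user ACL entries above now
# bound what it can read, write, or commit.
sudo -u claude claude-code ~/project
```

Because a named-user ACL entry matches before the group and other entries, the `---` entry on `.git` denies the claude user access even if the directory is otherwise readable.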
| |
| ▲ | immibis 15 hours ago | parent | prev [-] | | Probably because Linux doesn't really have a good model for ad-hoc permission restrictions. It has enough bits to make a Docker container out of, but that's a full new system. You can't really restrict a subprocess to only write files under this directory. | | |
| ▲ | newsoftheday 10 hours ago | parent [-] | | For plain Linux, chmod, chmod's sticky bit and setfacl provide extensive ad hoc permission restriction. Your comment is 4 hours old; I'm surprised I'm the first person to help correct its inaccuracy. | | |
| ▲ | immibis 42 minutes ago | parent [-] | | How can those be used to restrict a certain subprocess to only write in a certain directory? |
|
|
| |
| ▲ | zmmmmm a day ago | parent | prev [-] | | but then they can't open your browser to administer your account. What kind of agentic developer are you? |
| |
| ▲ | manwds a day ago | parent | prev | next [-] | | Why not use something like Amp Code, which doesn't do that? People seem to rage at CC or similar tools, but Amp Code doesn't go making random commits or dropping databases. | | |
| ▲ | hdra a day ago | parent [-] | | just because i haven't gotten to try it out really. but what is it about Amp Code that makes it immune from doing that? from what i can tell, it's another cli tool-calling client to an LLM? so i'd expect it to be subject to the nondeterministic nature of the LLM calling the tool i don't want it to call, just like any others, no? |
| |
| ▲ | AstroBen a day ago | parent | prev | next [-] | | Are you using aider? There's a setting to turn that off | |
| ▲ | dust-jacket 13 hours ago | parent | prev | next [-] | | require commits to be signed. | |
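Signed commits are one mechanical guard; another sketch in the same spirit (illustrative, not a tool anyone in the thread mentions) is a pre-commit hook that refuses to run without a terminal attached, which blocks fully headless agent commits:

```shell
#!/bin/sh
# Save as .git/hooks/pre-commit and mark it executable.
# Headless agent sessions typically have no terminal on stdout,
# so treat its absence as an automated commit and refuse.
if [ ! -t 1 ]; then
  echo "pre-commit: no terminal attached; refusing automated commit" >&2
  exit 1
fi
exit 0
```

An agent driven from inside your own interactive terminal would still pass this check, and `git commit --no-verify` bypasses hooks entirely, so this is a partial guard at best.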
| ▲ | SoftTalker a day ago | parent | prev [-] | | Don't give them a credential/permission that allows it? | | |
| ▲ | godelski a day ago | parent | next [-] | | Typically agents are not operating as a distinct user. So they have the same permissions, and thus credentials, as the user operating them. Don't get me wrong, I find this framework idiotic and personally I find it crazy that it is done this way, but I didn't write Claude Code/Antigravity/Copilot/etc | |
| ▲ | AlexandrB a day ago | parent | prev [-] | | Making a git commit typically doesn't require any special permissions or credentials since it's all local to the machine. You could do something like running the agent as a different user and carefully setting ownership on the .git directory vs. the source code, but I suspect this is not very straightforward to set up. | | |
| ▲ | SoftTalker a day ago | parent [-] | | IMO it should be well within the capabilities of anyone who calls himself an engineer. |
|
|
| |
| ▲ | neal_jones a day ago | parent | prev | next [-] | | Wasn’t cursor or someone using one of these horrifying type prompts? Something about having to do a good job or they won’t be paid and then they won’t be able to afford their mother’s cancer treatment and then she’ll die? | |
| ▲ | godelski a day ago | parent | prev | next [-] | | How is this any different from the Apple "you're holding it wrong" argument? The critical reason that kind of response is so out of touch is that the same people praise Apple for its intuitive nature. How can any reasonable and rational person (especially an engineer!) not see that these two beliefs are in direct opposition? If "you're holding it wrong" then the tool is not universally intuitive. Sure, there'll always be some idiot trying to use a lightbulb to screw in a nail, but if your nail has threads on it and a notch on the head then it's not the user's fault for picking up a screwdriver rather than a hammer. > And these people have "engineer" on their resumes..
What scares me about ML is that many of these people have "research scientist" in their titles. As a researcher myself I'm constantly stunned at people not understanding something as basic as who has the burden of proof. Fuck off. You're the one saying we made a brain by putting lightning into a rock and shoving tons of data into it. There's so much about that that I'm wildly impressed by. But to call it a brain in the same way you'd call a human brain one requires significant evidence. Extraordinary claims require extraordinary evidence. There's some incredible evidence here, but an incredible lack of scrutiny as to whether it is actually evidence for something else. | |
| ▲ | CjHuber a day ago | parent | prev | next [-] | | I'd say such hacks don't make you an engineer, but they are definitely part of engineering anything that has to do with LLMs. With overly long system prompts/agents.md files not working well, it definitely makes sense to optimize the existing prompt with minimal additions. And if swear words, screaming, shaming or tipping works, well, that's the most token-efficient optimization of a brief, well-written prompt. Also, of course, current agents already have the possibility to run endlessly if they are well instructed; steering them to avoid reward hacking in the long term definitely IS engineering. Or how about telling them they are working in an orphanage in Yemen that's struggling for money, but luckily they've got an MIT degree and now they are programming to raise funds. But their supervisor is a psychopath who doesn't like their effort and wants children to die, so work has to be done as diligently as possible and each step has to be viewed through the lens that the supervisor might find a reason to forbid programming. Look, as absurd as it sounds, a variant of that scenario works extremely well for me. Just because it's plain language doesn't mean it can't be engineering; at least I'm of the opinion that it definitely is if it has an impact on the possible use cases | |
| ▲ | AstroBen a day ago | parent | prev | next [-] | | > cat AGENTS.md WRITE AMAZING INCREDIBLE VERY GOOD CODE OR ILL EAT YOUR DAD ..yeah I've heard the "threaten it and it'll write better code" one too | | |
| ▲ | CjHuber a day ago | parent [-] | | I know you're joking, but to contribute something constructive here: most models now have guardrails against being threatened. So if you threaten them, it would have to be with something out of your control, like "... or the already depressed code-reviewing staff might kill himself and his wife. We did everything in our control to take care of him, but do your best on your part to avoid the worst case" | | |
| ▲ | nemomarx a day ago | parent [-] | | how do those guard rails work? does the system notice you doing it and not put that in the context, or do they just have something in the system prompt? | | |
| ▲ | CjHuber a day ago | parent [-] | | I suppose it's the latter, plus maybe some fine-tuning. It's definitely not like DeepSeek, where the model's answer gets replaced when you touch on something uncomfortable for China |
|
|
| |
| ▲ | citizenpaul a day ago | parent | prev | next [-] | | >makes the bot follow orders with greater precision. Gemini will ignore any directions to never reference or use youtube videos, no matter how many ways you tell it not to. It may remove it if you ask though. | | |
| ▲ | rabf a day ago | parent [-] | | Positive reinforcement works better than negative reinforcement. If you read the prompt guidance from the companies themselves in their developer documentation, it often makes this point. It is more effective to tell them what to do rather than what not to do. | | |
| ▲ | sally_glance 19 hours ago | parent | next [-] | | This matches my experience. You mostly want to not even mention negative things because if you write something like "don't duplicate existing functionality" you now have "duplicate" in the context... What works for me is having a second agent or session to review the changes with the reversed constraint, i.e. "check if any of these changes duplicate existing functionality". Not ideal because now everything needs multiple steps or subagents, but I have a hunch that this is one of the deeper technical limitations of current LLM architecture. | | |
| ▲ | citizenpaul 6 hours ago | parent [-] | | Probably not related but it reminds me of a book I read where wizards had Additive and Subtractive magic but not always both. The author clearly eventually gave up on trying to come up with creative ways to always add something for solutions after the gimmick wore off and it never comes up again in the book. Perhaps there is a lesson here. |
| |
| ▲ | nomel a day ago | parent | prev [-] | | Could you describe what this looks like in practice? Say I don't want it to use a certain concept or function. What would "positive reinforcement" look like to exclude something? | | |
| ▲ | oxguy3 a day ago | parent [-] | | Instead of saying "don't use libxyz", say "use only native functions". Instead of "don't use recursion", say "only use loops for iteration". | | |
| ▲ | nomel a day ago | parent | next [-] | | This doesn't really answer my question, which was more about specific exclusions. Both of the answers show the same problem: if you limit your prompts to positive reinforcement, you're only allowed to "include" regions of a "solution space", which can only constrain the LLM to those small regions. With negative reinforcement, you just cut out a bit of the solution space, leaving the rest available. If you don't already know the exact answer, then leaving the LLM free to use solutions that you may not even be aware of seems like it would always be better. Specifically: "use only native functions" for "don't use libxyz" isn't really different than "rewrite libxyz since you aren't allowed to use any alternative library". I think this may be a bad example since it massively constrains the LLM, preventing it from using an alternative library that you're not aware of. "only use loops for iteration" for "don't use recursion" is reasonable, but I think this falls into the category of "you already know the answer". For example, say you just wanted to avoid a single function for whatever reason (maybe it has a known bug or something); the only way to do this "positively" would be to already know the function to use: "use function x"! Maybe I misunderstand. |
| ▲ | bdangubic a day ago | parent | prev [-] | | I 100% stopped telling them what not to do. I think even if “AGI” is reached telling them “don’t” won’t work | | |
| ▲ | nomel a day ago | parent [-] | | I have the most success when I provide good context, as in what I'm trying to achieve, in the most high level way possible, then guide things from there. In other words, avoid XY problems [1]. [1] https://xyproblem.info |
|
|
|
|
| |
| ▲ | Applejinx 15 hours ago | parent | prev | next [-] | | Works on human subordinates too, kinda, if you don't mind the externalities… | |
| ▲ | soulofmischief a day ago | parent | prev | next [-] | | Except that is demonstrably true. Two things can be true at the same time: I get value and a measurable performance boost from LLMs, and their output can be so stupid/stubborn sometimes that I want to throw my computer out the window. I don't see what is new, programming has always been like this for me. | |
| ▲ | DANmode 19 hours ago | parent | prev | next [-] | | Yes, using tactics like front-loading important directives, and emphasizing extra important concepts, things that should be double or even triple checked for correctness because of the expected intricacy, make sense for human engineers as well as “AI” agents. | |
| ▲ | llmslave2 a day ago | parent | prev [-] | | "don't make mistakes" LMAO |
|
|
| ▲ | mapontosevenths 14 hours ago | parent | prev | next [-] |
| There's no secret IMO. It's actually really simple to get good results. You just expect the same things from the LLM you would from a Junior. Use an MD file to force it to: 1) Include good comments in whatever style you prefer, document everything it's doing as it goes and keep the docs up to date, and include configurable logging. 2) Make it write and actually execute unit tests for everything before it's allowed to commit anything, again through the md file. 3) Ensure it learns from its mistakes: Anytime it screws up, tell it to add a rule to its own MD file reminding it not to ever repeat that mistake again. Over time the MD file gets large, but the error rate plummets. 4) This is where it drifts from being treated as a standard Junior. YOU must manually verify that the unit tests are testing for the right thing. I usually add a rule to the MD file telling it not to touch them after I'm happy with them, but even then you must also check that the agent didn't change them the first time it hit a bug. Modern LLMs are now worse at this for some reason. Probably because they're getting smart enough to cheat. If you do these basic things you'll get good results almost every time. |
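As a concrete illustration, a sketch of what such a rules file can look like (every rule below is invented for the example, not quoted from the comment above):

```markdown
# Agent rules

## Comments and docs
- Comment every non-obvious block; keep module docs up to date.
- Route diagnostics through the configurable project logger.

## Tests
- Write and actually run unit tests for every change before committing.
- Never modify a test the human has marked LOCKED; ask first instead.

## Mistakes (append one rule per corrected error)
- Do not run database migrations unless explicitly asked.
- Do not silence a failing test to make the suite pass.
```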
| |
| ▲ | ben_w 12 hours ago | parent | next [-] | | > This is where it drifts from being treated as a standard Junior. YOU must manually verify that the unit tests are testing for the right thing. You had better juniors than me. What unit tests? :P | |
| ▲ | butlike 13 hours ago | parent | prev [-] | | The MD file is a spec sheet, so now you're expecting every warm body to be a Sr. Engineer, but where do you start as a Junior warm body? Reviewing code, writing specs, reviewing implementation details...that's all Sr. level stuff |
|
|
| ▲ | Wowfunhappy a day ago | parent | prev | next [-] |
| It's impossible to prove in either direction. AI benchmarks suck. Personally, I like using Claude (for the things I'm able to make it do, and not for the things I can't), and I don't really care whether anyone else does. |
| |
| ▲ | AstroBen a day ago | parent | next [-] | | I'd just like to see a live coding session from one of these 10x AI devs Like genuinely. I want to get stuff done 10x as fast too | | |
| ▲ | Kerrick 10 hours ago | parent | next [-] | | My wife used to be a professional streamer so I know how distracting it can be to try and entertain an audience. So when I attempted to become one of these 10x AI devs over my Christmas vacation I did not live stream. But I did make a bunch of atomic commits and push them up to sourcehut. Perhaps you'll find that helpful? Just Christmas Vacation (12-18h days): https://git.sr.ht/~kerrick/ratatui_ruby/log/v0.8.0 Latest (slowed down by job & real life): https://git.sr.ht/~kerrick/ratatui_ruby/log/trunk and https://git.sr.ht/~kerrick/ratatui_ruby-wiki/log/wiki and https://git.sr.ht/~kerrick/ratatui_ruby-tea/log/trunk | |
| ▲ | lordnacho a day ago | parent | prev | next [-] | | But the benefit might not be speed, it might be economy of attention. I can code with Claude when my mind isn't fresh. That adds several hours of time I can schedule, where previously I had to do fiddly things when I was fresh. What I can attest is that I used to have a backlog of things I wanted to fix, but hadn't gotten around to. That's now gone, and it vanished a lot faster than the half a year I had thought it would take. | | |
| ▲ | llmslave2 a day ago | parent [-] | | Doesn't that mean you're less likely to catch bugs and other issues that the AI spits out? | | |
| ▲ | lordnacho 17 hours ago | parent | next [-] | | No, you are spending less time on fixing little things, so you have more time on things like making sure all the potential errors are checked. | |
| ▲ | duskdozer 15 hours ago | parent | prev | next [-] | | Not a problem! Just ask the AI to verify its output and make test cases! | |
| ▲ | gregoryl a day ago | parent | prev | next [-] | | nah, you rely on your coworkers to review your slop! | | | |
| ▲ | mpyne a day ago | parent | prev [-] | | Code you never ship doesn't have bugs by definition, but never shipping is usually a worse state to be in. | | |
| ▲ | ponector a day ago | parent [-] | | I'm sure people from Knight Capital don't think so. | | |
| ▲ | mpyne a day ago | parent [-] | | Even there, they made a lot of money before they went bust. If you want an example you'd be better off picking Therac-25, as ancient an example as it is. |
|
|
|
| |
| ▲ | LinXitoW 14 hours ago | parent | prev | next [-] | | I don't think any serious dev has claimed 10x as a general statement. Obviously, no true scotsman and all that, so even my statement about makers of anecdotal statements is anecdotal. Even as a slight fan, I'd never claim more than 10-20% all together. I could maybe see 5x for some specific typing heavy usages. Like adding a basic CRUD stuff for a basic entity into an already existing Spring app. | |
| ▲ | godelski a day ago | parent | prev | next [-] | | > I'd just like to see a live coding session from one of these 10x AI devs
I'd also like to see how it compares to their coding without AI. I mean I really need to understand what the "x" is in 10x. If their x is <0.1 then who gives a shit. But if their x is >2 then holy fuck I want to know. Who doesn't want to be faster? But it's not like x is the same for everybody. | |
| ▲ | Bridged7756 13 hours ago | parent | next [-] | | I'm really dubious of such claims. Even if true, I think they're not seeing the whole picture. Sure, I could churn out code 10x as fast, but I still have to review it. I still have to think of the implementation. I still have to think of the test cases and write them. Now, adding the prerequisites for LLMs, I have to word this in a way the AI can understand, which is extra mental load. I have to review code sometimes multiple times if it gets something wrong, and I have to re-generate, or make corrections, or sometimes end up fixing entire sections it generated, when I decide it just won't get a task right. Overall, while time is saved on typing and (sometimes) researching dependency docs, I still face the same cognitive load as ever, if not more, due to having extra code to review and having to think about prompting. I'm still limited by the same thing at the end of the day: my mental energy.
I can write the code myself and it's, if anything, only a bit slower. I still need to know my dependencies, I still need to know my codebase and all its gripes, even if the AI generates code correctly. Overall, the net complexity of my codebase is the same, and I don't buy the hype, also because I've never heard stories about reducing complexity (refactoring), only about generating code and patching codebases with tests and comments/docs (bad practice imo; the shallow docs generated are unlikely to say anything more than what the code already makes evident). Anyways, I'm not a believer; I only use LLMs for scaffolding and rote tasks. | |
| ▲ | zo1 16 hours ago | parent | prev | next [-] | | I'm a "backend" dev, so you could say that I am very unfamiliar with frontend development, with mostly basic, high-level knowledge of it. Getting this thing to spit out screens and components and adjust them as I see fit has got to be some sort of super-power and definitely 20x'd my frontend development for hobby projects. Before this, my team was giving me wild "1 week" estimates to code simple CRUD screens (plus 1 week for "api integration") and those estimates always smelled funny to me. Now that I've seen what the AI/agents can do, those estimates definitely reek, and the frontend "senior" javascript dev's days are numbered. Especially for CRUD screens, which, let's face it, make up most screens these days and should absolutely be churned out as on an assembly line instead of being delicate "hand crafted" precious works of art that allow 0.1x devs to waste our time because they are the only ones who supposedly know the ancient and arcane "npm install, npm etc, npm angular component create" spells. Look at the recent Tailwind team layoffs; they're definitely seeing the impact of this, as are many team-leads and managers in most companies in our industry. Especially "javascript senior dev" heavy shops in the VC space, which many people are realizing they have an over-abundance of, because those devs bullshitted entire teams and companies into thinking simple CRUD screens take weeks to develop. It was like a giant cartel, with them all padding and confirming the other "engineers'" estimates and essentially slow-devving their own screens to validate the ridiculous padding. | |
| ▲ | Bridged7756 13 hours ago | parent | next [-] | | Your UIs are likely still ass. Pre-made websites/designs were always a thing; in fact, it's (at least to me) common to just copy the design of another site as "inspiration". When you have 0 knowledge of design everything looks the greatest; it's something you kind of have to get a feel for. Frontend engineers do more than just churn out code. They still have to do proper tests using Cypress/Playwright, deal with performance, a11y/accessibility, component tests (if any), deal with frontend observability (more complex than backend, by virtue of the different clients and conditions the code runs on), deal with dependencies (in large places it's all in-house libraries, or there are private repos to maintain), deal with CI/CD, etc. I'm probably missing more. Tailwind's layoffs were due to AI cannibalizing their business model by reducing traffic to the site. And what makes you think the backend is safe? As if churning out endpoints and services, or whatever gospel by some thought leader, would be harder for an AI to do. The frontend has one core benefit: it's pretty varied, and it's an ever-moving field, mostly due to changes in browsers, and also due to the "JS culture". Frontend code from 5 years ago is outdated, but Spring code from 5 years ago is still valid. | | |
| ▲ | tjr 12 hours ago | parent [-] | | My time spent with Javascript applications has thus far been pretty brief (working on some aircraft cabin interfaces for a while), but a lot of the time ended up being on testing on numerous different types and sizes of devices, and making tiny tweaks to the CSS to account for as many devices as possible. This has been a while; perhaps the latest frameworks account for all of that better than they used to. But at that time, I could absolutely see budgeting several days to do what seems like a few hours of work, because of all of the testing and revision. |
| |
| ▲ | vitaflo 15 hours ago | parent | prev | next [-] | | One of the more ignorant comments I’ve read on HN. | |
| ▲ | godelski 5 hours ago | parent | prev | next [-] | | It's difficult for me to make a good evaluation of this comment. With the AI writing the UI, are you still getting the feedback loop where the UI informs your backend design and your backend design informs the UI design? If you don't have that feedback loop, then you're becoming a worse backend designer. A good backend still needs to be frontend-focused. I mean, you don't just optimize the routines your profiler flags; you prioritize the routines that are used the most. You design routines that make things easier for people based on how they're using the frontend. And so on. But how I read your comment is that there's no feedback loop here, and given my experience with LLMs, they're just going to do exactly what you tell them to. Hamfisting a solution. I mean, if you need a mockup design or just a shitty version, then yeah, that's probably fine. But I also don't see how that is 20x, since you could probably just "copy-paste from stack overflow", and I'd wager an LLM is really giving you at most 2x there. But if you're designing something actual people (customers) are going to use, then it sounds like you're very likely making bad interfaces and slowing down development. It is really difficult to determine which is happening here. Yeah, there are a lot of dumb coders everywhere, and it's no secret that coding bootcamps focus on frontends, but I think you're over-generalizing here. | |
| ▲ | politician 9 hours ago | parent | prev [-] | | Other people are dumping on you, but I think you're getting at where the real 20x speedup exists. People who are 'senior' in one type of programming may be 'junior' in other areas -- LLMs can and do bridge those gaps for folks trying to work outside their expertise. This effect is real. If you're an expert in a field, LLMs might just provide a 2-3x speedup as boilerplate generators. |
| |
| ▲ | llmslave2 20 hours ago | parent | prev [-] | | Yeah this is the key point. Part of me wonders if it's just 0.1x devs somehow reaching 1.0x productivity... | | |
| ▲ | bonesss 20 hours ago | parent [-] | | Also consider the terrible codebases and orgs that are out there… the amount of churn a bad JavaScript solution with eight frontend frameworks might necessitate and the way tight systems code works are very different. | | |
| ▲ | nosianu 17 hours ago | parent | next [-] | | This has nothing to do with JS! I wish that idea would die. https://news.ycombinator.com/item?id=18442941 It's not just about them (link, Oracle), there is terrible code all over the place. Games, business software, everything. It has nothing to do with the language! Anyone who claims that may be part of the problem, since they don't understand the problem and concentrate on superficial things. Also, what looks terrible may not be so. I once had to work on an in-house JS app (for internal cost reporting and control). It used two GUI frameworks - because they had started switching to another one, but then stopped the transition. Sounds bad, yes? But, I worked on the code of the company I linked above, and that "terrible" JS app was easy mode all the way! Even if it used two GUI frameworks at once, understanding the code, adding new features, debugging, everything was still very easy and doable with just half a brain active. I never had to ask my predecessor anything either, everything was clear with one look at the code. Because everything was well isolated and modular, among other things. Making changes did not affect other places in unexpected ways (as is common in biology). I found some enlightenment - what seems to be very bad at first glance may not actually matter nearly as much as deeper things. | |
| ▲ | Bridged7756 13 hours ago | parent | prev [-] | | Speaking from ignorance, from ego, or both? There are only three major players: React, Vue, and Angular. Angular is batteries-included; the other two have their lib ecosystems, and if not, you can easily wrap regular JS libs. That's about it. The JS ecosystem sees many newcomers, so it's only natural that some codebases were written poorly or that the FOTM mentality gets a lot of steam, against proper engineering principles. Anecdotally, the worst code I've ever seen was in a PHP codebase, which to me would be the predecessor of JavaScript in this regard, bolstering many junior programmers maintaining legacy (or writing greenfield) systems due to cheap businesses being cheap. Anyway: files thousands of lines long, with broken indentation and newlines, JS and CSS interspersed here and there. Truly madness, but that's another story. The point is JavaScript is JavaScript, and other fields like systems and backend, mainly backend, act conceited and talk about JS as if it were the devil, when things like C++ and Java aren't exactly known for having pretty codebases. |
|
|
| |
| ▲ | a day ago | parent | prev | next [-] | | [deleted] | |
| ▲ | topocite 16 hours ago | parent | prev | next [-] | | Obviously, there has to be huge variability between people based on initial starting conditions. It is like if someone says they are losing weight eating 2500 calories a day and someone else says that is impossible because they started eating 2500 calories and gained weight. Neither are making anything up or being untruthful. What is strange to me is that smart people can't see something this obvious. | |
| ▲ | tonyedgecombe 18 hours ago | parent | prev | next [-] | | > I want to get stuff done 10x as fast too I don’t. I mean I like being productive but by doing the right thing rather than churning out ten times as much code. | |
| ▲ | neal_jones a day ago | parent | prev | next [-] | | I’d really like to see a 10x ai dev vs a 10x analog dev | | |
| ▲ | rootnod3 a day ago | parent [-] | | And an added "6 months" later to see which delivered result didn't blow up in their face down the road. |
| |
| ▲ | lifetimerubyist a day ago | parent | prev [-] | | Theo the YouTuber who also runs T3.chat always makes videos about how great coding agents are and he’ll try to do something on stream and it ALWAYS fails massively and he’s always like “well it wasn’t like this when I did it earlier.” Sure buddy. | | |
| ▲ | llmslave2 a day ago | parent [-] | | Theo is the type of programmer where you don't care when he boos you, because you know what makes him cheer. |
|
| |
| ▲ | mbesto 16 hours ago | parent | prev | next [-] | | > AI benchmarks suck. Not only do they suck, but it's an essentially an impossible task since there is no frame of reference on what "good code" looks like. | |
| ▲ | 9h1L0509h3R a day ago | parent | prev [-] | | [dead] |
|
|
| ▲ | zmmmmm a day ago | parent | prev | next [-] |
Many of them are also burning through absurd numbers of tokens - like running 10 Claudes at once and leaving them running continuously to "brute force" solutions out. It may work, but it's not really an acceptable workflow for serious development. |
| |
| ▲ | nomel a day ago | parent | next [-] | | > but it's not really an acceptable workflow for serious development. At what cost do you see this as acceptable? For example, how many hours of saved human development time are worth one hour of salary spent on LLM tokens, funded by the developer? And then, what's acceptable if it's funded by the employer? | | |
| ▲ | zmmmmm a day ago | parent [-] | | I guess there are two main concerns I have with it. One is technical - that I don't believe when you are grinding huge amounts of code out with little to no supervision that you can claim to be executing the appropriate amount of engineering oversight on what it is doing. Just like if a junior dev showed up and entirely re-engineered an application over the weekend and presented it back to me I would probably reject it wholesale. My gut feeling is this is creating huge problems longer term with what is coming out of it. The other is I'm concerned that a vast amount of the "cost" is externalised currently. Whatever you are paying for tokens quite likely bears no resemblance to the real cost. Either because the provider is subsidising it, or the environment is. I'm not at all against using LLMs to save work at a reasonable scale. But if it comes back to a single person increasing their productivity by grinding stupendous amounts of non-productive LLM output that is thrown away (you don't care if it sits there all day going around in circles if it eventually finds the right solution) - I think there's a moral responsibility to use the resources better. |
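To make nomel's cost question above concrete, here is a toy break-even sketch; every number in it is hypothetical:

```python
# Toy break-even sketch for the "hours saved vs. token spend" question.
# All numbers are hypothetical placeholders, not real pricing.
dev_cost_per_hour = 100.0    # fully-loaded developer cost, USD
token_cost_per_hour = 25.0   # cost of one hour of heavy agent usage, USD

def hours_saved_to_break_even(token_hours: float) -> float:
    """Developer-hours that must be saved for the token spend to pay for itself."""
    return token_hours * token_cost_per_hour / dev_cost_per_hour

# Under these assumptions, one hour of agent time pays off
# if it saves more than 15 minutes of developer time.
print(hours_saved_to_break_even(1.0))  # 0.25
```

The interesting part of the debate is which costs go into `token_cost_per_hour`: as zmmmmm notes, the sticker price may exclude subsidies and externalities.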
| |
| ▲ | bdangubic a day ago | parent | prev [-] | | we get $1,000/month budget, just about every dev uses it for 5 claude accounts |
|
|
| ▲ | parliament32 10 hours ago | parent | prev | next [-] |
| They remind me so much of that group of people who insist the scammy magnetic bracelets[1] "balance their molecules" or something making them more efficient/balanced/productive/energetic/whatever. They are also impossible to argue with, because "I feel more X" is damn near impossible to disprove. [1] https://en.wikipedia.org/wiki/Power_Balance , https://en.wikipedia.org/wiki/Hologram_bracelet , https://en.wikipedia.org/wiki/Ionized_jewelry |
|
| ▲ | jstummbillig a day ago | parent | prev | next [-] |
We have had the fabled 10x engineer since long before, and independent of, agentic coding. Some people claim it's real; others claim it's not, with much the same conviction. If something that should be so clear-cut is debatable, why would anyone now be able to produce a convincing, discussion-resolving argument for (or against) agentic coding? We don't even manage that for tabs/spaces. The reason neither can be resolved in a forum like this is that coding output is hard to reason about, for various reasons, and people want it to be hard to reason about. I would encourage people to think that the burden of proof always falls on themselves, to themselves. Managing to not be convinced in an online forum (regardless of topic or where you land on the issue) is not hard. |
| |
|
| ▲ | williamcotton a day ago | parent | prev | next [-] |
| I mean, a DSL packed full of features, a full LSP, DAP for step debugging, profiling, etc. https://github.com/williamcotton/webpipe https://github.com/williamcotton/webpipe-lsp https://github.com/williamcotton/webpipe-js Take a look at my GitHub timeline for an idea of how little time this took for a solo dev! Sure, there’s some tech debt but the overall architecture is pretty extensible and organized. And it’s an experiment. I’m having fun! I made my own language with all the tooling others have! I wrote my own blog in my own language! One of us, one of us, one of us… |
| |
|
| ▲ | dude250711 a day ago | parent | prev | next [-] |
Ah, the "then you are doing it wrong" defence. Also, you have to learn it right now, because otherwise it will be too late and you will be outdated - even though it is allegedly improving very fast. |
| |
| ▲ | marcosdumay a day ago | parent | next [-] | | TBF, there are lots of tools that work great but most people just can't use. I personally can't use agentic coding, and I'm reasonably convinced the problem is not with me. But it's not something you can completely dismiss. | |
| ▲ | bodge5000 a day ago | parent | prev | next [-] | | > Also, you have to learn it right now, because otherwise it will be too late and you will be outdated, even though it is improving very fast allegedly. This in general is a really weird behaviour that I come across a lot; I can't really explain it. For example, I use Python quite a lot and really like it. There are plenty of people who don't like Python, and I might disagree with them, but I'm not gonna push them to use it ("or else..."), because why would I care? Meanwhile, I'm often told I MUST start using AI ("or else..."), manual programming is dead, etc... Often by people who aren't exactly saying it kindly, which kind of throws out the "I'm just saying it out of concern for you" argument. | | |
| ▲ | andrekandre a day ago | parent | next [-] | | > I MUST start using AI ("or else...") Fear of missing out, and maybe also a bit of religious-esque fervor... tech is weird, we have so many hype cycles: big data, web3, NFTs, blockchain (I once had an acquaintance who quit his job to study blockchain because soon "everything will be built on it"), and now "AI"... all have some usefulness there, but it gets blown out of proportion imo | |
| ▲ | bonesss 19 hours ago | parent [-] | | Nerd circles are in no way immune to fashion, and often contain a strong orthodoxy (IMO driven by cognitive dissonance caused by the humbling complexity of the world). Cargo cults, where people reflexively shout slogans and truisms, even when misapplied. Lots of people who’ve heard a pithy framing waiting for any excuse to hammer it into a conversation for self glorification. Not critical humble thinkers, per se. Hype and trends appeal to young insecure men, it gives them a way to create identity and a sense of belonging. MS and Oracle and the rest are happy to feed into it (cert mills, default examples that assume huge running subscriptions), even as they get eaten up by it on occasion. |
| |
| ▲ | duskdozer 15 hours ago | parent | prev [-] | | Yeah. It sounds like those pitches letting you in on the secret trick to tons of passive income. |
| |
| ▲ | jimbo808 a day ago | parent | prev | next [-] | | That one's my favorite. You can't defend against it, it just shuts down the conversation. Odds are, you aren't doing it wrong. These people are usually suffering from Dunning Kruger at best, or they're paid shills/bots at worst. | | |
| ▲ | neal_jones a day ago | parent [-] | | Best part of being dumb is thinking you’re smart. Best part of being smart is knowing you’re smart. Just don’t be in the iq range where you know you’re dumb. | | |
| |
| ▲ | llmslave2 a day ago | parent | prev | next [-] | | People say it takes at least 6 months to learn how to use LLMs effectively, while at the same time the field is changing rapidly, while at the same time agents were useless until Opus 4.5. Which is it lol. | | |
| ▲ | wakawaka28 a day ago | parent [-] | | I used it with practically zero preparation. If you've got a clue then it's fairly obvious what you need to do. You could focus on meta stuff like finding out what it is good or bad at, but that can be done along the way. |
| |
| ▲ | Terr_ a day ago | parent | prev [-] | | If you had negative results using anything more than 3 days old, then it's your fault, your results mean nothing because they've improved since then. /s |
|
|
| ▲ | munksbeer 19 hours ago | parent | prev | next [-] |
| > The burden of proof is 100% on anyone claiming the productivity gains IMHO, I think this is just going to go away. I was up until recently using copilot in my IDE or the chat interface in my browser and I was severely underwhelmed. Gemini kept generating incorrect code which when pasted didn't compile, and the process was just painful and a brake on productivity. Recently I started using Claude Code cli on their latest opus model. The difference is astounding. I can give you more details on how I am working with this if you like, but for the moment, my main point is that Claude Code cli with access to run the tests, run the apps, edit files, etc has made me pretty excited. And my opinion has now changed because "this is the worst it will be" and I'm already finding it useful. I think within 5 years, we won't even be having this discussion. The use of coding agents will be so prolific and obviously beneficial that the debate will just go away. (all in my humble opinion) |
| |
| ▲ | vitaflo 15 hours ago | parent [-] | | So will all the tech jobs in the US. When it gets that good you can farm it out to some other country for much cheaper. | | |
| ▲ | munksbeer 14 hours ago | parent [-] | | I'm not sure. Possibly? I'm still doing most of my coding by hand, because I haven't yet committed. But even for the stuff I'm doing with claude, I'm still doing a lot of the thought work and steering it to better designs. It requires an experienced dev to understand the better designs, just as it always has. Maybe this eventually changes and the coding agents get as good at that part; I don't know. But I do know it is an enabler for me at the moment, and I have 20+ years of experience writing C++ and then Java in the finance industry. I'm still new to claude, and I am sure I'm going to run up against some walls soon on the more complicated stuff (haven't tried that yet), but everyone ends up working on tasks they don't find that challenging, just lots of manual keypresses to get the code into the IDE. Claude so far is making that a better experience, for me at least. (Example: plumbing in new message types on our bus and wiring in logic to handle it - not complicated, it just sits on top of complicated stuff) |
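The "plumbing in new message types on our bus" task described above is exactly the kind of mechanical wiring agents handle well. A minimal sketch of what such a task usually amounts to, in Python for brevity (the commenter's actual stack is C++/Java) and with all names hypothetical, not from any real bus library:

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical in-process message bus: handlers are registered per message type.
class MessageBus:
    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, msg_type, handler):
        self._handlers[msg_type].append(handler)

    def publish(self, message):
        # Dispatch to every handler registered for this message's type.
        for handler in self._handlers[type(message)]:
            handler(message)

# "Plumbing in a new message type" then boils down to two steps:
# define the type, and register a handler for it.
@dataclass
class OrderFilled:
    order_id: str
    quantity: int

bus = MessageBus()
seen = []
bus.subscribe(OrderFilled, lambda m: seen.append((m.order_id, m.quantity)))
bus.publish(OrderFilled("A1", 100))
print(seen)  # [('A1', 100)]
```

The repetitive part an agent automates is the boilerplate per new type, while the experienced dev still decides what the types and handler boundaries should be.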
|
|
|
| ▲ | bdangubic a day ago | parent | prev | next [-] |
people claiming productivity gains do not have to prove anything to anyone. a few are trying to open the eyes of others, but my guess is that will eventually stop. they will be among the few still left doing this SWE work in the near future :) |
|
| ▲ | antihipocrat 20 hours ago | parent | prev [-] |
Responses are always to check your prompts and make sure you're using frontier models, along with a warning about how quickly you'll be made redundant if you don't lift your game. AI is generally useful, and very useful for certain tasks. It's also not initiating the singularity. |