GolDDranks · 21 hours ago

Two problems:

1) Writing a high-performance memory allocator for a game engine in Rust: https://github.com/golddranks/bang/tree/main/libs/arena/src (Still a work in progress, so it's in a bit of a messy state.) The LLM didn't seem to understand the design I had in mind and/or the requirements; it went off on tangents and started changing the design. In the end, I coded the main part myself and used the LLM for writing tests, with some success. I had to remove tons of inane comments that didn't provide any explanatory value.

2) Trying to fix a Django ORM expression that generates suboptimal and incorrect SQL. It constantly changed its opinion about whether something was even possible or supported by Django, apologized when I pointed out mistakes / bugs / hallucinations, but then proceeded not to internalize the implications of said mistakes.

I used the Zed editor with its recently published agentic features. I tried to prompt it with a chat-style discussion, but it often made bigger edits than I would have liked, and failed to share a high-level plan in advance, something I often requested.

My biggest frustrations were not coding problems per se, but the general inability to follow instructions and see implications, and the lack of awareness to step back and ask for confirmation or better directions in the "hold on, something's not right" kind of moments. Also, it generally follows through with "thanks for pointing that out, you are absolutely right!" even when you are NOT right. That yes-man style seriously erodes trust in the output.
BeetleB · 20 hours ago

Thanks for the concrete examples! These seem to be more sophisticated than the cases I use them for. Mostly I'm using LLMs for tedious, simpler routine code (processing all files in a directory in a certain way, outputting them in a similar tree structure, with changes to filenames, logging, etc.). Your Django ORM usage may be more complicated than mine. I haven't tried it much with Django (I'm still reluctant to use it with production code), but a coworker did use it on our code base and it found good optimizations for some of our inefficient ORM usage. He learned new Django features as a result (new to him, that is).

> I tried to prompt it with a chat-style discussion, but it often made bigger edits than I would have liked, and failed to share a high-level plan in advance, something I often requested.

With Aider, I often use /ask to do a pure chat (no agents). It gives me a big-picture overview and the proposed code changes. If I like it, I simply say "Go ahead". Or I refine with corrections, and when it gets it right, I say "Go ahead". So far it has rarely changed code beyond what I want, and the few times it did, the change turned out to be a good idea. Also, with Aider, you can limit the context to a fixed set of files. That doesn't stop it from changing other things within those files, but as I said, that's rarely a problem for me. (A rough example session is sketched at the end of this comment.)

One thing to keep in mind: it's better to view the LLM not as an extension of yourself, but more like a coworker whose changes you're reviewing. If you have a certain vision/design in mind, don't expect it to follow it all the way down to low-level details, just as a coworker will sometimes deviate.

> My biggest frustrations were not coding problems per se, but the general inability to follow instructions and see implications, and the lack of awareness to step back and ask for confirmation or better directions in the "hold on, something's not right" kind of moments.

You have to explicitly tell it to ask questions (and some models ask great questions; not sure about Sonnet 3.7). Read this page: https://harper.blog/2025/02/16/my-llm-codegen-workflow-atm/

I don't follow much of what's in his post, but the first part, where you specify what you want and have it ask you questions, has always been useful! He's talking about big changes, but I sometimes have it ask questions even for minor changes. I just add to my prompts: "Ask me something if it seems ambiguous."
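To make the /ask flow concrete, a typical session looks roughly like this (the file names and the model's output are illustrative, not a real transcript):

```
$ aider tasks.py files.py        # context limited to these two files
> /ask How would you add per-file logging to process_tree()?
  ...model explains its plan and shows the proposed changes; no edits are made...
> Go ahead
  ...aider applies the edit, touching only the files added above...
```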
GolDDranks · 18 hours ago

Re: prompting it to ask questions: thanks, I'll try that. And I'm gonna try version 4 as soon as I can.
drcongo · 14 hours ago

I've been using Claude 3.7 in Zed for a while, and I've found that I've been getting better at asking it to do things (including a lot of Django ORM stuff). For each project I work on, I now have a `.context.md` that gives a decent outline of the project and also lists things I specifically don't want it to do, like creating migrations or installing packages. Then, with the actual prompting, I tend to ask it to plan things first, and to stop and ask me if it thinks I've missed out any detail. I've been pretty impressed with how right it gets things with this setup.

And a tiny tip, just in case you've never noticed it: there's a little + button just above the prompt input in Zed that lets you add files to the context. That's where I add the `.context.md` whenever I start work on something.
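A minimal sketch of what such a `.context.md` can look like (the project details here are placeholders; the rules mirror the ones mentioned above):

```markdown
# Project context

Django app, PostgreSQL, pytest for tests. Source lives under src/.

## Rules for the assistant

- Plan first, and stop and ask me if any detail seems to be missing.
- Do NOT create migrations.
- Do NOT install packages.
```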
jjmarr · 15 hours ago

Try Roo Code in Orchestrator mode or Cline in Plan mode. It will do tons of requirements analysis before starting work.
sponnath · a day ago

I feel like the opposite is true, but maybe the issue is that we both live in separate bubbles. Often I see people on X and elsewhere making wild claims about the capabilities of AI, and rarely do they link to the actual output.

That said, I agree that AI has been amazing for fairly closed-ended problems like writing a basic script or even writing scaffolding for tests (it's about 90% effective at producing tests I'd consider good, assuming you give it enough context). Greenfield projects have been more of a miss than a hit for me. It starts out well, but if you don't do a good job of directing the architecture, it can go off the rails pretty quickly. In a lot of cases I find it faster to write the code myself.
osn9363739 · a day ago

I'm in the same bubble. I find that when they do link to it, it's some basic, unimpressive demo app. That said, I want to see a video of one of these people who apparently 10x'd their programming going up against a dev without AI across various scenarios. I just think it would be interesting to watch if they had a similar base skill and understanding of things.
BeetleB · a day ago

> That said, I want to see a video of one of these people who apparently 10x'd their programming going up against a dev without AI across various scenarios.

It would be interesting, but do understand that if AI coding is totally fantastic in one domain (basic automation scripting) and totally crappy in another (existing, complex codebases), it's still a significant improvement over the pre-AI days.

Concrete example: A few days ago I had an AI model write me a basic MCP tool for creating a Jira story. Within 15 minutes, it had written the API function for me; I manually wrapped it to make it an MCP tool, tested it, created tens of stories from a predefined list, and verified it worked.

Now, if you already know the Jira APIs (endpoints, auth, etc.), you could do it with similar speed. But I didn't. Just finding the docs would have taken me longer. Code quality is fine; this is not production code, it's just for me. Yes, there are other Jira MCP libraries already. It was quicker for me to write my own than to figure out the existing ones (ditto for GitHub MCP).

When using LLMs to solve a coding problem is faster than using Google/SO/official docs/existing libraries, that's clearly a win. Would I do it this way for production code? No. Does that mean it's bad? No.
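For a sense of scale, the helper in question is only a handful of lines. A minimal sketch of that kind of function, assuming Jira Cloud's REST API v2 (the base URL, project key, and environment variable names are placeholders, not the actual setup described above):

```python
import os
import requests

JIRA_BASE = "https://your-company.atlassian.net"  # hypothetical instance

def create_story(summary: str, description: str, project_key: str = "PROJ") -> str:
    """Create a Jira story and return its issue key, e.g. "PROJ-123"."""
    resp = requests.post(
        f"{JIRA_BASE}/rest/api/2/issue",
        # Jira Cloud uses basic auth with an account email + API token.
        auth=(os.environ["JIRA_EMAIL"], os.environ["JIRA_API_TOKEN"]),
        json={
            "fields": {
                "project": {"key": project_key},
                "summary": summary,
                "description": description,
                "issuetype": {"name": "Story"},
            }
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["key"]
```

Wrapping a function like this as an MCP tool is then mostly boilerplate around its signature.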
fumeux_fume · a day ago

Aside from the fact that you seem to be demanding a lot from someone who's informally sharing their experience online, I think the effectiveness really depends on what you're writing code for. With straightforward use cases that have ample documented examples, you can generally expect decent or even excellent results. However, the more novel the task or the more esoteric the software library, the likelier you are to encounter issues and feel dissatisfied with the outcomes. Additionally, some people are simply pickier about code quality and won't accept suboptimal results.

Where I work, I regularly encounter wildly enthusiastic opinions about GenAI that lack any supporting evidence. Dissenting from the mainstream belief that AI is transforming every industry is treated as heresy, so such skepticism is best kept close to the chest, or better yet, completely to oneself.
BeetleB · a day ago

> Aside from the fact that you seem to be demanding a lot from someone who's informally sharing their experience online

Looking at isolated comments, you are right. My point was that it's a trend. I don't expect everyone to go into details, but I notice almost none do. Even what you pointed out ("great for some things, crappy for others") has much higher entropy.

Consider this: if every C++-related submission had comments that said the equivalent of "After using C++ for a few weeks, my verdict is that its performance capabilities are unimpressive", and then didn't go into any details about what made them think that, I think you'd find my analogous criticism of such comments fair.
raincole · a day ago

Yeah, that's obvious. It's even worse for blog posts. Pro-LLM posts usually come with whole working toy apps and the prompts that were used to generate them. Anti-LLM posts are usually about logic puzzles with twists. Anyway, that's the Internet for you. People will say, with a straight face, that LLMs have plateaued since 2022.
csomar · 19 hours ago

Maybe you are not reading what we are writing :) Here is an article of mine: https://omarabid.com/gpt3-now

> But for certain use cases (e.g. simple script, written from scratch), it's absolutely fantastic.

I agree with that. I've found it to be very useful for "yarn run xxx" scripts. It can automate lots of tasks that I wouldn't have bothered with previously, because the cost of coding the automation versus doing them manually didn't add up.
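The kind of scripts meant here are small one-off helpers wired into package.json (the script names and files below are hypothetical, purely for illustration):

```json
{
  "scripts": {
    "fix:headers": "node scripts/add-license-headers.js",
    "report:deps": "node scripts/unused-deps-report.js"
  }
}
```

Each one is the sort of task that was never worth hand-writing before, but is cheap to generate now.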
whimsicalism · 16 hours ago

I think these developments produce job/economic anxiety, so a certain percentage of people react this way. The percentage is even higher on Reddit, where there is a lot of job anxiety.
jiggawatts · a day ago

Reminds me of the early days of endless "ChatGPT can't do X" comments, where the commenter was invariably using 3.5 Turbo instead of 4, which was available to paying users only. "Humans are much lazier than AIs" was my takeaway lesson from that.