FeepingCreature 2 days ago

> How are you qualified to judge its performance on real code if you don't know how to write a hello world?

The ultimate test of all software is "run it and see if it's useful for you." You do not need to be a programmer at all to be qualified to test this.
LucaMo 2 days ago
What I think people get wrong (especially non-coders) is that they believe the limitation of LLMs is building a complex algorithm. That problem was actually solved a long time ago. The real issue is building a product: think of microservices spread across different projects, or APIs that are imperfectly documented or whose documentation is massive.

Honestly, I don't know what commenters on Hacker News are building, but a few months back I was hoping to use AI to build the interaction layer with Stripe: handling multiple products and delayed cancellations via subscription schedules. Everything is documented; the documentation is a bit scattered across pages, but the information is out there. The model at the time was Opus 4.1, so I used that. After several prompts it had written 1000 lines of non-functional code with zero reusability. I then asked ChatGPT whether it was possible without using schedules; it told me yes (even though it isn't), and when I told Claude to recode it accordingly, it started coding against APIs that don't exist. I built everything myself, functional and reusable, in approximately 300 lines of code.

The above is a software engineering problem. Reimplementing a JSON parser with Opus is neither fun nor useful, so that should not be used as the metric.
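For what it's worth, the "delayed cancellation via subscription schedules" piece is small once you find the right calls. Here is a minimal sketch using Stripe's Python library, assuming a single-phase subscription that should cancel at the end of its current billing period (the API key and subscription ID are placeholders):

```python
import stripe

stripe.api_key = "sk_test_..."  # placeholder test key

# Wrap the live subscription in a schedule so its end can be controlled.
schedule = stripe.SubscriptionSchedule.create(
    from_subscription="sub_123"  # placeholder subscription ID
)

current = schedule.phases[0]  # the phase mirroring the live subscription

# Keep the current phase unchanged, but cancel the subscription when the
# phase ends, i.e. at the end of the current billing period.
stripe.SubscriptionSchedule.modify(
    schedule.id,
    end_behavior="cancel",
    phases=[
        {
            "items": [
                {"price": item.price, "quantity": item.quantity}
                for item in current["items"]
            ],
            "start_date": current.start_date,
            "end_date": current.end_date,
        }
    ],
)
```

The plain `cancel_at_period_end=True` flag on the subscription covers the trivial case; schedules are what you reach for when multiple products or phased changes are in play, which seems to be the situation described above.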