| ▲ | aurareturn 8 hours ago |
| These are the perfect size projects vibe coding is currently good for.
So far... it's going to keep getting better, to the point where all software is written this way. |
|
| ▲ | HarHarVeryFunny 7 hours ago | parent | next [-] |
| Sure, but that's basically the same as saying that we'll have human-equivalent AI one day (let's not call it AGI, since that means something different to everyone that uses it), and then everything that humans can do could then be done by AI (whether or not it will be, is another question). So, yes, ONE DAY, AI will be doing all sorts of things (from POTUS and CEO on down), once it is capable of on-the-job learning and picking up new skills, and everything else that isn't just language model + agent + RAG. In the meantime, the core competence of an LLM is blinkers-on (context-on) executing - coding - according to tasks (part of some plan) assigned to it by a human who, just like a lead assigning tasks to human team members, is aware of what it can and cannot do, and is capable of overseeing the project. |
|
| ▲ | __MatrixMan__ 7 hours ago | parent | prev | next [-] |
| It seems like it's approaching a horizontal asymptote to me, or is at the very least concave down. You might be describing a state 50 years from now. |
| |
| ▲ | aurareturn 5 hours ago | parent | next [-] | | It seems like progress is accelerating, not slowing down. ARC AGI 2: https://x.com/poetiq_ai/status/2003546910427361402 METR: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com... | | |
| ▲ | __MatrixMan__ 4 hours ago | parent [-] | | Improved benchmarks are undeniably an improvement, but the bottleneck isn't the models anymore, it's the context engineering necessary to harness them. The more time and effort we put into our benchmarking systems the better we're able to differentiate between models, but then when you take an allegedly smart one and try to do something real with it, it behaves like a dumb one again because you haven't put as much work into the harness for the actual task you've asked it to do as you did into the benchmark suite. The knowledge necessary to do real work with these things is still mostly locked up in the humans that have traditionally done that work. |
| |
| ▲ | anthonypasq 7 hours ago | parent | prev [-] | | sonnet 3.7 was released 10 months ago! (the first model truly capable of any sort of reasonable agentic coding at all) and opus 4.5 exists today. | | |
| ▲ | rabf 6 hours ago | parent [-] | | To add to this: the tooling or `harness` around the models has vastly improved as well. You can get far better results with older or smaller models today than you could 10 months ago. |
|
|
|
| ▲ | croes 7 hours ago | parent | prev | next [-] |
| Successfully building an IKEA shelf doesn’t make you a carpenter. |
| |
| ▲ | exe34 6 hours ago | parent [-] | | no, but I have furniture. it's important to keep sight of the end goal, unless the carpentry is purely a hobby. | | |
| ▲ | whattheheckheck 5 hours ago | parent [-] | | What's the job title and education requirements for designing the supply chain and engineering of the ikea furniture? | | |
|
|
|
| ▲ | rvz 7 hours ago | parent | prev | next [-] |
| Air traffic control software is not going to be vibe-coded anytime soon and neither is the firmware controlling the plane. |
| |
| ▲ | aurareturn 5 hours ago | parent | next [-] | | Sure it will. But they will be tested far more stringently by both human experts and the smartest LLM models. | |
| ▲ | A4ET8a8uTh0_v2 7 hours ago | parent | prev | next [-] | | I will be perfectly honest. Given what I am seeing, I fully expect someone to actually try just that. | |
| ▲ | blks 3 hours ago | parent | prev [-] | | Considering how much work at Boeing is given to consultants and other third party contractors (e.g. the famous MCAS), some piece of work, after moving through the bowels of multiple subcontractors, will end up in the hands of an under-qualified developer who will ask his favourite slop machine to generate code whose purpose he doesn’t exactly understand. |
|
|
| ▲ | spzb 7 hours ago | parent | prev [-] |
| I've got a bridge to sell you |