whynotminot 6 hours ago

It’s also pretty wild to me how people still don’t really even know how to use it.

On Hacker News, a very tech-literate place, I see people thinking modern AI models can't generate working code.

The other day in real life I was talking to a friend of mine about ChatGPT. They didn't know you needed to turn on "thinking" to get higher-quality results. This is a technical person who has worked at Amazon.

You can’t expect revolutionary impact while people are still learning how to even use the thing. We’re so early.

overgard 6 hours ago | parent | next [-]

I don't think "results don't match promises" is the same as "not knowing how to use it". I've been using Claude and OpenAI's latest models for the past two weeks (moving at about 1,000 lines of code a day, which is what I can comfortably review), and they make subtle, hard-to-find mistakes all over the place. Or they misunderstand well-known design patterns, or do something boneheaded. I'm fine with this! But that's because I'm asking for code I could write myself, and I'm actually reading it. This whole "it can build a whole company for me and I don't even look at it!" thing is overhype.

XenophileJKO an hour ago | parent | next [-]

If you know good architecture and you are testing as you go, I would say it is probably pretty damn close to being able to build a company without looking at the code. Not without risk, but definitely doable and plausible.

My current project, which I started this weekend, is a Rust client-server game with the client compiled to WebAssembly.

I do these projects without reading the code at all, as a way to gauge what I can possibly do with AI that way, purely operating as a PM with technical intuition and architectural opinions.

So far Opus 4.6 has been capable of building it all out. I have to catch issues and I have asked it for refactoring analysis to see if it could optimize the file structure/components, but I haven't read the code at all.

At work I certainly read all the code. But I would recommend people try to build something non-trivial without looking at the code. It does take skill, though, so maybe start small and build up an intuition for where these models have issues, etc. I think you'll be surprised how far your technical intuition can scale even when you are not looking at the code.

LunaSea 21 minutes ago | parent [-]

Security auditors and criminals have a bright future ahead of them.

scoopdewoop 5 hours ago | parent | prev | next [-]

Prompting LLMs for code simply takes more than a couple of weeks to learn.

It takes time to get an intuition for the kinds of problems a model has seen in pre-training, what environments it faced in RL, and what bizarre biases and blind spots it has. Learning to Google was hard, learning to use other people's libraries was hard, and this is on par with those skills at least.

If there is a well-known design pattern you know, that's a great thing to shout out. Knowing what to add to the context takes time and taste. If you are asking for pieces so large that you can't trust them, ask for smaller pieces and their composition. It's a force multiplier, and your taste for abstractions as a programmer is one of the factors.

In early Usenet/forum days, the XY problem described users asking for implementation details of their X solution to problem Y, rather than asking how to solve Y. In LLM prompting, people fall into the opposite trap: they have an X implementation they want to see, and rather than ask for it, they describe problem Y and expect the LLM to arrive at the same X solution. Just ask for the implementation you want.

Asking bots to ask bots seems to be yet another skill.

vidarh 3 hours ago | parent | prev [-]

Do you use an agent harness to have it review code for you before you do?

If not, you don't know how to use it efficiently.

A large part of using AI efficiently is to significantly lower that review burden by having it do far more of the verification and cleanup itself before you even look at it.

politelemon 6 hours ago | parent | prev | next [-]

You are assuming that we all work on the same tasks and should have exactly the same experience with it, which is of course far from the truth. It's probably best to start from that base assumption and work through the implications.

As for the last example: for all the money being spent on this area, if someone is expected to perform a workflow based on the kind of question they're supposed to ask, that's a failure in the packaging and discoverability of the product. The leaky abstraction only helps those of us who know why it's there.

harrall 6 hours ago | parent | prev | next [-]

I've been helping normal people at work use AI, and there are two groups that are really struggling:

1. People who only think of using AI in very specific scenarios. They don't know when to use it outside the obvious "to write code" situations, they don't really use AI effectively, and they get deflated when it outputs the occasional garbage. They think, "isn't AI supposed to be good at writing code?"

2. People who let AI do all the thinking. Sometimes they'll use AI to do everything, and you have to tell them to throw it all away because it makes no sense. These people also tend to dump analyses straight from AI into Slack because they lack the tools to verify whether a given analysis is correct.

To be honest, I help them by teaching them fairly rigid workflows like “you can use AI if you are in this specific situation.” I think most people will only pick up tools effectively if there is a clear template. It’s basically on-the-job training.

mrtksn 6 hours ago | parent | prev | next [-]

In a WhatsApp group full of doctors, managers, journalists, and engineers (including software engineers), aged 30-60, I asked if anyone had heard of OpenClaw. Only three people had heard of it, from influencers; none had used it.

But from my social feed, the impression was that it was taking over the world :)

I asked because I have been building something similar for some time, and I thought it was over, that they had been faster than me. But as it appears, there's no real adoption yet. Maybe there will be some once they release it as part of ChatGPT, but even then it looks too early, as few people are actually using the more advanced tools.

It's definitely at a very early stage. It appears that so far the mainstream success of AI is limited to slop generation, and even that is really a small number of people generating huge amounts of slop.

wiseowise 5 hours ago | parent | next [-]

> I asked if anyone had heard of Twitter vaporware. Only three people had heard of it, from influencers; none had used it.

Shocking results, I say!

KellyCriterion 5 hours ago | parent [-]

No, these people ("managers, engineers", etc.) just do not work in tech and IT but in other fields, and they do not read tech news in your country, etc.

Most people are just not "that deep in there" the way most people on HN are.

stackbutterflow an hour ago | parent | next [-]

I spend between one and two hours a day on HN, and I barely know what OpenClaw is. I've seen it mentioned once or twice and checked their website, but that's all.

If one had let AI FOMO drive them since the release of ChatGPT, they'd be glued to their screen 24/7.

wiseowise 3 hours ago | parent | prev [-]

> “Tech news”

A guy attached Claude to his socials. Groundbreaking tech.

KellyCriterion an hour ago | parent [-]

I once worked for a consulting and development company; they were trying to enter sector ABC by staffing up a team of people who, so I was told, had an interest in sector ABC and wanted to do some projects there.

While they were deep into software development in general, none of them read any of the essential daily industry news (including the news related to doing software development in sector ABC).

:-)

So no, even people somehow attached to a topic are not necessarily more deeply involved.

alephnerd 6 hours ago | parent | prev [-]

> I asked because I have been building something similar for some time, and I thought it was over, that they had been faster than me

If you have been working on a use case similar to OpenClaw for some time now, I'd actually say you are in a great position to start raising.

Being first to market is not a significant moat in most cases. Few people want to invest in the first company in a category - it's too risky. If there are a couple of other early players then the risk profile has been reduced.

That said, you NEED to concentrate on GTM - technology is commodified, distribution is not.

> It appears that so far the mainstream success of AI is limited to slop generation, and even that is really a small number of people generating huge amounts of slop

The growth of AI slop has been exponential, but the application of agents to domain-specific use cases has been decently successful.

The biggest reason you don't hear about it on HN is that domain-specific applications are not well known here, and most enterprises are not publicizing the fact that they are using these tools internally.

Furthermore, almost anyone who is shipping something with actual enterprise usage is under fairly onerous NDAs right now and every company has someone monitoring HN like a hawk.

mrtksn 5 hours ago | parent | next [-]

Do you think it's a good idea to release it first on iOS and announce it on HN and Product Hunt? How would you do it?

In my app, the tech is based on running agent-generated code on JavaScriptCore to do things like OpenClaw does. I'm wrapping the JS engine with the missing functionality, like networking, file access, and database access, so I believe I will not have a problem releasing it on the Apple App Store, since I use their native stack. Then, since this stack is also open source, I'm making a version that will run on Linux. The idea is that users develop their solution on their device (iOS and Mac currently), see it working, and then deploy it to a server with a tap of a button, so it keeps running.
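For readers unfamiliar with that pattern: bridging native capabilities into sandboxed, agent-generated JS via JavaScriptCore looks roughly like the sketch below. This is a minimal illustration, not the commenter's actual code; the readFile host function is a hypothetical example of one injected capability.

    import JavaScriptCore

    let context = JSContext()!

    // Hypothetical host function: expose file reading to agent-generated JS.
    // Networking and database access would be bridged the same way.
    let readFile: @convention(block) (String) -> String? = { path in
        try? String(contentsOfFile: path, encoding: .utf8)
    }
    context.setObject(readFile, forKeyedSubscript: "readFile" as NSString)

    // Agent-generated script can now call the injected capability.
    let notes = context.evaluateScript("readFile('/tmp/notes.txt')")
    print(notes?.toString() ?? "undefined")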

alephnerd 5 hours ago | parent [-]

Who's your persona? How are you pricing and packaging? Who is your buyer? Are you D2C? Consumer? Replacing EAs? Replacing Project Managers? ...

You need to answer these questions in order to decide whether a Show HN makes sense versus a much more targeted launch.

If you do not know how to answer these questions, you need to find a cofounder ASAP. Technology is commodified. GTM, sales, and packaging are what turn technology into products. Building, selling, and fundraising as one person is a one-way ticket to burnout, which only makes you and your product less attractive.

I also highly recommend chatting with your network to understand common types of problems. Once you've identified a couple classes of problems and personas for whom your story resonates, then you can decide what approach to take.

Best of luck!

mrtksn 5 hours ago | parent [-]

The persona is someone who knows what they are doing but needs something to actually automate their work routine. E.g., maybe it's a crypto trader who makes decisions by interpreting signals, so they can create a trading bot that executes their method. Maybe it's a compliance officer who needs to automate some routine, like checking details further when certain conditions arise. Or maybe a social media manager who needs to moderate their channels. Maybe someone who needs a tool for monitoring HN in that specific way?

Thanks for the advice! I'm at a stage where I want to have such a tool and see who else wants it. I'm not sure yet about its viability as a business or what the exact market is. Maybe I will find out by putting it into the wild, which is why I'm considering releasing it as a mobile app first.

wongarsu an hour ago | parent [-]

That persona still sounds too generic, too unfocused.

But even with that persona, it should already answer your question of whether posting on HN and Product Hunt should be a core part of your strategy. There are not a lot of social media managers or compliance people around here. And even for crypto traders, there are better places to pitch products to them.

walterbell 5 hours ago | parent | prev [-]

> every company has someone monitoring HN like a hawk.

Monitoring specific user accounts or keywords? Is this typically done by a social media reputation management service?

bigbuppo 5 hours ago | parent | prev | next [-]

And it will get worse once the UX people get ahold of it.

scrubs 5 hours ago | parent [-]

You got that right... imagine AI making more keyboard shortcuts, "helping" Wayland displace X even further, new window transitions, overhauling htmx... it'll be hell+ on earth.

alternatex 4 hours ago | parent [-]

We can indeed only imagine. For now, AI has been a curse for open source projects.

KellyCriterion 5 hours ago | parent | prev | next [-]

A neighbour of mine has a PhD and works in research at a hospital. He is super smart.

Last time we spoke, he said: "Yes, yes, I know about ChatGPT, but I do not use it at work or at home."

Therefore, most people won't even know about Gemini, Grok, or Claude.

tstrimple 6 hours ago | parent | prev | next [-]

> On Hacker News, a very tech-literate place

I think this is the prior you should investigate. That may be what HN used to be, but it hasn't been an active reality for a long time. You can still see actual expert opinions on HN, but they are more and more in the minority.

alephnerd 5 hours ago | parent [-]

I think one longtime HN user (Karrot_Kream, I think) pinpointed the change in HN discourse to sometime between mid-2022 and early 2023, when the rate of new users spiked to 40k per month and remained at that elevated level.

From personal experience, I've also noticed that some of the most toxic discourse and responses I've received on this platform are overwhelmingly from post-2022 users.

slopinthebag 6 hours ago | parent | prev [-]

> I see people thinking modern AI models can’t generate working code.

Really? Can you show any examples of someone claiming AI models cannot generate working code? I haven't seen anyone make that claim in years, even from the most skeptical critics.

IshKebab 2 minutes ago | parent | next [-]

I'll claim it. They can't generate working code for the things I am working on. The problems seem to be too complex, or the languages too niche.

They can do a tolerable job with super popular/simple things like web dev and Python. It really depends on what you're doing.

autoexec 5 hours ago | parent | prev | next [-]

I've seen it said plenty of times that the code might work eventually (after several cycles of prompting and testing), but even then the code you get might not be something you'd want to maintain, and it might contain bugs and security issues that don't (at least initially) seem to impact its ability to do whatever it was written to do, but which could cause problems later.

slopinthebag 5 hours ago | parent [-]

Yeah but that's a completely different thing.

zelphirkalt an hour ago | parent | prev | next [-]

Depends what they mean. Generate working code all the time, or after a few iterations of trying and prompting? It can very easily happen that an LLM generates something that is a straight error because it hallucinates some keyword argument, or something like that, which doesn't actually exist. That just happened to me yesterday. So going from that: no, they are still not able to generate working code all the time. Especially when the basis is a shoddily made library that is itself simply missing something required.

KellyCriterion 5 hours ago | parent | prev | next [-]

Scroll up a few comments, where someone said Claude is generating errors over and over again and that Claude can't work according to code guidelines, etc. :-))

dangus 5 hours ago | parent | prev [-]

And really, the problem isn't that it can't make working code; the problem is that it'll never have the kind of context that is in your brain.

Today I started working on a project I hadn't touched in a while but now needed to, as it was involved in an incident and I had to address some shortcomings. I knew the fix I needed to make, but I went about my usual AI-assisted workflow because, of course, I'm lazy; the last thing I want to do is interrupt my normal work to fix this stupid problem.

The AI doesn’t know anything about the full scope of all the things in my head about my company’s environment and the information I need to convey to it. I can give it a lot of instructions but it’s impossible to write out everything in my head across multiple systems.

The AI did write working code, but despite writing the code way faster than me, it made small but critical mistakes that I wouldn’t have made on my first draft.

For example, it added a command flag that I knew it didn't need, and it probably should have known that too. Basically, it changed a line of code it didn't need to touch.

It also didn't realize that the curled URL was going to redirect, so we needed curl's -L flag to follow it. Maybe it should have known, but my brain knew it already.

It also misinterpreted some changes in direction that a human never would have. It confused my local repository with the remote one, because I originally thought I was going to set up a mirror, but I changed plans and used a manual package upload to curl from. So it put the remote URL in some places where the local one should have been.

Finally, it seems to have created some strange text gore while editing the README, deleting existing content for seemingly no reason other than some kind of readline snafu.

So yes, it produced great code very quickly, code that would have taken me way longer to write, but I had to go back and spend a similar amount of time fixing so many things that I might as well have just done it manually.

But hey, I'm glad my company is paying $XX/month for my lazy workday machine.

KellyCriterion 5 hours ago | parent [-]

> The AI doesn't know anything about the full scope of all the things in my head about my company's environment and the information I need to convey to it.

This is your problem: how should it know if you do not provide it?

Use Claude: in the Pro version you can attach files to each project to set the context. These can be documents, source code, SQL scripts, screenshots, whatever; the output will then be based on the context those files provide.