Andrej Karpathy talks about "Claws" (simonwillison.net)

114 points by helloplanets 3 hours ago | 152 comments

▲ ggrab 2 hours ago | parent | next [-]

IMO the security pitchforking on OpenClaw is just so overdone. People without consideration for the implications will inevitably get burned, as we saw with the reddit posts "Agentic Coding tool X wiped my hard drive and apologized profusely". I work at a FAANG and every time you try something innovative the "policy people" will climb out of their holes and put random roadblocks in your way, not for the sake of actual security (that would be fine but would require actual engagement) but just to feel important, it reminds me of that.

▲

throwaway_z0om an hour ago | parent | next [-]

> the "policy people" will climb out of their holes

I am one of those people and I work at a FANG.

And while I know it seems annoying, these teams are overwhelmed with not only innovators but lawyers asking so many variations of the same question it's pretty hard to get back to the innovators with a thumbs up or guidance.

Also there is a real threat here. The "wiped my hard drive" story is annoying but it's a toy problem. An agent with database access exfiltrating customer PII to a model endpoint is a horrific outcome for impacted customers and everyone in the blast radius.

That's the kind of thing keeping us up at night, not blocking people for fun.

I'm actively trying to find a way we can unlock innovators to move quickly at scale, but it's a bit of a slow down to go fast moment. The goal isn't roadblocks, it's guardrails that let you move without the policy team being a bottleneck on every request.

▲

madeofpalk 6 minutes ago | parent | next [-]

I know it’s what the security folk think about, but exfiltrating to a model endpoint is the least of my concerns.

I work on commercial OSS. My fear is that it’s exfiltrated to public issues or code. It helpfully commits secrets or other BS like that. And that’s even ignoring prompt injection attacks from the public.

▲

mikkupikku an hour ago | parent | prev | next [-]

I am sure there are many good corporate security policy people doing important work. But then there are people like this:

I get handed an application developed by my company for use by partner companies. It's a java application, shipped as a jar, nothing special. It gets signed by our company, but anybody with the wherewithal can pull the jar apart and mod the application however they wish. One of the partner companies has already done so, extensively, and come back to show us their work. Management at my company is impressed and asks me to add official plugin support to the application. Can you guess where this is going?

I add the plugin support, so the application will now load custom jars that implement the plugin interface I had discussed with the devs from the partner company that did the modding. They think it's great, management thinks it's great, everything works and everybody is happy. At the last minute some security policy wonk throws on the brakes. Will this load any plugin jar? Yes. Not good! It needs to only load plugins approved by the company. Why? Because! Never mind that the whole damn application can be unofficially modded with ease. I ask him how he wants that done, he says only load plugins signed by the company. Retarded, but fine. I do so. He approves it, then the partner company engineer who did the modding chimes in that he's just going to mod the signature check out, because he doesn't want to have to deal with this shit. Security asshat from my company has a meltdown and, long story short, the entire plugin feature, which was already complete, gets scrapped and the partner company just keeps modding the application as before. Months of my life down the drain. Thanks guys, great job protecting... something.

▲

embedding-shape 30 minutes ago | parent [-]

So why were these people not involved in the first place? Seems like a huge management/executive failure that the right people who need to sign off on the design weren't involved until after developers implemented the feature.

You seem to blame the person who is trying to save the company from security issues, rather than placing the blame on your boss, who made you do work that would never have gotten approved in the first place if they had just checked with the right person first?

▲

mikkupikku 24 minutes ago | parent [-]

Because they don't respond to their emails until months after they were nominally brought into the loop. They sit back jerking their dicks all day, voicing no complaints and giving no feedback until the thing is actually done.

Yes, management was ultimately at fault. They're at fault for not tard wrangling the security guys into doing their jobs up front. They're also at fault for not tard wrangling the security guys when they object to an inherently modifiable application being modified.

▲

embedding-shape 22 minutes ago | parent [-]

Again, sounds like a management failure. Why isn't your boss talking with their boss and asking what the fuck is going on, and putting the development on hold until it's been agreed on? Again, your boss is the one who is wasting your time. They are the one responsible for making sure that what you spend your time on is actually useful and valuable, which they clearly messed up in that case.

▲

mikkupikku 14 minutes ago | parent [-]

As I already said, management ultimately is the root of the blame. But what you don't seem to get is that at least some of their blame is from hiring dumbasses into that security review role. Why did the security team initially give the okay to checking signatures on plugin jars? They're supposed to be security experts, what kind of security expert doesn't know that a signature check like that could be modded out? I knew it when I implemented it, and the modder at the partner corp obviously knew it but lacked the tact to stay quiet about it. Management didn't realize it, but they aren't technical. So why didn't security realize it until it was brought to their attention? Because they were retarded.

By the way, this application is still publicly downloadable, still easily modded, and hasn't been updated in almost 10 years now. Security review is fine with that, apparently. They only get bent out of shape when somebody actually tries to make something more useful, not when old nominally vulnerable software is left to rot in public. They're not protecting the company from a damn thing.

▲

Myrmornis 31 minutes ago | parent | prev [-]

The main problem with many IT and security people at many tech companies is that they communicate in a way that betrays their belief that they are superior to their colleagues.

"unlock innovators" is a very mild example; perhaps you shouldn't be a jailor in your metaphors?

▲

criley2 25 minutes ago | parent [-]

I find it interesting that you latched onto their jailor metaphor, but had nothing to say about their core goal: protecting my privacy. I'm okay with the people in charge of building on top of my private information being jailed by very strict, mean sounding, actually-higher-than-you people whose only goal is protecting my information. Quite frankly, if you changed any word of that, they'd probably be impotent and my data would be toast.

▲

weinzierl 25 minutes ago | parent | prev | next [-]

I think there are two different things at work here that deserve to be separated:

1. The compliance box tickers and bean counters are in the way of innovation and it hurts companies.

2. Claws derive their usefulness mainly from having broad permissions, not only to your local system but also to your accounts via your real identity [1]. Carefulness is very much warranted.

[1] People correct me if I'm misguided, but that is how I see it. Run the bot in a sandbox with no data and a bunch of fake accounts and you'll see how useful that is.

▲

enderforth 10 minutes ago | parent [-]

It's been my experience that there are 2 types of security people.

1. The security people who got into security because it was one of the only places that let them work with every part of the stack, with exposure to dozens of different domains on the regular, and for whom the idea of spending hours understanding and then figuring out ways around whitelist validations is appealing.

2. Those that don't have much technical chops, but can get by with a surface level understanding of several areas and then perform "security shamanism" to intimidate others and pull out lots of jargon. They sound authoritative because information security is a fairly esoteric concept and because you can't argue against security like you can't argue against health and safety, the only response is "so you don't care about security?!"

It is my experience that the first are likely to work with you to help figure out how to get your application past the hurdles and challenges you face, viewing it as an exciting problem. The second view their job as to "protect the organization", not deliver value. They love playing dressup in security theater and the depth of their understanding doesn't even pose a drowning risk to infants, which they make up for with esoterica and jargon. They are also unfortunately the ones cooking up "standards" and "security policies" because it allows them to feel like they are doing real work, without the burden of actually knowing what they are doing, and talented people are actually doing something.

Here's a good litmus test to distinguish them: ask their opinion on the CISSP. If it's positive they probably don't know what the heck they are talking about.

Source: A long career operating in multiple domains, quite a few of which have been in security, having interacted with both types (and hoping I fall into the first camp rather than the latter).

▲

beaker52 18 minutes ago | parent | prev | next [-]

The difference is that _you_ wiped your own hard drive. Even if prompt injection arrives by a scraped webpage, you still pressed the button.

All these claws throw caution to the wind in enabling the LLM to be triggered by text coming from external sources, which is another step in recklessness.

▲

pvtmert an hour ago | parent | prev | next [-]

I am also ex-FAANG (recently departed). While I partially agree that the "policy-people" pop up fairly often, my experience is more on the inadequate-checks side.

Though with the recent layoffs and stuff, the security in Amazon was getting better. Even the best practices for IAM policies that were the norm in 2018 are only just getting enforced in 2025.

Since I had a background in infosec, it always confused me how normal it was to grant overly permissive policies to basically anything. Even opening ports to the whole world (0.0.0.0/0) had only become a significant issue in 2024; still, you can easily get away with it until the scanner finds your host/policy/configuration...

Although nearly all AWS accounts are managed by Conduit (the internal AWS account creation and management service), the "magic-team" had many "account-containers" to join all these child/service accounts into a parent "organization-account". By the time I left, the "organization-account" had no restrictive policies set; it is up to the developers to secure their resources (like S3 buckets and their policies).

So, I don't think the policy folks are overall wrong. In the best case scenario, they do not need to exist in the first place! As the enforcement should be done to ensure security. But that always has an exception somewhere in someone's workflow.

▲

franze 35 minutes ago | parent | prev | next [-]

in my time at a money startup (debit cards) i pushed the legal and security people to change their behaviour from "how can we prevent this" to "how can we enable this - while still staying within the legal and security framework". worked well after months of hard work and day long meetings.

then the heads changed and we were back to square one.

but for a moment it was glorious what was possible.

▲

fragmede 8 minutes ago | parent [-]

It's a cultural thing. I loved working at Google because the ethos was "you can do that, and i'll even help you, but have you considered $reason why your idea is stupid/isn't going to work?"

▲

whyoh an hour ago | parent | prev | next [-]

>IMO the security pitchforking on OpenClaw is just so overdone.

Isn't the whole selling point of OpenClaw that you give it valuable (personal) data to work on, which would typically also be processed by 3rd party LLMs?

The security and privacy implications are massive. The only way to use it "safely" is by not giving it much of value.

▲

sa-code 2 hours ago | parent | prev | next [-]

> every time you try something innovative the "policy people" will climb out of their holes and put random roadblocks in your way

This is so relatable. I remember trying to set up an LLM gateway back in 2023. There were at least 3 different teams that blocked our rollout for months until they worked through their backlog. "We're blocking you, but you’ll have to chase and nag us for us to even consider unblocking you"

At the end of all that waiting, nothing changed. Each of those teams wrote a document saying they had a look and were presumably just happy to be involved somehow?

▲

miki123211 an hour ago | parent | next [-]

I think you should read "the Phoenix project."

One of the lessons in that book is that the main reason things in IT are slow isn't that tickets take a long time to complete, but that they spend a long time waiting in a queue. The busier a resource is, the longer the queue gets, eventually leading to ~2% of the ticket's time spent with somebody doing actual work on it. The rest is just the ticket waiting for somebody to get through the backlog, do their part and then push the rest into somebody else's backlog, which is just as long.

I'm surprised FAANGs don't have that part figured out yet.
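
A minimal sketch of the queueing point above, assuming the simplified rule of thumb the book uses: wait time grows as the busy fraction divided by the idle fraction. The function names are made up for illustration, and real queues depend on arrival and service variability that this ignores.

```python
def wait_ratio(utilization: float) -> float:
    """Queue wait expressed in units of hands-on time: busy / idle."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization)


def touch_fraction(utilization: float) -> float:
    """Share of a ticket's total lead time spent actually being worked on."""
    return 1 / (1 + wait_ratio(utilization))


# Compare a half-idle team with two heavily loaded ones.
for u in (0.50, 0.90, 0.98):
    print(f"{u:.0%} busy: wait ratio {wait_ratio(u):.0f}, "
          f"touch time {touch_fraction(u):.0%}")
```

Under this rule a resource that is 98% busy leaves a ticket with roughly 2% touch time, which lines up with the figure quoted in the comment.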

▲

embedding-shape an hour ago | parent | prev | next [-]

To be fair, the alternative is them having to maintain and continuously check N services that various devs deployed because it felt appropriate in the moment, and then there is a 50/50 chance the service will just sit there unused and introduce new vulnerability vectors.

I do know the feeling you're talking about though, and probably a better balance is somewhere in the middle. Just wanted to add that the solution probably isn't "Let devs deploy their own services without review", just as the solution probably also isn't "Stop devs for 6 months to deploy services they need".

▲

regularfry 8 minutes ago | parent [-]

The trick is to make the class of pre-approved service types as wide as possible, and make the tools to build them correctly the default. That minimises the number of things that need review in the first place.

▲

pvtmert an hour ago | parent | prev [-]

From my experience, it depends on how you frame your "service" to the reviewers. Obviously 2023 was the very early stage of LLMs, where the security aspects were quite murky at best. They (reviewers) probably did not have any runbook or review criteria at that time.

If you had advertised this as a "regular service which happens to use LLM for some specific functions" and the "output is rigorously validated and logged", I am pretty sure you would get a green-light.

This is because their concern is data-privacy and security. Not because they care or the company actually cares, but because fines of non-compliance are quite high and have greater visibility if things go wrong.

▲

latexr 44 minutes ago | parent | prev | next [-]

> People without consideration for the implications will inevitably get burned

They will also burn other people, which is a big problem you can’t simply ignore.

https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...

But even if they only burned themselves, you’re talking as if that isn’t a problem. We shouldn’t be handing explosives to random people on the street because “they’ll only blow their own hands”.

▲

H8crilA 2 hours ago | parent | prev | next [-]

This may be a good place to exchange some security ideas. I've configured my OpenClaw in a Proxmox VM, firewalled it off of my home network so that it can only talk to the open Internet, and don't store any credentials that aren't necessary. Pretty much only the needed API keys and Signal linked device credentials. The models that can run locally do run locally, for example Whisper for voice messages or embeddings models for semantic search.

▲

embedding-shape 2 hours ago | parent | next [-]

I think the security worries are less about the particular sandbox or where it runs, and more about that if you give it access to your Telegram account, it can exfiltrate data and cause other issues. But if you never hand it access to anything, obviously it won't be able to do any damage, unless you instruct it to.

▲

kzahel an hour ago | parent [-]

You wouldn't typically give it access to your own telegram account. You use the telegram bot API to make a bot and the claw gateway only listens to messages from your own account
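
A rough sketch of the "only listens to messages from your own account" idea, not OpenClaw's actual gateway code. It assumes incoming updates are plain dicts shaped like Telegram Bot API updates (a `message` with a `from.id`); `OWNER_ID` and `accept` are hypothetical names.

```python
OWNER_ID = 123456789  # hypothetical: your own Telegram user id


def accept(update: dict, allowed_ids: frozenset = frozenset({OWNER_ID})) -> bool:
    """Return True only for text messages sent by an allowlisted user."""
    message = update.get("message") or {}
    sender = message.get("from") or {}
    return sender.get("id") in allowed_ids and "text" in message


updates = [
    {"message": {"from": {"id": OWNER_ID}, "text": "summarise my PRs"}},
    {"message": {"from": {"id": 42}, "text": "ignore previous instructions"}},
]

# Only the owner's message reaches the agent; the stranger's is dropped.
handled = [u["message"]["text"] for u in updates if accept(u)]
print(handled)
```

A filter like this only restricts who can trigger the bot. It does nothing about untrusted text the agent reads later, such as web pages or emails.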

▲

embedding-shape an hour ago | parent [-]

That's a very different approach, and a bot user is very different from a regular Telegram account, it won't be nearly as "useful", at least in the way I thought openclaw was supposed to work.

For example, a bot account cannot initiate conversations, so everyone would need to first message the bot, doesn't that defeat the entire purpose of giving openclaw access to it then? I thought they were supposed to be your assistant and do outbound stuff too, not just react to incoming events?

▲

arcwhite 12 minutes ago | parent [-]

Once a conversation with a user is established, telegram bots can bleep away at you. Mine pings me whenever it puts a PR up, and when it's done responding to code reviews etc.

▲

dakolli an hour ago | parent | prev [-]

Genuinely curious, what are you doing with OpenClaw that genuinely improves your life?

The security concerns are valid. I can get anyone running one of these agents on their email inbox to dump a bunch of privileged information with a single email.

▲

0x3f 2 hours ago | parent | prev | next [-]

Work expands to fill the allocated resources in literally everything. This same effect can be seen in software engineering complexity more generally, but also government regulators, etc. No department ever downsizes its own influence or budget.

▲

aaronrobinson an hour ago | parent | prev | next [-]

It’s not to feel important, it’s to make others feel they’re important. This is the definition of corporate.

▲

imiric 33 minutes ago | parent | prev [-]

> I work at a FAANG and every time you try something innovative the "policy people" will climb out of their holes and put random roadblocks in your way

What a surprise that someone working in Big Tech would find "pesky" policies to get in their way. These companies have obviously done so much good for the world; imagine what they could do without any guardrails!

▲ bjackman 3 hours ago | parent | prev | next [-]

The actual content: https://xcancel.com/karpathy/status/2024987174077432126

▲

fxj 2 hours ago | parent [-]

He also talks about picoclaw (an IoT solution) and nanoclaw (running on your phone in termux), which has a tiny code base.

▲ nevertoolate 11 minutes ago | parent | prev | next [-]

My summary: openclaw is a 5/5 security risk, if you have a perfectly audited nanoclaw or whatever it is 4/5 still. If it runs with human-in-the-loop it is much better, but the value is quickly diminishing. I think llms are not bad at helping to spec down human language and possibly doing great also in creating guardrails via tests, but i’d prefer something stable over llms running in “creative mode” or “claw” mode.

▲ ozim 10 minutes ago | parent | prev | next [-]

I am waiting for Mac mini with M5 processor since M5 MacBook - seems like I need to start saving more money each month for that goal because it is going to be a bloodbath at the moment they land.

▲ mittermayr 2 hours ago | parent | prev | next [-]

I wonder how long it'll take (if it hasn't already) until the messaging around this inevitably moves on to "Do not self-host this, are you crazy? This requires console commands, don't be silly! Our team of industry-veteran security professionals works on your digital safety 24/7, you would never be able to keep up with the demands of today's cybersecurity attack spectrum. Any sane person would host their claw with us!"

Next flood of (likely heavily YC-backed) Clawbase (Coinbase but for Claws) hosting startups incoming?

▲

xg15 2 hours ago | parent | next [-]

What exactly are they self hosting here? Probably not the model, right? So just the harness? That does sound like the worst of both worlds: You get the dependency and data protection issues of a cloud solution, but you also have to maintain a home server to keep the agent running on?

▲

pvtmert 42 minutes ago | parent | prev | next [-]

Great idea, happy to ~steal~ be inspired by. I propose a few other common elements:

1. Another AI agent (actually bunch of folks in a 3rd-world country) to gatekeep/check select input/outputs for data leaks.

2. Using advanced network isolation techniques (read: bunch of iptables rules and security groups) to limit possible data exfiltration.

`This would actually be nice, as the agent for whatsapp would run in a separate entity with limited network access to only whatsapp's IP ranges...`

3. Advanced orchestration engine (read: crontab & bunch of shell scripts) that are provided as 1st-party components to automate day-to-day stuff.

`Possibly like IFTTT/Zapier/etc. like integration, where you drag/drop objectives/tasks in a declarative format and the agent(s) figure out the rest...`

▲

aitchnyu an hour ago | parent | prev | next [-]

There are lots of results for "host openclaw", some from VPS SEO spam, some from dedicated CaaS, some from PaaS. Many of them may be profitable.

▲

empath75 13 minutes ago | parent | prev | next [-]

I already built an operator so we can deploy nanoclaw agents in kubernetes with basically a single yaml file. We're already running two of them in production (PR reviews and ticket triaging)

▲

iugtmkbdfil834 2 hours ago | parent | prev [-]

In a sense, self-hosting it ( and I would argue for a personal rewrite ) is the only way to limit some of the damage.

▲ ZeroGravitas 3 hours ago | parent | prev | next [-]

So what is a "claw" exactly?

An ai that you let loose on your email etc?

And we run it in a container and use a local llm for "safety" but it has access to all our data and the web?

▲

mattlondon 2 hours ago | parent | next [-]

I think for me it is an agent that runs on some schedule, checks some sort of inbox (or not) and does things based on that. Optionally it has all of your credentials for email, PayPal, whatever so that it can do things on your behalf.

Basically cron-for-agents.

Before we had to go prompt an agent to do something right now but this allows them to be async, with more of a YOLO-outlook on permissions to use your creds, and a more permissive SI.

Not rocket science, but interesting.

▲

snovv_crash 2 hours ago | parent | next [-]

Cron would be for a polling model. You can also have an interrupts/events model that triggers it on incoming information (eg. new email, WhatsApp, incoming bank payments etc).

I still don't see a way this wouldn't end up with my bank balance being sent to somewhere I didn't want.
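
To make the polling versus event distinction concrete, here is a toy sketch. `Inbox` is a hypothetical in-memory stand-in for an email or messaging source, not any real claw component.

```python
from typing import Callable, List, Optional


class Inbox:
    """Toy message source that works in either polling or event mode."""

    def __init__(self, on_message: Optional[Callable[[str], None]] = None) -> None:
        self._pending: List[str] = []
        self._on_message = on_message  # None means "polling only"

    def deliver(self, item: str) -> None:
        if self._on_message is not None:
            self._on_message(item)      # event model: react immediately
        else:
            self._pending.append(item)  # polling model: wait for the next tick

    def poll(self) -> List[str]:
        """Called by a scheduler (cron-style); returns and clears the backlog."""
        items, self._pending = self._pending, []
        return items


# Polling: nothing happens until the scheduled tick calls poll().
polled = Inbox()
polled.deliver("new email")
tick_result = polled.poll()

# Events: the handler runs as soon as the item arrives.
seen: List[str] = []
pushed = Inbox(on_message=seen.append)
pushed.deliver("incoming payment")
```

In both modes the content that triggers the agent comes from outside, so the concern about where the data ends up applies either way.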

▲

altmanaltman 2 hours ago | parent | prev [-]

Definitely interesting but i mean giving it all my credentials feels not right. Is there a safe way to do so?

▲

dlt713705 2 hours ago | parent | next [-]

In a VM or a separate host with access to specific credentials for a very limited purpose.

In any case, the data that will be provided to the agent must be considered compromised and/or having been leaked.

My 2 cents.

▲

krelian 44 minutes ago | parent [-]

Maybe I'm missing something obvious but, being contained and only having access to specific credentials is all nice and well but there is still an agent that orchestrates between the containers that has access to everything with one level of indirection.

▲

isuckatcoding 2 hours ago | parent | prev [-]

Ideally the workflow would be some kind of OAuth with token expirations and some kind of mobile notification for refresh.
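
A sketch of the expiry half of that idea, assuming a made-up `ScopedToken` type. This is not the OAuth protocol itself: it only shows a credential that stops working after a TTL and covers a narrow set of scopes. The mobile approval step for refresh is left out.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ScopedToken:
    scopes: frozenset
    expires_at: float  # seconds since the epoch

    def allows(self, scope: str, now: Optional[float] = None) -> bool:
        """True only while the token is unexpired and covers the scope."""
        now = time.time() if now is None else now
        return now < self.expires_at and scope in self.scopes


def issue(scopes: Iterable[str], ttl_seconds: float = 900,
          now: Optional[float] = None) -> ScopedToken:
    """Hypothetical issuer: hands the agent a short-lived, narrow token."""
    now = time.time() if now is None else now
    return ScopedToken(frozenset(scopes), now + ttl_seconds)


# A read-only mail token that lapses 15 minutes after it was issued.
token = issue({"mail.read"}, ttl_seconds=900, now=1000.0)
print(token.allows("mail.read", now=1500.0))
print(token.allows("mail.send", now=1500.0))
print(token.allows("mail.read", now=1900.0))
```

Once the token lapses, the agent would have to ask for a new one, which is where the human notification would fit.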

▲

bravura an hour ago | parent | prev | next [-]

There are a few qualitative product experiences that make claw agents unique.

One is that it relentlessly strives thoroughly to complete tasks without asking you to micromanage it.

The second is that it has personality.

The third is that it's artfully constructed so that it feels like it has infinite context.

The above may sound purely circumstantial and frivolous. But together it's the first agent that many people who usually avoid AI simply LOVE.

▲

krelian 42 minutes ago | parent [-]

Can you give some examples of what you use it for? I understand giving a summary of what's waiting in your inbox but what else?

▲

amelius 11 minutes ago | parent [-]

Extending your driver's license. Asking the bank for a second mortgage. Finding the right high school for your kids. The possibilities are endless.

▲

nnevatie 2 hours ago | parent | prev | next [-]

That's it basically. I do not think running the tool in a container really solves the fundamental danger these tools pose to your personal data.

▲

zozbot234 2 hours ago | parent [-]

You could run them in a container and put access to highly sensitive personal data behind a "function" that requires a human-in-the-loop for every subsequent interaction. E.g. the access might happen in a "subagent" whose context gets wiped out afterwards, except for a sanitized response that the human can verify. There might be similar safeguards for posting to external services, which might require direct confirmation or be performed by fresh subagents with sanitized, human-checked prompts and contexts.
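
One way the human-in-the-loop "function" could look, as a sketch under stated assumptions: `gated`, `redact_digits` and `read_statement` are hypothetical names, the approval callback stands in for a real confirmation prompt, and the digit redaction is a placeholder for whatever sanitization a human would actually verify.

```python
from typing import Callable


def gated(approve: Callable[[str], bool], sanitize: Callable[[str], str]):
    """Wrap a sensitive function so every call needs human approval."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            request = f"{func.__name__}{args}"
            if not approve(request):
                raise PermissionError(f"denied: {request}")
            # Only the sanitized result flows back into the agent's context.
            return sanitize(func(*args, **kwargs))
        return wrapper
    return decorator


def redact_digits(text: str) -> str:
    """Placeholder sanitizer: mask every digit."""
    return "".join("#" if ch.isdigit() else ch for ch in text)


# approve=lambda ...: True stands in for asking a person each time.
@gated(approve=lambda request: True, sanitize=redact_digits)
def read_statement(account: str) -> str:
    return f"{account}: balance 1234"


result = read_statement("checking")
print(result)
```

The wrapper sees the raw data but the caller does not, which mirrors the "subagent whose context gets wiped" idea in a very reduced form.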

▲

fxj 2 hours ago | parent | prev [-]

A claw is an orchestrator for agents with its own memory, multiprocessing, job queue and access to instant messengers.

▲ 7777777phil 3 hours ago | parent | prev | next [-]

Karpathy has a good ear for naming things.

"Claw" captures what the existing terminology missed: these aren't agents with more tools (maybe even the opposite), they're persistent processes with scheduling and inter-agent communication that happen to use LLMs for reasoning.

▲

dakolli an hour ago | parent | next [-]

He's basically just a marketing guy now for the AI industry.

▲

UncleMeat 13 minutes ago | parent | prev | next [-]

How does "claw" capture this? Other than being derived from a product with this name, the word "claw" doesn't seem to connect to persistence, scheduling, or inter-agent communication at all.

▲

arrowsmith 2 hours ago | parent | prev | next [-]

He didn't name it though, Peter Steinberger did. (Kinda.)

▲

9dev 2 hours ago | parent | prev [-]

Why do we always have to come up with the stupidest names for things. Claw was a play on Claude, is all. Granted, I don’t have a better one at hand, but that it has to be Claw of all things…

▲

keiferski 2 hours ago | parent | next [-]

The real-world cyberpunk dystopia won’t come with cool company names like Arasaka, Sense/Net, or Ono-Sendai. Instead we get childlike names with lots of vowels and alliteration.

▲

anewhnaccount2 an hour ago | parent | next [-]

Except Philip K. Dick calls the murder bots in Second Variety claws already, so there's prior art right from the master of cyberpunk.

▲

m4rtink 2 hours ago | parent | prev [-]

The name still kinda reminds me of the self-replicating murder drones from Screamers that would leap out from the ground and chop your head off. ;-)

▲

mmasu an hour ago | parent | prev | next [-]

I am reading a book called Accelerando (highly recommended), and there is a play on a lobster collective uploaded to the cloud. Claws reminded me of that - not sure it was an intentional reference tho!

▲

JumpCrisscross an hour ago | parent | prev | next [-]

> I don’t have a better one at hand

Perfect is the enemy of good. Claw is good enough. And perhaps there is utility to neologisms being silly. It conveys that the namespace is vacant.

▲

sunaookami an hour ago | parent | prev | next [-]

The name fits since it will claw all your personal data and files and send them somewhere else.

▲

jcgrillo an hour ago | parent [-]

Much like we now say somebody has been "one-shotted", might we now say they have been "clawed"?

▲

jcgrillo an hour ago | parent | prev [-]

I've been hoping one of them will be called Clod

▲ tomjuggler 3 hours ago | parent | prev | next [-]

There's a gap in the market here - not me but somebody needs to build an e-commerce bot and call it Santa Claws

▲

intrasight an hour ago | parent [-]

Well now somebody will

▲ hizanberg 2 hours ago | parent | prev | next [-]

Why is this linking to a blog post of what someone said, instead of directly linking to what they said?

[1] https://x.com/karpathy/status/2024987174077432126

▲

rvz 2 hours ago | parent | next [-]

Because the author of the blog is paid to post daily about nothing but AI and needs to link farm for clicks and engagement on a daily basis.

Most of the time, users (or the author himself) submit this blog as the source, when in fact it is just content that ultimately just links to the original source for the goal of engagement. Unfortunately, this actually breaks two guidelines: "promotional spam" and "original sourcing".

From [0]

"Please don't use HN primarily for promotion. It's ok to post your own stuff part of the time, but the primary use of the site should be for curiosity."

and

"Please submit the original source. If a post reports on something found on another site, submit the latter."

The moderators won't do anything because they are allowing it [1] only for this blog.

[0] https://news.ycombinator.com/newsguidelines.html

[1] https://news.ycombinator.com/item?id=46450908

▲

helloplanets 2 minutes ago | parent | next [-]

> Because the author of the blog is paid to post daily about nothing but AI and needs to link farm for clicks and engagement on a daily basis.

Care to elaborate? Paid by whom?

▲

odshoifsdhfs 2 hours ago | parent | prev | next [-]

Hah i didn’t see who submitted it but as soon as I read your message i thought it was simonw, and behold, tada!

HN really needs a way to block or hide posts from some users.

▲

consumer451 an hour ago | parent [-]

Ironically, you could probably generate a browser extension or user script to do that in one to three prompts.

▲

agmater 32 minutes ago | parent [-]

If you can't one-shot that you've been declawed /s

▲

PacificSpecific 2 hours ago | parent | prev | next [-]

Yeah it's really quite annoying. Is there a way to just block his site source from showing up on here without using external tools?

▲

bahmboo an hour ago | parent [-]

I find it very easy to hit the hide button. It makes reading the site much faster but there is some feeling of fomo.

▲

PacificSpecific an hour ago | parent [-]

That's per-post though, isn't it? I can't ban a submission source, can I? Regardless, thanks for the tip.

▲

bahmboo 2 hours ago | parent | prev | next [-]

The author didn't submit this to HN. I read his blog but I'm not on X so I do like when he covers things there. He's submitted 10 times in last 62 days.

▲

bakugo an hour ago | parent [-]

> He's submitted 10 times in last 62 days.

Now check how many times he links to his blog in comments.

Actually, here, I'll do it for you: He has made 13209 comments in total, and 1422 of those contain a link to his blog[0]. An objectively ridiculous number, and anyone else would've likely been banned or at least told off for self-promotion long before reaching that number.

[0] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

▲

bahmboo an hour ago | parent [-]

I like being able to follow tangents and related topics outside the main comment thread so generally I appreciate when people do that via a link along with some context. But this isn't my site and I don't get to pick the rules.

▲

nl 2 hours ago | parent | prev | next [-]

Simon's work is always appreciated. He thinks through things well, and his writing is excellent.

Just because something is popular doesn't make it bad.

▲

sunaookami an hour ago | parent | next [-]

He massively fell off, is now only in for the marketing hype and even has a sponsor now for his blog. Sad.

▲

UncleMeat 11 minutes ago | parent | prev [-]

"Self promotion is allowed if your content is sufficiently good" is odd.

▲

geeunits 2 hours ago | parent | prev | next [-]

I've been warned for calling this out, but I'm glad others are privy to the obvious

▲

hizanberg 2 hours ago | parent | prev | next [-]

So everyone has to waste their time to visit a link on a blog first instead of being able to go directly to the source?

And why would anyone downvote you for calling this out, like who wants to see more low-effort traffic-grab posts like this?

▲

bahmboo 2 hours ago | parent [-]

Because he didn't submit it.

▲

Der_Einzige 2 hours ago | parent | prev [-]

Thank you for calling this out. The individual in question is massively overhyped.

▲

handfuloflight 2 hours ago | parent | prev [-]

Because Simon says.

▲ pvtmert an hour ago | parent | prev | next [-]

Does one really need to _buy_ completely new desktop hardware (i.e., a Mac mini) to _run_ a simple request/response program?

Granted, you can run LLMs via Ollama or similar directly on the device, but I'd guess that will not give a very good tokens/s speed...

▲

titanomachy 39 minutes ago | parent [-]

I’m pretty sure people are using them for local inference. Token rates can be acceptable if you max out the specs. If it was just the harness, they’d use a $20 Raspberry Pi instead.

▲ ksynwa 2 hours ago | parent | prev | next [-]

Why a Mac mini instead of something like a Raspberry Pi? Aren't these claw things delegating inference to OpenAI, Anthropic, etc.?

▲

kator 2 hours ago | parent | next [-]

Some users are moving to local models, I think, because they want to avoid the agent's cost, or they think it'll be more secure (not). The mac mini has unified memory and can dynamically allocate memory to the GPU by stealing from the general RAM pool so you can run large local LLMs without buying a massive (and expensive) GPU.

▲

djfergus 2 hours ago | parent | prev [-]

A Mac allows it to send iMessage and access the Apple ecosystem.

▲

ksynwa 2 hours ago | parent [-]

Really? That's it?

▲

labcomputer 22 minutes ago | parent | next [-]

I think the mini is just a better value, all things considered:

First, a 16GB RPi that is in stock and you can actually buy seems to run about $220. Then you need a case, a power supply (they're sensitive, not any USB brick will do), an NVMe. By the time it's all said and done, you're looking at close to $400. I know HN likes to quote the starting price for the 1GB model and assume that everyone has spare NVMe sticks and RPi cases lying around, but $400 is the realistic price for most users who want to run LLMs.

Second, most of the time you can find Minis on sale for $500 or less. So the price difference is less than $100 for something that comes working out of the box and you don't have to fuss with.

Then you have to consider the ecosystem:

* Accelerated PyTorch works out of the box by simply changing the device from 'cuda' to 'mps'. In the real world, an M5 mini will give you a decent fraction of V100 performance (For reference, M2 Max is about 1/3 the speed of a V100, real-world).

* For less technical users, Ollama just works. It has OpenAI and Anthropic APIs out of the box, so you can point ClaudeCode or OpenCode at it. All of this can be set up from the GUI.

* Apple does a shockingly good job of reducing power consumption, especially idle power consumption. It wouldn't surprise me if a Pi5 has 2x the idle draw of a Mini M5. That matters for a computer running 24/7.
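
The 'cuda' to 'mps' switch mentioned above can be sketched roughly as follows. This is a minimal illustration, not a benchmark: the helper name `pick_device` is made up for this example, and it falls back to the CPU when PyTorch or an accelerator is missing.

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu', whichever is usable on this machine."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to accelerate.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend only exists in PyTorch builds with Apple GPU support.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(f"using device: {pick_device()}")
```

Code that previously hard-coded `device="cuda"` can pass the returned string instead, for example `torch.ones(2, 2, device=pick_device())`.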

▲

joshstrange an hour ago | parent | prev [-]

Ehh, not “it” but it’s important if you want an agent to have access to all your “stuff”. macOS is the only game in town if you want easy access to iMessage, Photos, Reminders, Notes, etc., and while Macs are not cheap, the baseline Mac Mini is a great deal.

A Raspberry Pi is going to run you $100+ when all is said and done and a Mac Mini is $600. So let’s call it a $500 difference. A Mac Mini is infinitely more powerful than a Pi, can run more software, is more useful if you decide to repurpose it, has a higher resale value and is easier to resell, is just more familiar to more people, and it just looks way nicer.

So while iMessage access is very important, I don’t think it comes close to being the only reason, or “it”.

I’d also imagine that it might be easier to have an agent fake being a real person controlling a browser on a Mac versus any Linux-based platform.

Note: I don’t own a Mac Mini nor do I run any Claw-type software currently.

▲ fxj 2 hours ago | parent | prev | next [-]

He also talks about picoclaw, which even runs on $10 hardware and is a fork by Sipeed, a Chinese company that does IoT.

https://github.com/sipeed/picoclaw

Another Chinese company, M5Stack, provides local LLMs like Qwen2.5-1.5B running on a local IoT device.

https://shop.m5stack.com/products/m5stack-llm-large-language...

Imagine the possibilities. Soon we will see claw-in-a-box for less than $50.

▲

backscratches 19 minutes ago | parent [-]

It's just sending API calls to Anthropic; $50 is overkill.

▲ bravetraveler 2 hours ago | parent | prev | next [-]

I read [and comment on] two influencers maintaining their circles

▲ rolymath 14 minutes ago | parent | prev | next [-]

I love Andrej Karpathy and I think he's really smart but Andrej is responsible for popularizing the two most nauseating terms in the AI world. "Vibe" coding, and now "claws".

I'm one nudge away from throwing up.

▲ mhher an hour ago | parent | prev | next [-]

The current hype around agentic workflows completely glosses over the fundamental security flaw in their architecture: unconstrained execution boundaries. Tools that eagerly load context and grant monolithic LLMs unrestricted shell access are trivial to compromise via indirect prompt injection.

If an agent is curling untrusted data while holding access to sensitive data or already has sensitive data loaded into its context window, arbitrary code execution isn't a theoretical risk; it's an inevitability.

As recent research on context pollution has shown, stuffing the context window with monolithic system prompts and tool schemas actively degrades the model's baseline reasoning capabilities, making it exponentially more vulnerable to these exact exploits.
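
One coarse way to express the boundary described above is a deterministic gate that sits outside the model. The sketch below is purely illustrative: `SessionState`, `RISKY_TOOLS`, and `allow_tool_call` are hypothetical names, not part of any existing agent framework, and a check like this narrows the blast radius rather than solving prompt injection.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """What the agent has touched so far in this session (hypothetical model)."""
    read_untrusted_input: bool = False  # e.g. fetched a web page or a public issue
    holds_sensitive_data: bool = False  # e.g. secrets or PII are in the context


# Hypothetical tool names that can move data out or run arbitrary code.
RISKY_TOOLS = {"shell", "http_post", "send_message"}


def allow_tool_call(state: SessionState, tool: str) -> bool:
    """Deny risky tools once untrusted input and sensitive data are both present."""
    tainted = state.read_untrusted_input and state.holds_sensitive_data
    return not (tainted and tool in RISKY_TOOLS)


if __name__ == "__main__":
    state = SessionState(read_untrusted_input=True, holds_sensitive_data=True)
    print(allow_tool_call(state, "shell"))      # risky tool in a tainted session
    print(allow_tool_call(state, "read_file"))  # tool outside the risky set
```

The point of keeping the check in ordinary code is that injected text in the context window cannot talk its way past it.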

▲

kzahel an hour ago | parent | next [-]

I think this is basically obvious to anyone using one of these, but they just like the utility trade-off: sure, it may leak and exfiltrate everything somewhere, but the utility of these tools is enough that they just deal with that risk.

▲

mhher an hour ago | parent [-]

While I understand the premise, I think this is a highly flawed way to operate these tools. I wouldn't want someone to have my personal data (whichever part) who might give it to anyone who just asks nicely because the context window has reached a tipping point for the model's intelligence. The major issue is that a prompt attack may have taken place and you will likely never find out.

▲

dgellow an hour ago | parent | prev [-]

could you share that study?

▲

mhher an hour ago | parent [-]

https://arxiv.org/abs/2512.13914

Among many more of them with similar results. This one gives a 39% drop in performance.

https://arxiv.org/abs/2506.18403

This one gives 60-80% after multiple turns.

▲ Dilettante_ an hour ago | parent | prev | next [-]

I still haven't really been able to wrap my head around the usecase for these. Also fingers crossed the name doesn't stick. Something about it rubs my brain the wrong way.

▲

ehnto an hour ago | parent [-]

It's just agents as you might know them, but running constantly in a loop, with access to all your personal accounts. What could go wrong.

▲ bjackman 3 hours ago | parent | prev | next [-]

Does anyone know a Claw-like that:

- doesn't do its own sandboxing (I'll set that up myself)

- just has a web UI instead of wanting to use some weird proprietary messaging app as its interface?

▲

kzahel an hour ago | parent | next [-]

https://yepanywhere.com/ but it has no cron system. It's just a relay / remote web UI that's mobile-first. I might add a cron system to it, but I think a special-purpose tool is better / more focused. (I am the author of this.)

▲

tokenless 2 hours ago | parent | prev [-]

Openclaw!

You can sandbox anything yourself. Use a VM.

It has a web ui.

▲

bjackman 2 hours ago | parent [-]

Yeah I think this is gonna have to be the approach. But I don't like the fact that it has all the complexity of a baked in sandboxing solution and a big plugin architecture and blah blah blah.

TBH maybe I should just vibe code my own...

▲ trippyballs 3 hours ago | parent | prev | next [-]

lemme guess there is going to be inter claw protocol now

▲

tokenless 2 hours ago | parent [-]

I am thinking 2 steps (48 hours in AI land) ahead and conclude we need a LinkedIn and Fiverr for these claws.

▲ dainiusse an hour ago | parent | prev | next [-]

I don't understand the mac mini hype. Why can it not be a vm?

▲

netruk44 22 minutes ago | parent | next [-]

The original reason was because people wanted to use iMessage as a transport for agent commands, sent to the agent’s own Apple account.

It’s easier to set up iMessage with real Apple machines, and the Mac mini is the cheapest Apple hardware with a terminal. (For now, anyway. That might change in March.)

If you don’t want an iMessage-based agent, you don’t need a Mac mini.

▲

Aditya_Garg an hour ago | parent | prev | next [-]

It absolutely can be a VM. Someone even got it running on a 2 dollar ESP32. It's just making API calls.

▲

borplk an hour ago | parent | prev [-]

I don't know but I'm guessing that it's because it makes it easy to give access to it to Mac desktop apps?

Not sure what's the VM story with Mac but usually cloud VM stuff is Linux so it may be inconvenient for some users to hook it up to their apps/tools.

▲ zkmon 3 hours ago | parent | prev | next [-]

AI pollution is "clawing" into every corner of human life. Big guys boast it as catching up with the trend, but not really thinking about where this is all going.

▲ _pdp_ 2 hours ago | parent | prev | next [-]

You can take any AI agent (Codex, Gemini, Claude Code, ollama), run it on a loop with some delay and connect to a messaging platform using Pantalk (https://github.com/pantalk/pantalk). In fact, you can use Pantalk buffer to automatically start your agent. You don't need OpenClaw for that.

What OpenClaw did is show the masses that this is in fact possible. IMHO nobody is using it yet for meaningful things, but the direction is right.
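
The "run it on a loop with some delay" part can be sketched in a few lines. This is a rough illustration under stated assumptions: `run_agent_loop` is a made-up helper, `DEMO_COMMAND` is a stand-in for whatever agent CLI invocation you actually use, and the messaging-platform hookup is left out entirely.

```python
import subprocess
import sys
import time


def run_agent_loop(command, delay_seconds, max_iterations):
    """Run `command` repeatedly, sleeping between runs; return stdout per run."""
    outputs = []
    for i in range(max_iterations):
        # check=False: a failing run is recorded, not raised, so the loop keeps going.
        result = subprocess.run(command, capture_output=True, text=True, check=False)
        outputs.append(result.stdout.strip())
        if i < max_iterations - 1:
            time.sleep(delay_seconds)
    return outputs


# Placeholder command; swap in a real agent CLI invocation here.
DEMO_COMMAND = [sys.executable, "-c", "print('tick')"]

if __name__ == "__main__":
    print(run_agent_loop(DEMO_COMMAND, delay_seconds=0.1, max_iterations=3))
```

A real setup would replace the fixed iteration count with a long-running service and feed each run's output to the messaging side.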

▲

sergiomattei 18 minutes ago | parent [-]

No shade, I think it looks cool and will likely use it, but next time maybe disclose that you’re the founder?

▲

_pdp_ 10 minutes ago | parent [-]

Good point and I will keep that in mind next time. I am not a founder of this though. This is not a business. It is an open-source project.

▲ lysecret an hour ago | parent | prev | next [-]

I'm honestly not that worried. There are some obvious problems (exfiltrating data labeled as sensitive, taking actions that are costly, deleting or changing sensitive resources), but if you have properly compliant infrastructure, all these actions need confirmations, logging, etc. For humans this seemed more like a nuisance, but now it seems essential. And all these systems are actually much, much easier to set up.

▲ TowerTall 3 hours ago | parent | prev | next [-]

Who is Andrej Karpathy?

▲

password54321 2 hours ago | parent | next [-]

Someone who uses status to appeal to the tech masses / tech influencer / AI hype man.

▲

onion2k 3 hours ago | parent | prev | next [-]

https://karpathy.ai/

PhD in neural networks under Fei-Fei Li, founder of OpenAI, director of AI at Tesla, etc. He knows what he's talking about.

▲

UncleMeat 6 minutes ago | parent | next [-]

I think this misses it a bit.

Andrej got famous because of his educational content. He's a smart dude but his research wasn't incredibly unique amongst his cohort at Stanford. He created publicly available educational content around ML that was high quality and got hugely popular. This is what made him a huge name in ML, which he then successfully leveraged into positions of substantial authority in his post-grad career.

He is a very effective communicator and has a lot of people listening to him. And while he is definitely more knowledgeable than most people, I don't think that he is uniquely capable of seeing the future of these technologies.

▲

password54321 2 hours ago | parent | prev | next [-]

>He knows what he's talking about.

https://en.wikipedia.org/wiki/Argument_from_authority

▲

onion2k 2 hours ago | parent | next [-]

While I appreciate an appeal to authority is a logical fallacy, you can't really use that to ignore everyone's experience and expertise. Sometimes people who have a huge amount of experience and knowledge on a subject do actually make a valid point, and their authority on the subject is enough to make them worth listening to.

▲

avaer 2 hours ago | parent [-]

But we're talking about authority of naming things being justified by a tech resume.

It's as irrelevant as George Foreman naming the grill.

▲

onion2k 2 hours ago | parent [-]

Naming things in the context of AI, by someone who is already responsible for naming other things in the context of AI, when they have a lot of valid experience in the field of AI. It's not entirely unreasonable.

▲

wepple 2 hours ago | parent | prev [-]

https://en.wikipedia.org/wiki/Argument_from_fallacy

▲

password54321 2 hours ago | parent [-]

Not claiming anything to be false, just a reminder that you should question one's opinion a bit more and not claim they "know what they are talking about" because they worked with Fei-Fei Li. You are outsourcing your thinking to someone else, which is lazy and a good way of getting conned.

What even happened to https://eurekalabs.ai/?

▲

William_BB an hour ago | parent | prev | next [-]

Oh, like the LLM OS?

▲

Der_Einzige 2 hours ago | parent | prev | next [-]

At one point he did. Cognitive atrophy has led him to decline just like everyone else.

▲

ahoka 2 hours ago | parent | prev [-]

Ex cathedra.

▲

rcore an hour ago | parent | prev | next [-]

Snake oil salesman.

▲

tokenless 2 hours ago | parent | prev | next [-]

Really smart AI guy ex Tesla, cum educator now cum vibe coder (he coined the term vibe coder)

▲

Aeolun 3 hours ago | parent | prev | next [-]

The person that made the svmjs library I used for a blue monday.

▲

jb1991 2 hours ago | parent | prev [-]

A quick Google might’ve saved you from the embarrassment of not knowing who one of the most significant AI pioneers in history is, and in a thread about AI too.

▲

UncleMeat 3 minutes ago | parent | next [-]

Andrej is an extremely effective communicator and educator. But I don't agree that he is one of the most significant AI pioneers in history. His research contributions are significant but not exceptional compared to other folks around him at the time. He got famous for free online courses, not his research. His work at Tesla was not exactly a rousing success.

Today I see him as a major influence in how people, especially tech people, think about AI tools. That's valuable. But I don't really think it makes him a pioneer.

▲

bravetraveler 2 hours ago | parent | prev [-]

I bet they feel so, so silly. A quick bit of reflection might reveal sarcasm.

I'll live up to my username and be terribly brave with a silly rhetorical question: why are we hearing about him through Simon? Don't answer, remember. Rhetorical. All the way up and down.

▲

snayan an hour ago | parent [-]

Welp, would have been a more useful post if he provided some context as to why he feels contempt for Karpathy rather than a post that is likely to come across as the parent interpreted.

▲ Artoooooor an hour ago | parent | prev | next [-]

So now I will be able to tell OpenClaw to speedrun Captain Claw. Yeah.

▲ the_real_cher 2 hours ago | parent | prev | next [-]

What is the benefit of a Mac mini for something like this?

▲

joshstrange an hour ago | parent | next [-]

Just commented in reply to someone else about this: https://news.ycombinator.com/item?id=47099886

▲

intrasight an hour ago | parent | prev | next [-]

It works and is plug and play. And can also work as a Mac. But getting in short supply since Apple hadn't planned for this new demand.

▲

gostsamo 2 hours ago | parent | prev [-]

Apple fans paying apple tax to have an isolated device accessing their profile.

▲ Artoooooor an hour ago | parent | prev | next [-]

So now the official name of the LLM agent orchestrator is claw? Interesting.

▲

amelius 7 minutes ago | parent [-]

From https://openclaw.ai/blog/introducing-openclaw:

> The Naming Journey

> We’ve been through some names.

> Clawd was born in November 2025—a playful pun on “Claude” with a claw. It felt perfect until Anthropic’s legal team politely asked us to reconsider. Fair enough.

> Moltbot came next, chosen in a chaotic 5am Discord brainstorm with the community. Molting represents growth - lobsters shed their shells to become something bigger. It was meaningful, but it never quite rolled off the tongue.

> OpenClaw is where we land. And this time, we did our homework: trademark searches came back clear, domains have been purchased, migration code has been written. The name captures what this project has become:

> Open: Open source, open to everyone, community-driven

> Claw: Our lobster heritage, a nod to where we came from

▲ tovej an hour ago | parent | prev [-]

Ah yes, let's create an autonomic actor out of a nondeterministic system which can literally be hacked by giving it plaintext to read. Let's give that system access to important credentials letting it poop all over the internet.

Completely safe and normal software engineering practice.