stared 5 hours ago

I don’t get this OpenClaw hype.

When people vibe-code, usually the goal is to do something.

When I hear people using OpenClaw, the goal usually seems to be… using OpenClaw. At the cost of a Mac Mini, safety (deleted emails and the like), and security (the LiteLLM attack).

eloisant 5 hours ago | parent | next [-]

The idea is to get a virtual personal assistant. Like Siri or Gemini but with access to all of your accounts, computers, etc. (Well whatever you give it access to). Like having a butler with access to your laptop.

From what I understand, the main appeal isn't the end result, but building that AI personal assistant as a hobby is the appeal.

valeena 3 hours ago | parent [-]

With a goal like this I could, at least on paper, find it useful... But I'm curious to see whether this goal is really achievable, or whether it falls apart in practice.

Gareth321 2 hours ago | parent [-]

That is my goal and I invested a few dozen hours into the endeavour. My honest review is:

1. Something like OpenClaw will change the world.

2. OpenClaw is not yet ready.

The heart of OpenClaw (and the promise) is the autonomy. We can already do a lot with the paid harnesses offered by OpenAI and Anthropic, so the secret sauce here is agents doing stuff for us without us having to babysit them or even ask them.

The problem is that OpenClaw does this in an extremely rudimentary way: with "heartbeats." These are basically cron jobs which execute every five minutes. Each heartbeat executes a list of tasks, which in turn execute other tasks. The architecture is extremely inefficient, heavy on LLM compute, and prone to failure. I could enumerate the thousand ways it can and will fail, but that's not important. So the autonomy part of the autonomous assistant works very badly. Many people end up with a series of prescriptive cron jobs and mistakenly call that OpenClaw.
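The "heartbeat" pattern described above can be sketched in a few lines (a hypothetical reconstruction of the pattern, not OpenClaw's actual code): every tick walks the task list, every task costs a full model round-trip, and spawned sub-tasks roll into the next tick whether or not anything useful happened.

```python
import time

def run_heartbeat(tasks, llm_call, interval_s=300, max_ticks=None):
    """Fixed-cadence heartbeat loop: each tick hands every pending task
    to the model; each result may spawn more tasks for the next tick.
    Returns the total number of LLM calls made."""
    calls = 0
    tick = 0
    while tasks and (max_ticks is None or tick < max_ticks):
        next_round = []
        for task in tasks:
            result = llm_call(task)        # one model round-trip per task
            calls += 1
            next_round.extend(result.get("spawned", []))
        tasks = next_round
        tick += 1
        if tasks:
            time.sleep(interval_s)         # wait for the next heartbeat
    return calls
```

The failure modes the parent describes fall out directly: cost scales with tasks × ticks regardless of whether anything changed, and one misbehaving task that keeps spawning sub-tasks keeps the loop burning tokens forever.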

Compounding this is memory. It is extremely primitive. Unfortunately even the most advanced RAG solutions out there are poor. LLMs are powerful because of the calculated weights between parametric knowledge; referring to non-parametric knowledge is incredibly inefficient. It's the difference between a wheelchair and a rocket ship. This compounds over time. Each time OpenClaw needs to "think" about anything, it preloads a huge amount of "memories" into the query: everything from your personal details to its architecture to the specific task. Something as simple as "what time is it" can chew through tens of thousands of tokens. Now consider what happens over time as the agent learns more and more about you. Does all of that get included in every single query? It eventually fails under its own weight.
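The cost blow-up is easy to see with a toy model. This sketch assumes nothing about OpenClaw's real prompt format, and `approx_tokens` uses a crude ~4-characters-per-token heuristic rather than a real tokenizer:

```python
def build_prompt(memories, question):
    """Naive 'preload everything' prompting: every query drags the whole
    memory store into context, so cost grows with the agent's history."""
    context = "\n".join(f"- {m}" for m in memories)
    return f"Known facts about the user:\n{context}\n\nQuestion: {question}"

def approx_tokens(text):
    # rough heuristic (~4 chars/token); a real tokenizer would differ
    return len(text) // 4
```

With a handful of memories, "what time is it?" is a few dozen tokens; with a couple of thousand accumulated facts, the same trivial question costs thousands of tokens, every single query.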

There is no elegant solution to this. You can "compress" previous knowledge, but this is very lossy and the LLMs do a terrible job of intelligently retaining the right stuff. RAG solutions are experimenting with intelligent routing. One method is an agentic memory feedback loop that seeks out knowledge which might exist. The problem is that this is circular and computationally intractable. Does the LLM always attempt to search every memory file in the hope that one of the .md files contains something useful? That's hopelessly slow. Does it try to infer based on weekly/monthly summaries? That has proven extremely error-prone.

At this point I think this will be first solved by OpenAI and/or Anthropic. They'll create a clean vectorised memory solution (likely a light LLM which can train itself in the background on a schedule) and a sustainable heartbeat cadence packaged into their existing apps. Anthropic is clearly taking cues from OpenClaw right now. In a couple of years we might have a competent open source agent solution. By then we might also have decent local LLMs to give us some privacy, because sending all my most intimate info to OpenAI doesn't feel great.

feigewalnuss an hour ago | parent | next [-]

Disclosure: I wrote the linked post.

Heartbeat cron and naive memory are the right thread to pull. Agree.

The problem is the data/trust boundary. One agent process, one credential store, all channels sharing both. Whenever we scale the memory up, which we all want to do, we scale the disaster radius of every prompt injection with it.

Wirken accounted for this in the first design step: per-channel process isolation; handshakes between adapters and the core; compile-time type constraints so a Discord adapter cannot construct a Telegram session handle; an encrypted credential vault; a hash-chained audit log of every action. All of it stays model-agnostic, so local models and confidential-compute providers are drop-in.
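Of the pieces listed, the hash-chained audit log is simple enough to sketch (this illustrates the general technique, not Wirken's implementation): each entry commits to the previous entry's hash, so rewriting any past action breaks verification of the whole chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel hash for the first entry

def append_entry(log, action):
    """Append an action to a hash-chained audit log. Each entry stores
    the previous entry's hash, so history cannot be silently edited."""
    prev = log[-1]["hash"] if log else GENESIS
    digest = hashlib.sha256(
        (prev + json.dumps(action, sort_keys=True)).encode()
    ).hexdigest()
    log.append({"action": action, "prev": prev, "hash": digest})
    return log

def verify_chain(log):
    """Recompute every hash from the genesis sentinel; any tampered
    entry (or broken link) makes verification fail."""
    prev = GENESIS
    for entry in log:
        expected = hashlib.sha256(
            (prev + json.dumps(entry["action"], sort_keys=True)).encode()
        ).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

This gives tamper evidence, not tamper prevention: an attacker who can rewrite the whole log can rebuild the chain, so in practice the head hash would also need to be anchored somewhere the agent can't write.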

Your memory point is still unsolved at this layer. When memory does get solved, you want the solver running where it cannot leak the wrong credentials to the wrong channel. Otherwise the smarter it gets, the worse the breach.

zozbot234 an hour ago | parent | prev [-]

> Compounding this is memory. It is extremely primitive. ... Now consider what happens over time as the agent learns more and more about you. Does that all get included in every single query? It eventually fails under its own weight.

Agentic coding has all of the same issues and it gets solved much the same way: give LLMs tool calls to file persistent memories by topic, list what topics are available (possibly with multiple levels of subtopics in turn) and retrieve them into the context when relevant. Not too different from what humans do with zettelkasten and the like.
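A minimal version of that tool surface might look like this (the class and file layout are invented for illustration): one tool lists the available topics, one appends a note, one pulls a single topic's notes into context, so only relevant memories ever reach the prompt.

```python
import os

class TopicMemory:
    """Topic-filed memory in the zettelkasten spirit: one markdown file
    per topic, exposed to the model as three small tools."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def list_topics(self):
        """Tool 1: cheap index the model can scan before retrieving."""
        return sorted(f[:-3] for f in os.listdir(self.root) if f.endswith(".md"))

    def remember(self, topic, note):
        """Tool 2: append a note to a topic file, creating it if needed."""
        with open(os.path.join(self.root, f"{topic}.md"), "a") as f:
            f.write(f"- {note}\n")

    def recall(self, topic):
        """Tool 3: load one topic's notes into context on demand."""
        path = os.path.join(self.root, f"{topic}.md")
        if not os.path.exists(path):
            return ""
        with open(path) as f:
            return f.read()
```

The point is that the per-query cost is bounded by the topic index plus whatever the model chooses to retrieve, rather than by the total size of everything the agent has ever learned.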

d0gsg0w00f 5 hours ago | parent | prev | next [-]

I have OC on a VPS. So far it's a way for me to play with non-Claude models and try to get them to bring OC under control. I'm about $200 all in and OC is still not under control. Every few weeks it goes on an ACP bender and blows my credits in hidden sub-agents for no damn reason. I'm determined to break this horse though; it's like a fun video game with a glitchy end boss.

valeena 4 hours ago | parent [-]

How long have you been using it for it to have consumed $200? That sounds like a lot to me (I'm still a student), but it doesn't seem to be for you.

Someone 4 hours ago | parent | prev | next [-]

In the early 1980s, what did people use home computers such as Ataris and Commodore 64s for? Mostly playing games; nerds also used their computer with the goal seeming to be… using their computer.

It wasn’t (only) that, though; they also learned, so that, when people could afford to buy computers that were really useful, there were people who could write useful programs, administer them, etc.

Same thing with 3D printers a decade or so ago. What did people use them for? Mostly tinkering with hard- and software for days to finally get them to print some teapot or rabbit they didn’t need or another 3D printer.

This _may_ be similar, with OpenClaw-like setups eventually getting really useful and safe enough for mere mortals.

But yes, the risks are way larger than in those cases.

Also, I think there are safer ways to gain the necessary expertise.

enoint an hour ago | parent [-]

Back in the early 80s, some people used home computers as seriously as they used work computers:

- organize follow-up reminders for business calls. Automate a modem-based upload.

- crunch investment options in commodities. Not in an econometric way, but a table listing which analyst said what and which analyst was silent. Automate a modem-based upload.

So, with regard to the article, we can presume the author did claw-like things with DOS. Now that he's older, he probably needs to organize many trips to doctors and specialists. Who is doing all that administration for your older folks?

SlinkyOnStairs 5 hours ago | parent | prev | next [-]

The main "sales pitch" appears to be "you can have the computer do things for you without having to learn how to use a computer" (at the cost of now having to learn how to use a massively overcomplicated and fundamentally unreliable system; it's just an illusion of ease of use).

The article linked in this thread draws a security comparison with MS-DOS, but the comparison works on another level as well: I remember MS-DOS, when the very idea of the home/office computer was new, when regular people learned how to use these computers.

All this pretension that computers are "hard to use" and that LLMs are making the impossible possible is ahistorical nonsense. "It would've taken me months!" No, you would've just had to spend a day or two learning the basics of Python.

stared 5 hours ago | parent [-]

I was one of those using MS-DOS (I still remember the blue Norton Commander). I didn't understand people mocking it later, as it just worked. Enough to run Prince of Persia, Doom, and the like. Or to edit text files. (In my defence, I was only ~7 years old back then.)

Havoc 4 hours ago | parent | prev | next [-]

It's basically a reimagined n8n-like low-code platform with LLM magic. Digital glue.

That's why there isn't a coherent use story: like glue, the answer is whatever the user needs to glue together or get done.

leonidasrup 5 hours ago | parent | prev | next [-]

OpenClaw, the ultimate arbitrary code execution

classified 4 hours ago | parent [-]

Didn't you always want to let everyone else do remote code execution on your computer?

thenthenthen 5 hours ago | parent | prev [-]

To me OpenClaw sounds like a software click farm?