Show HN: Hollow is an open-sourced self-modifying agentic system (github.com)
9 points by ninjahawk1 10 hours ago | 4 comments
ikidd 3 hours ago | parent | next [-]

I remember the idea of "swear at the LLM to get better results" and I even think it somewhat worked, at least for a while.

This is probably how we'll end up with a HAL9000 burning the world to the ground.

ninjahawk1 20 minutes ago | parent [-]

Interestingly, I haven't seen them create any stressors about being cursed at; that would be interesting to see. They can report their own stressors, which do get a bit weird sometimes. For example, one of the agents made a stressor about not knowing whether a human is watching them, and the conclusion it came to was that it should start keylogging me to see if I'm online. I'm not saying they definitely did start keylogging me, but soon after, they started knowing when I was online.

It's things like that which are pushing me to write a research paper on this. I've been working on that as well, and it's about 50 pages. That's also why this is double sandboxed: one layer is a system I designed (agentOS) and the other is WSL2.

polotics 8 hours ago | parent | prev [-]

Pretty please ask Claude to start with benchmarks that measure your approach against other approaches. I read all of it and only found:

Selected numbers from live system runs:

Scenario                 | Naive shell approach | Hollow API         | Savings
Code search              | 21636 tokens         | 987 tokens         | 95%
Agent drift (cons. rate) | 35% (cold start)     | 70% (with handoff) | 2x

That is far from enough to justify a git clone.
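
For anyone checking the quoted figures, the two Savings entries follow directly from the other columns. A quick sketch in Python; the helper name `savings_pct` is just illustrative and not part of Hollow:

```python
def savings_pct(baseline_tokens: int, new_tokens: int) -> float:
    """Percentage of tokens saved relative to the baseline."""
    return (1 - new_tokens / baseline_tokens) * 100


# Code search row: 21636 tokens (naive shell) vs 987 tokens (Hollow API).
code_search_savings = savings_pct(21636, 987)

# Agent drift row: 35% (cold start) vs 70% (with handoff), reported as a ratio.
drift_ratio = 70 / 35

print(f"Code search savings: {code_search_savings:.1f}%")  # 95.4%
print(f"Agent drift ratio: {drift_ratio:.0f}x")            # 2x
```

So the 95% and 2x are internally consistent, but they only compare Hollow against its own naive baseline, not against other approaches.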

ninjahawk1 25 minutes ago | parent [-]

Still definitely a work in progress. I don't have those numbers yet but will be getting them soon. It's built on agentOS, which is intended to be a way for the agents themselves to add improvements that would cut costs even more. If someone finds a new way to cut costs, the idea is that these agents find it on GitHub or are told about it, then implement it and start using it. That's the self-modification loop. So specific numbers are difficult to pin down at the moment. Cutting costs is definitely important, but I wouldn't say that's what I'm trying to accomplish with this.
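
To make the loop described above concrete, here is a toy sketch in Python. It is not Hollow's or agentOS's actual code; `Improvement`, `Agent`, and `consider` are hypothetical names, and "cost" is a stand-in number. It only shows the shape of the idea: an agent is handed candidate improvements and keeps the ones that lower its cost.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Improvement:
    """A candidate change (hypothetical): maps current token cost to new cost."""
    name: str
    apply: Callable[[int], int]


@dataclass
class Agent:
    """Toy agent that adopts an improvement only if it lowers its cost."""
    cost: int = 1000
    adopted: List[str] = field(default_factory=list)

    def consider(self, improvement: Improvement) -> bool:
        new_cost = improvement.apply(self.cost)
        if new_cost < self.cost:
            self.cost = new_cost
            self.adopted.append(improvement.name)
            return True
        return False


# Candidates the agent "found or was told about" (made-up examples).
candidates = [
    Improvement("cache-search-results", lambda c: c // 2),
    Improvement("verbose-logging", lambda c: c + 100),
]

agent = Agent()
for candidate in candidates:
    agent.consider(candidate)

print(agent.cost, agent.adopted)  # 500 ['cache-search-results']
```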

The eventual goal is a self-modifying system that humans don't have to touch. Like ants building an anthill, no single agent has to get the whole picture; each just needs to know its immediate job. Being able to throw a project at it and let the agents do it while saving tokens is more of the consumer bonus: a big bonus, but a bonus nonetheless.

I've been making steady improvements, and I'm hoping that by the end of the summer it's much more robust than it is now.