Agent design is still hard (lucumr.pocoo.org)
60 points by the_mitsuhiko 3 hours ago | 15 comments
mritchie712 28 minutes ago
Some things we've[0] learned on agent design:

1. If your agent needs to write a lot of code, it's really hard to beat Claude Code (cc) / the Agent SDK. We've tried many approaches and frameworks over the past 2 years (e.g. PydanticAI), but using cc is the first that has felt magic.

2. Vendor lock-in is a risk, but the bigger risk is having an agent that is less capable than what a user gets out of ChatGPT because you're hand-rolling every aspect of your agent.

3. cc is incredibly self-aware. When you ask cc how to do something in cc, it instantly nails it. If you ask cc how to do something in framework xyz, it will take much more effort.

4. Give your agent a computer to use. We use e2b.dev, but Modal is great too. When the agent has a computer, many complex features feel simple.

0 - For context, Definite (https://www.definite.app/) is a data platform with agents to operate it. It's like Heroku for data, with a staff of AI data engineers and analysts.
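A minimal sketch of what "give your agent a computer" can look like (the tool name and wiring here are hypothetical; a real setup would route the command to an isolated sandbox like e2b or Modal rather than the host machine):

```python
import subprocess

def run_shell(command: str, timeout: int = 30) -> dict:
    """Hypothetical 'computer' tool handler exposed to the agent.
    Executes a shell command and returns structured output the model
    can read. In production this would run inside a sandboxed VM,
    never on the host as it does in this local sketch."""
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return {
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "exit_code": proc.returncode,
    }

# With a single general-purpose tool like this, the model composes
# arbitrary workflows (install a package, run a script, inspect a
# file) instead of needing one bespoke tool per feature.
result = run_shell("echo hello from the agent's computer")
print(result["exit_code"], result["stdout"].strip())
```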
postalcoder an hour ago
I've been building agent-type stuff for a couple of years now, and the best thing I did was build my own framework and abstractions that I know like the back of my hand. I'd steer clear of any LLM abstraction. There are so many companies with open-source abstractions offering the panacea of a single interface, crumbling under their own weight from the sheer futility of supporting every permutation of every SDK evolution, all while those same companies try to build revenue-generating businesses on top of them.
_pdp_ 6 minutes ago
I started a company in this space about 2 years ago. We are doing fine. What we've learned so far is that a lot of these techniques are simply optimisations to tackle some deficiency in LLMs that is a problem "today". These are not going to be problems tomorrow, because the technology will shift, as it has many times over the last 2 years. So yeah, cool, caching and all of that... but give it a couple of months and a better technique will come out, or more capable models.

Many years ago, when disc encryption on AWS was not an option, my team and I spent 3 months coming up with a way to encrypt the discs, and doing it well, because at the time there was no standard way. It was very difficult, as it required pushing encrypted images (as far as I remember). Soon after we started, AWS introduced standard disc encryption that you can turn on by clicking a button. We wasted 3 months for nothing. We should have waited!

What I've learned from this is that oftentimes it is better to do absolutely nothing.
CuriouslyC 19 minutes ago
The 'Reinforcement in the Agent Loop' section is a big deal. I use this pattern to enable async/event-steered agents, and it's super powerful. In long contexts you can use it to re-inject key points ("reminders"), etc.
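A minimal sketch of the reminder re-injection idea (all names and the stub model are hypothetical): between turns, the harness appends a synthetic message re-stating key constraints so they stay in recent context during long runs.

```python
REMINDER = "Reminder: stay within ./workspace and keep diffs minimal."

def agent_loop(messages, call_model, max_turns=10, remind_every=3):
    """Run a simple agent loop, periodically re-injecting a reminder
    as a fresh user-role message so the model treats it as current
    steering rather than stale instructions buried in history."""
    for turn in range(max_turns):
        if turn > 0 and turn % remind_every == 0:
            messages.append({"role": "user", "content": REMINDER})
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
        if "DONE" in reply:
            break
    return messages

# Stub model to show the mechanics without a real API: it "finishes"
# once the transcript grows past a fixed length.
def fake_model(messages):
    return "DONE" if len(messages) >= 8 else "working..."

log = agent_loop([{"role": "user", "content": "refactor the parser"}], fake_model)
print(sum(1 for m in log if m["content"] == REMINDER))  # reminders injected
```

The same hook point works for event-steered agents: instead of a fixed reminder, the harness injects whatever external event arrived since the last turn.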
srameshc 17 minutes ago
I still feel there is no sure-shot way to build an abstraction yet. Probably that is why Loveable decided to build on Gemini rather than offering a choice of models. On the other hand, I like the Pydantic AI framework and got myself a decent working solution where my preference is to stick with cheaper models by default and only use expensive ones in cases where the failure rate is too high.
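The routing policy described above can be sketched like this (model names, task types, and the threshold are illustrative, not from any real system): default to a cheap model and escalate only for task types whose observed failure rate is too high.

```python
CHEAP, EXPENSIVE = "small-model", "large-model"
FAILURE_THRESHOLD = 0.2

# Observed failure rates per task type, e.g. aggregated from eval logs.
failure_rates = {"summarize": 0.05, "sql_generation": 0.35}

def pick_model(task_type: str) -> str:
    """Route to the expensive model only when the cheap one has been
    failing too often on this task type; unknown task types default
    to the cheap model until there is data."""
    rate = failure_rates.get(task_type, 0.0)
    return EXPENSIVE if rate > FAILURE_THRESHOLD else CHEAP

print(pick_model("summarize"))       # stays on the cheap model
print(pick_model("sql_generation"))  # escalates to the expensive model
```

The useful property is that the escalation decision is driven by measured failures rather than hard-coded per feature, so the default cost stays low as the cheap models improve.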
Fiveplus 21 minutes ago
I liked reading this, but I have a silly question, as I'm a noob at these things. If explicit caching is better, does that mean the agent is just forgetting stuff unless we manually save its notes? Are these things really that forgetful? Also, why is there a virtual file system? So the agent is basically just running around a tiny digital desktop looking for its files? Why can't the agent just know where the data is? I'm sorry if these are juvenile questions.