jaen 4 hours ago

People building these Rube Goldberg contraptions: do you actually run evaluations to check whether this is any better at all than, e.g., giving it access to a Python REPL, or just toughing it out with random tools composed via shell scripts?

Why would an LLM be better trained to access Redis via a FS vs. a native library API?

Makes no sense.

sgbeal an hour ago | parent | next [-]

> Why would an LLM be better trained to access Redis via a FS vs. a native library API?

Limiting the potential blast radius.

If you give an agent "access to a Python REPL" (your words), you're giving it access to all of Python, i.e. you're paving the road to your own destruction when the agent goes awry. In the case of a Python interpreter, you're basically handing it an eight-lane highway upon which all sorts of pile-ups and havoc can happen.

By limiting its access to specific operations via well-defined endpoints (which is what the AGFS approach is), you're trimming that eight-lane highway back to a bicycle path.
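
To make the "bicycle path" idea concrete, here is a rough sketch. It is not AGFS's actual API: the class name `KVEndpoint`, the `/kv/` prefix, and the three operation names are all made up for illustration, and a plain dict stands in for Redis. The point is only that the agent gets a small allowlist of operations on a scoped set of paths, and everything else is rejected.

```python
class KVEndpoint:
    """Hypothetical file-like facade over a key-value store.

    A dict stands in for Redis here; nothing below is a real AGFS or Redis API.
    """

    # The only operations the agent is allowed to request.
    ALLOWED_OPS = {"read", "write", "list"}

    def __init__(self, root="/kv/"):
        self._root = root
        self._data = {}

    def _key(self, path):
        # Reject anything outside the exposed prefix, including ".." tricks.
        if not path.startswith(self._root) or ".." in path:
            raise PermissionError(f"path outside {self._root}: {path}")
        return path[len(self._root):]

    def handle(self, op, path, value=None):
        # Check the allowlist first, so unknown operations never reach the store.
        if op not in self.ALLOWED_OPS:
            raise PermissionError(f"operation not allowed: {op}")
        key = self._key(path)
        if op == "read":
            return self._data.get(key)
        if op == "write":
            self._data[key] = value
            return None
        # op == "list": return the stored keys under the given prefix.
        return sorted(k for k in self._data if k.startswith(key))


endpoint = KVEndpoint()
endpoint.handle("write", "/kv/user/1", "alice")
print(endpoint.handle("read", "/kv/user/1"))
print(endpoint.handle("list", "/kv/user/"))

try:
    endpoint.handle("exec", "/kv/user/1")  # not in the allowlist
except PermissionError as err:
    print("rejected:", err)
```

With a REPL, the equivalent of that last call would simply run. Here the worst an agent can do is read, write, or list keys under one prefix.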

dmos62 3 hours ago | parent | prev [-]

Why are you upset?