charcircuit 6 hours ago

>We have found zero performance benefit on SWE tasks when agents have search access to their previous transcript sessions

I refuse to believe this is true. The ability for an agent to find information from before a compaction is incredibly useful. At compaction time, it's impossible to know exactly what may still be needed.

theahura 5 hours ago | parent | next [-]

With the million-token-context-window models, we never hit compaction across hundreds of observed sessions. What are you doing that has you hitting compaction regularly?

charcircuit 3 hours ago | parent [-]

For me, logs can chew through a lot of tokens. And when the agent is trying a bunch of different experiments, it may need to refer back to what happened previously.

Million-token-context models are also still not effective across their entire context window.
