>We have found zero performance benefit on SWE tasks when agents have search access to their previous transcript sessions

I refuse to believe this is true. The ability for an agent to find information from before a compaction is incredibly useful. At compaction time it's impossible to know what exactly may be still needed.

▲

theahura 5 hours ago | parent | next [-]

With the million-context-window models we never hit compaction, observed over hundreds of sessions. What are you doing that has you hitting compaction regularly?

	▲	charcircuit 3 hours ago \| parent [-]
		For me logs can chew through a lot of tokens. And when the agent is trying a bunch of different experiments and then it may need to refer to what happened previously. Million context models also are still not effective for the entire context size.

▲

5 hours ago | parent | prev [-]

[deleted]