I feel like what is needed is not compression, but aggressive context management with subagents.

So we burn more tokens to save more tokens: spend more on X tokens but save on Y tokens?

Now the question is: which X tokens and which Y tokens? And since the output is non-deterministic, how do you validate this?

LLMs aren't random in the usual sense, and that points at something people fail to realize: randomness could be normally distributed, but LLM outputs have no reason to be normally distributed or to follow any sort of curve.

They are non-deterministic but with bias, so their output might just be worse under a transformation T' for the class of problems A is solving but work great for B, or vice versa.
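
A minimal sketch of how one might check this per problem class, rather than with one aggregate score: run repeated trials for each (class, variant) pair and compare pass rates. Everything here is an assumption for illustration. `fake_trial` is a hypothetical stand-in for "call the model and grade the answer", and the success probabilities are made up to mimic a class-dependent bias.

```python
import random

def pass_rates(run_trial, problem_classes, variants, n_trials=500, seed=0):
    """Return {(problem_class, variant): pass_rate} over repeated trials."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rates = {}
    for cls in problem_classes:
        for variant in variants:
            passes = sum(run_trial(cls, variant, rng) for _ in range(n_trials))
            rates[(cls, variant)] = passes / n_trials
    return rates

# Hypothetical success probabilities, invented for this example only.
FAKE_SUCCESS_PROB = {
    ("A", "baseline"): 0.80, ("A", "transformed"): 0.55,
    ("B", "baseline"): 0.60, ("B", "transformed"): 0.85,
}

def fake_trial(cls, variant, rng):
    """Stand-in for one model call plus grading; returns True on a pass."""
    return rng.random() < FAKE_SUCCESS_PROB[(cls, variant)]

rates = pass_rates(fake_trial, ["A", "B"], ["baseline", "transformed"])
for key in sorted(rates):
    print(key, round(rates[key], 2))
```

With a real model the seed would not make the runs reproducible, so the per-class pass rates would need enough trials to separate a real difference from noise.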

You can't reproducibly test LLMs, and that allows all sorts of benchmarks to exist which can make any model look as good or as bad as we want. Enlightening stuff.

Not much different from sociological or psychological sciences where with enough bias in data you can prove anything.

lackoftactics 8 hours ago | parent | prev | next [-]

I am the author of the text.

What do you mean by aggressive context management with subagents? Would you add a loop that would trim the context?

Both of those tasks seem even more difficult.

SubiculumCode 7 hours ago | parent | next [-]

First, I only say this because of what I learned as a PhD in human memory, not as someone who authors agentic workflows or does AI.

Human cognition tends to work by simultaneously using and combining/separating multiple frequency scales of information. A simple way of thinking about it is this: we tend to encode and retrieve both the gist of what is happening and the verbatim details of what happened. The gist can be thought of as low-frequency information, almost like bullet points, that contains the big overview (goal, key points). The verbatim traces are the high-resolution memory that contains all the details. The gist helps encoding and recall by providing encoding and retrieval context cues. There are also levels in between those two, but I am keeping it simple. During human development, verbatim memory capacity increases first, but then hits a wall/plateau. Further performance increases begin to depend on the ability to use and gain from gist-like representations that can guide encoding and retrieval of verbatim details within contexts.

You don't need to keep everything in the context window. My untested, perhaps naive hypothesis is that sub-agents dealing with verbatim tasks (actually writing code) should have their context windows managed by an agent above them that is tuned to lower-frequency information, and that agent in turn by another above it tuned to even lower-frequency information. The lowest-frequency context windows fill up slowly; high-frequency ones fill up fast. Use the low-frequency information to retrieve the needed high-frequency information.
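
A rough sketch of that idea with two levels, under stated assumptions: `GistContext`, `VerbatimStore`, and `build_subagent_context` are hypothetical names, and naive keyword overlap stands in for whatever retrieval a real harness would use. The coordinator keeps only one short gist line per chunk, and the sub-agent receives only the verbatim chunks those gists point to.

```python
from dataclasses import dataclass, field

@dataclass
class VerbatimStore:
    """High-frequency detail, kept outside any context window."""
    chunks: dict = field(default_factory=dict)

    def put(self, key, text):
        self.chunks[key] = text

    def get(self, key):
        return self.chunks[key]

@dataclass
class GistContext:
    """Low-frequency context: one short line per chunk, so it fills up slowly."""
    entries: dict = field(default_factory=dict)

    def add(self, key, gist):
        self.entries[key] = gist

    def lookup(self, query):
        # Naive keyword overlap; an assumption standing in for real retrieval.
        words = set(query.lower().split())
        return [k for k, g in self.entries.items()
                if words & set(g.lower().split())]

def build_subagent_context(task, gist, store):
    """Use the gist level to pull only the verbatim chunks the task needs."""
    return [store.get(k) for k in gist.lookup(task)]

store = VerbatimStore()
gist = GistContext()
store.put("auth.py", "def login(user, password): ...")
gist.add("auth.py", "login and session handling")
store.put("billing.py", "def invoice_total(items): ...")
gist.add("billing.py", "invoice totals and tax")

context = build_subagent_context("fix login bug", gist, store)
print(context)
```

Here the coordinator never holds the file contents itself; it holds the gists and hands the sub-agent only the login chunk.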

skinfaxi 7 hours ago | parent | prev [-]

I believe they mean aggressive delegation to minimize context bloat in the coordinating agent.

	svachalek 5 hours ago | parent | next [-]
		This is a really useful technique in my experience. The harnesses are starting to do it more on their own but if you encourage the use of more subagents, I find it's typically nothing but win.
	lackoftactics 7 hours ago | parent | prev [-]
		That would make more sense; trimming context with subagents sounds like overkill.

arcanemachiner 7 hours ago | parent | prev [-]

Use the right tool for the job.

If you need a piece of information that is buried somewhere, or a high-level summary/distillation of a larger body of info, then subagents may be the right tool for the job.

If you need all the gathered context for later use (i.e. distilled context is insufficient), then subagents probably are not the right tool for the job.

cyanydeez 7 hours ago | parent [-]

If your codebase requires a million tokens, then you're probably going to break more than you fix.

	arcanemachiner 7 hours ago | parent [-]
		If you are using a million tokens in a single context window, you are using the entire toolbox incorrectly.