Isn’t this just a form of the bitter lesson? Our attempts to make engineered context and agents will simply be made obsolete with bigger and better models. Those transcripts are probably extremely useful for lesser capable models, and near unnecessary for frontier ones, maybe?

▲

andai 6 hours ago | parent | next [-]

Yeah, the question is whether this applies to all of context management.

I've been using a custom harness based on https://minimal-agent.com/ (itself based on swe-mini-agent), which is like 50 lines for the core logic. Bash is all you need.

For small tasks, I find it's about 8x faster (and uses 8x fewer tokens) than the standard harness for each model.

For bigger tasks I haven't tested it much. It seems to work too but I think they're a bit less focused and productive in that case. It could be that those big harnesses' 20k token system prompts are doing something important with regard to steering software development workflows. (e.g. I heard Fable has a custom system prompt in Claude Code which might explain its markedly more proactive behavior.)

So I want to say there's still a lot of value in context engineering though it seems to diminish with each model release (since they're fine tuned on mostly non stupid behavior and need less hand holding).

	▲	sdesol 6 hours ago \| parent \| next [-]
		> So I want to say there's still a lot of value in context engineering though it seems to diminish with each model release I can't see how it would diminish unless you are literally working on public domain stuff. Unless stuffing context becomes cost effective and will not affect AI reasoning (this will be much harder), I don't see why context engineering is here to stay until we have close to AGI.
	▲	irthomasthomas 5 hours ago \| parent \| prev [-]
		In think in all cases where I've seen it compared CC performed worse than a minimal harness.

▲

theahura 5 hours ago | parent | prev | next [-]

interesting take. I think I disagree, but I like this take a lot and I had to think about it.

First, I think that models still need a context layer. One way to think about 'context' is as a form of compression. You provide the model context because it makes it easier for the model to figure out what to do. Even in a world with infinite model capacity and infinite model context, this is still useful because it allows the model to avoid rederiving everything from first principles every time. As long as models perform better using fewer tokens and as long as we care about token spend, context is a useful (necessary?) shortcut.

Once you bite that you need some form of context layer, the question is which. Here I do agree that it is better to work with what the models will find familiar (markdown files colocated with code, for eg). But this speaks to over-engineered solutions not understanding their main user (the agent) more than it does the need or lack there of.

▲

general_reveal 5 hours ago | parent [-]

A) Context and prompting cuts the search space for next token generation. That’s pretty useful, as you mentioned.

B) The other use of context is that it introduces entirely new information via RAG

B will never go away (as others pointed out). A, well that’s just something we’re all going to keep getting surprised at. We’ll barely give it any direction or context and the newer models will simply find the happy path.

The author is kind of suggesting that their context wasn’t really necessary to get the happy output, I think.

Chain of reasoning is a lot of context to guide token generation, but we simply see that newer models don’t need that context to get to the answer. I’m mostly reiterating this because there’s a hot take here, and that is this agentic stuff may be waived away by magic frontier-llm wand , all of a sudden.

▲

irthomasthomas 4 hours ago | parent | next [-]

>Chain of reasoning is a lot of context to guide token generation, but we simply see that newer models don’t need that context to get to the answer

I thought each new generation typically used more reasoning tokens?

	▲	general_reveal 3 hours ago \| parent [-]
		They do if they are a reasoning-variant. That doesn’t necessarily mean it actually needed to reason for many questions, your prompt + regular context could be enough to get a good answer compared to prior models where you’d absolutely have to put it into a reasoning-loop to get an accurate answer. It’s on by default, in a way. You can probably prompt these models with “and don’t reason about it, just give me the answer” and probably get a comparably good response without it using reasoning tokens for many things.

▲

theahura 5 hours ago | parent | prev [-]

(note that I am the author!)

▲

Xcelerate 6 hours ago | parent | prev | next [-]

I've wondered this. We have chain-of-thought, harnesses, etc. — workarounds of a sort due to lack of core model capabilities. But I am very curious if much better next token prediction would simply obsolete that whole setup or not. Either way, the answer would be very revealing.

▲

HarHarVeryFunny 6 hours ago | parent | prev [-]

I don't think so - I think we'll find that to build a brain you need more built-in structure and biases, not less.

Bear in mind that brain architecture is learnt too - just over a much longer timescale than an individual lifetime.