machinecontrol 7 hours ago

The trend is obviously towards larger and larger context windows. We moved from 200K to 1M tokens being standard just this year.

This might be a complete non-issue in 6 months.

hrmtst93837 5 hours ago | parent | next [-]

Those bigger windows come with lovely surcharges on compute, latency, and prompt complexity, so "just wait for more tokens" is a nice fantasy that melts the moment someone has to pay the bill. If your use case is tiny or your budget is infinite, fine, but for everyone else the "make the window bigger" crowd sounds like they're budgeting by credit card. Quality still falls off near the edge.

amzil 6 hours ago | parent | prev [-]

Context windows getting bigger doesn't make the economics go away. Tokens still cost money. 50K tokens of schemas at 1M context is the same dollar cost as 50K tokens at 200K context, you just have more room left over.
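Back-of-envelope in Python, using an assumed price of $3 per million input tokens (illustrative, not any vendor's actual rate):

```python
# Input-token cost depends on how many tokens you send, not on the window size.
# PRICE_PER_MTOK is an assumed illustrative rate, not a real vendor price.
PRICE_PER_MTOK = 3.00

def input_cost(tokens: int) -> float:
    """Dollar cost of sending `tokens` input tokens at the assumed rate."""
    return tokens / 1_000_000 * PRICE_PER_MTOK

# 50K tokens of schemas costs the same whether the window is 200K or 1M:
print(f"${input_cost(50_000):.2f}")  # $0.15 either way
```

The window size only changes how much room is left over, never the per-token bill.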

The pattern with every resource expansion is the same: usage scales to fill it. Bigger windows mean more integrations connected, not leaner ones. Progressive disclosure is cheaper at any window size.
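A minimal sketch of what progressive disclosure means here — expose cheap one-line summaries up front and fetch a tool's full schema only when the task actually needs it. All names below (`TOOL_CATALOG`, `load_schema`, the example tools) are made up for illustration, not any real MCP API:

```python
# Progressive disclosure sketch: a cheap catalog first, full schemas on demand.
# Everything here is hypothetical; the point is the shape, not the names.

TOOL_CATALOG = {
    "search_orders": "Look up orders by customer or date range.",
    "refund_order": "Issue a refund for a given order ID.",
    # ...dozens more one-line summaries, a few tokens each
}

FULL_SCHEMAS = {
    "search_orders": {"parameters": {"customer_id": "string", "since": "date"}},
    "refund_order": {"parameters": {"order_id": "string", "amount": "number"}},
}

def initial_prompt_tools() -> dict:
    """Cheap: one-line summaries go into the prompt, not full JSON schemas."""
    return TOOL_CATALOG

def load_schema(name: str) -> dict:
    """Expensive detail, fetched only for the tools this task touches."""
    return FULL_SCHEMAS[name]
```

The catalog stays a few hundred tokens regardless of how many integrations are connected; only the two or three schemas in active use pay full freight.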

magospietato 6 hours ago | parent [-]

Context caching deals with a lot of the cost argument here.
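Rough numbers on that, assuming a 90% discount on cached prefix reads (in the ballpark of current vendor pricing, but the rates here are illustrative):

```python
# Effect of prompt caching on a repeated prefix (e.g. 50K tokens of schemas).
# Rates are assumptions for illustration: $3/M fresh, 90% off on cache hits.
FRESH = 3.00 / 1_000_000
CACHED = FRESH * 0.10

def turn_cost(prefix_tokens: int, new_tokens: int, cache_hit: bool) -> float:
    """Cost of one turn: the shared prefix plus this turn's fresh tokens."""
    prefix_rate = CACHED if cache_hit else FRESH
    return prefix_tokens * prefix_rate + new_tokens * FRESH

first = turn_cost(50_000, 2_000, cache_hit=False)  # pays full price once
later = turn_cost(50_000, 2_000, cache_hit=True)   # prefix at the discount
```

Under these assumed rates the cached turns cost a fraction of the first one, which is the cost argument in a nutshell.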

amzil 6 hours ago | parent [-]

It helps with cost, agreed. But caching doesn't fix the other two problems.

1) Models get worse at reasoning as the context fills up, cached or not. 2) The usage-expansion problem still holds: cheaper context means teams connect more services, not fewer. You cache 50K tokens of schemas today, then it's 200K tomorrow because you can "afford" it now. The bloat scales with the budget.

Caching makes MCP more viable. It doesn't make loading 43 tool definitions for a task that uses two of them a good architecture.