blagui 2 hours ago
How can you do dev work in 2026 with a 64k context and no sub-agents? The benchmark seemed fine until I saw that. If you use sub-agents, they will overwrite the cache and every request will trigger full reprocessing. Have fun with that, as it will tank the t/s metrics on each prefill. On top of that, the 64k maximum covers input plus output, which is a major blocker. If you push the context higher and add parallel slots, the requirements will be far higher and the numbers less shiny.
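
To put rough numbers on that last point, here is a back-of-the-envelope sketch of how KV cache memory grows with context length and parallel slots. The model shape (32 layers, 8 KV heads, head dimension 128, fp16 cache) is a made-up example, not the benchmarked model, and it assumes every slot gets its own full context window; some servers split one context budget across slots instead.

```python
# Rough KV cache size estimate for a standard transformer with grouped-query attention.
# All model dimensions below are hypothetical, chosen only for illustration.

GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_tokens,
                   slots=1, bytes_per_elem=2):
    """Bytes needed to cache keys and values for `slots` full contexts."""
    # Factor of 2: one tensor for keys, one for values, per layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * ctx_tokens * slots


# Single slot at 64k context, fp16 cache.
base = kv_cache_bytes(32, 8, 128, 64 * 1024)

# Assumed "agentic" setup: 256k context and 4 parallel slots.
bigger = kv_cache_bytes(32, 8, 128, 256 * 1024, slots=4)

print(f"64k x 1 slot:   {base / GIB:.1f} GiB")    # 8.0 GiB
print(f"256k x 4 slots: {bigger / GIB:.1f} GiB")  # 128.0 GiB
```

Under those assumptions the cache alone goes from 8 GiB to 128 GiB, before counting the weights, which is why the published numbers would look less shiny.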