| ▲ | cyanydeez 3 hours ago | |
I don't really think you're making reasonable decisions at that size, but I suppose if you're not allowed to refactor it, maybe. I think the way these models work rules out sane behavior as the context gets larger, since each token introduces potential ambiguity between "USER" and "SYSTEM" messages, which leads to all the catastrophic behaviors. Anyway, with AMD395+ I'm finding ~100k is about the limit for both speed and context usefulness unless the work is scoped tightly. With opencode, I manage it with dynamic context pruning: https://github.com/Opencode-DCP/opencode-dynamic-context-pru... ; then anything I touch ends up being refactored so the context doesn't get bloated with unnecessary functions, etc. Obviously, this isn't compatible with certain business codebases, so I can see why bloat meets bloat. | ||
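For anyone wondering what budget-based pruning means in practice, here is a rough sketch. It is not how the linked plugin works (I haven't reproduced its logic here); `Message`, `estimate_tokens`, and `prune_context` are hypothetical names, and the 4-characters-per-token estimate is a crude assumption standing in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str     # e.g. "system", "user", "assistant", "tool"
    content: str


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def prune_context(messages, budget=100_000):
    """Drop the oldest non-system messages until the estimate fits the budget.

    System messages are always kept, and the relative order of the
    surviving messages is preserved.
    """
    kept = list(messages)
    total = sum(estimate_tokens(m.content) for m in kept)
    i = 0
    while total > budget and i < len(kept):
        if kept[i].role == "system":
            i += 1  # never drop system messages
            continue
        total -= estimate_tokens(kept[i].content)
        del kept[i]  # the next-oldest message shifts into position i
    return kept
```

Dropping oldest-first is the simplest policy; a real pruner would more likely drop stale tool output or superseded file reads first, which is where keeping the codebase small pays off.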