| ▲ | troupo 2 days ago |
| > Opus 4.5 really is at a new tier however. It just...works. Literally tried it yesterday. I didn't see a single difference with whatever model Claude Code was using two months ago. Same crippled context window. Same "I'll read 10 irrelevant lines from a file", same random changes etc. |
|
| ▲ | EMM_386 2 days ago | parent | next [-] |
| The context window isn't "crippled". Put your task in a markdown document (or use CLAUDE.md), switch to "plan mode", which lets Claude use tool calls to ask questions before it generates the plan. When it finishes one part of the plan, have it create another markdown document ("progress.md" or whatever) containing the whole plan and what is completed at that point. Type /clear (no more context window), then tell Claude to read the two documents. Repeat until even a massive project is complete, with those two markdown documents and no context window issues. |
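| A sketch of what that second, living document might look like (the name "progress.md" and everything in it are illustrative, not anything Claude Code prescribes): |

```markdown
# Task: add CSV export (full task description in CLAUDE.md)

## Plan
- [x] Phase 1: serializer in `export/csv.rs`
- [x] Phase 2: wire up the download endpoint
- [ ] Phase 3: streaming for large result sets

## Notes for later phases
- Phase 3 should reuse the row iterator from Phase 1; see `export/csv.rs::rows()`.
```

| After /clear, "read CLAUDE.md and progress.md" restores this distilled state instead of the full transcript, so each cycle starts with a near-empty context. |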
| |
| ▲ | troupo 2 days ago | parent [-] |
| > The context window isn't "crippled". |
| ... Proceeds to explain how it's crippled and all the workarounds you have to do to make it less crippled. |
| ▲ | EMM_386 2 days ago | parent [-] |
| > ... Proceeds to explain how it's crippled and all the workarounds you have to do to make it less crippled. |
| No, that's not what I did. You don't need an extra-long context full of irrelevant tokens. Claude doesn't need to see the code it implemented 40 steps ago in a working method from Phase 1 if it is on Phase 3 and not using that method. It doesn't need reasoning traces for things it already "thought" through. All that extra information is clutter, not help; it makes the signal-to-noise ratio worse. If Claude needs something from Phase 1 during Phase 4, it will have left a note in the living markdown document and can simply find it again when it needs it. |
| ▲ | troupo 2 days ago | parent [-] |
| Again, you're basically explaining how Claude has a very short, limited context and you have to implement multiple workarounds to "prevent cluttering". Aka: keep the context as small as possible, restart it often, feed it only small amounts of relevant information. Which is what I very succinctly called "crippled context", despite claims that Opus 4.5 is somehow "next tier". It's all the same techniques we've been using for over a year now. |
| ▲ | scotty79 2 days ago | parent [-] |
| Context is short-term memory. Yours is even more limited, and yet somehow you get by. |
| ▲ | troupo 2 days ago | parent [-] |
| I get by because I also have long-term memory, and experience, and I can learn. LLMs have none of that, and every new session is rebuilding the world anew. And even my short-term memory is significantly larger than the at most 50% of the 200k-token context window that Claude can actually use: for the same task, it runs out of context before my short-term memory is even 1% full (and I'm capable of more context-switching in the meantime). And so even the "Opus 4.5 really is at a new tier" runs into the very same limitations all models have been running into since the beginning. |
| ▲ | scotty79 2 days ago | parent [-] |
| > LLMs have none of that, and every new session is rebuilding the world anew. |
| For LLMs, long-term memory is achieved by tooling. Which you discounted in your previous comments. You also overestimate the capacity of your short-term memory by a few orders of magnitude: https://my.clevelandclinic.org/health/articles/short-term-me... |
| ▲ | troupo 2 days ago | parent [-] |
| > For LLMs long term memory is achieved by tooling. Which you discounted in your previous comments. |
| My specific complaint, which is an observable fact about "Opus 4.5 is next tier": it has the same crippled context that degrades the quality of the model as soon as it fills to 50%. |
| EMM_386: no-no-no, it's not crippled. All you have to do is keep track across multiple files, clear out the context often, and feed it very specific information so as not to overflow the context. |
| Me: so... it's crippled, and you need multiple workarounds. |
| scotty79: After all, it's the same as your own short-term memory, and <some unspecified tooling (I guess those same files)> provides long-term memory for LLMs. |
| Me: Your comparison is invalid because I can go have lunch, come back to the problem at hand, and continue where I left off. "Next tier Opus 4.5" will have to be fed the entire world from scratch after a context clear/compact/in a new session. Unless, of course, you meant to say that the "next tier Opus model" only has 15-30 second short-term memory and needs to keep multiple notes around like the guy from Memento. Which... makes it crippled. |
| ▲ | scotty79 a day ago | parent [-] |
| If you refuse to use what you call workarounds and I call long-term memory, then you end up with the guy from Memento, and regardless of how smart the model is it can end up making the same mistakes. And that's why you can't tell the difference between a smarter and a dumber one while others can. |
| ▲ | recursive a day ago | parent | next [-] |
| I think the premise is that if it was the "next tier" then you wouldn't need to use these workarounds. |
| ▲ | troupo a day ago | parent | prev [-] |
| > If you refuse to use what you call workarounds |
| Who said I refuse them? I evaluated at face value the claim that Opus is somehow next tier/something different/amazeballs future. It still has all the same issues and needs all the same workarounds as whatever I was using two months ago (I had a bit of a coding hiatus between the beginning of December and now). |
| > then you end up with a guy from Memento and regardless of how smart the model is |
| Those models are, and keep being, the guy from Memento. Your "long memory" is nothing but notes scribbled everywhere that you have to re-assemble every time. |
| > And that's why you can't tell the difference between smarter and dumber one while others can. |
| If it was "next tier smarter" it wouldn't need the exact same workarounds as the "dumber" models. You wouldn't compare its context to 15-30 second short-term memory and need unspecified tools [1] to have "long-term memory". You wouldn't have the model behave in a way indistinguishable from a "dumber" model after half of its context window has been filled. You wouldn't even think about context windows. And yet here we are. |
| [1] For each person these tools will be a different collection of magic incantations: from scattered .md files to slop like Beads to MCP servers providing access to various external storage solutions to custom shell scripts to ... BTW, I still find "superpowers" from https://github.com/obra/superpowers to be the single best improvement to Claude (and other providers), even if it's just another in a long series of magic chants I've evaluated. |
| ▲ | scotty79 19 hours ago | parent [-] |
| > Those models are, and keep being, the guy from Memento. Your "long memory" is nothing but notes scribbled everywhere that you have to re-assemble every time. |
| That's exactly how long-term memory works in humans as well. The fact that some of these scribbles are done chemically in the same organ that does the processing doesn't make it much better. Human memories are reassembled at recall (often inaccurately). And humans also scribble when they try to solve a problem that exceeds their short-term memory. |
| > If it was "next tier smarter" it wouldn't need the exact same workarounds as the "dumber" models. |
| This is akin to refusing to call a processor next tier because it still needs RAM, a bus to communicate with it, and an SSD as well. You think it should have everything in cache to be worthy of being called next tier. It's fine to have your own standards for applying words. But expect further confusion and miscommunication with other people if you don't intend to realign. |
| ▲ | troupo 17 hours ago | parent [-] |
| > That's exactly how the long term memory works in humans as well. |
| Where this is applicable is when you go away from a problem for a while. And yet I don't lose the entire context and have to rebuild it from scratch when I go to lunch, for example. Models have to rebuild the entire world from scratch for every small task. |
| > This is akin to opposing calling processor next tier because it still needs RAM and bus to communicate with it and SSD as well. |
| You're so lost in your own metaphor that it makes no sense. |
| > You think it should have everything in cache to be worthy of calling it next tier. |
| No, I don't. "Next tier" implies something significantly and observably better. And here you are trying to tell me "if you use all the exact same tools that you have already used before with 'previous tier' models, you will see it is somehow next tier". If your "next tier" needs an equator-length list of caveats and all the same tools, it's not next tier, is it? |
| BTW, I'm literally coding with this "next tier" tool with "long memory just like people". Right after going through the "plan/execute/write notes" bullshit incantations, I had to correct it: |
You're right, I fucked up on all three counts:
1. FileDetails - I should have WIRED IT UP, not deleted it.
It's a useful feature to preview file details before playing.
I treated "unused" as "unwanted" instead of "not yet connected".
2. Worktree not merged - Complete oversight. Did all the work but
didn't finish the job.
3. _spacing - Lazy fix. Should have analyzed why it exists and either
used it or removed the layout constraint entirely.
| So next tier. So long memory. So person-like. |
| Oh, and within about 10 seconds after that it started compacting the "non-crippled" context window and immediately forgot most of what it had just been doing. So I had to clear out the context and teach it the world from the start again. |
| Edit: And now this amazing next-tier model completely ignored that there already exists code to discover network interfaces, and wrote bullshit code calling CLI tools from Rust. So once again it needed to be reminded of this. |
| > It's fine to have your own standards for applying words. But expect further confusion and miscommunication with other people if you don't intend to realign. |
| I mean, just like crypto bros before them, AI bros sure do love to invent their own terminology and their own realities that have nothing to do with anything real and observable. |
| ▲ | scotty79 13 hours ago | parent [-] |
| > "You're right, I fucked up on all three counts:" |
| It very well might be that AI tools are not for you, if you are getting such poor results with your methods of approaching them. If you would like to improve your outcomes at some point, ask people who achieve better results for pointers and try them out. Here's a freebie: never tell the AI it fucked up. |
|
| ▲ | llmslave2 2 days ago | parent | prev | next [-] |
| That's because Opus has been out for almost 5 months now, lol. It's the same model, so I think people have been vibe coding with a heavy dose of wine this holiday and are now convinced it's the future. |
| |
|
| ▲ | mikestorrent 2 days ago | parent | prev | next [-] |
| 200k+ tokens is a pretty big context window if you are feeding it the right context. Editors like Cursor are really good at indexing and curating context for you; perhaps it'd be worth trying something that does that better than Claude CLI does? |
| |
| ▲ | troupo 2 days ago | parent [-] |
| > a pretty big context window if you are feeding it the right context. |
| Yup. There's some magical "right context" that will fix all the problems. What is that right context? No idea; I guess I need to read yet another 20,000-word post describing magical incantations that you should or shouldn't put in the context. |
| The "Opus 4.5 is something else/next tier/just works" claims to my mind mean that I wouldn't need to babysit its every decision, or that it would actually read relevant lines from relevant files, etc. Nope. Exact same behaviors as whatever the previous model was. |
| Oh, and that "200k-token context window"? It's a lie. The quality quickly degrades as soon as Claude reaches somewhere around 50% of the context window. At 80+% it's nearly indistinguishable from a model from two years ago. (BTW, same for Codex/GPT with its "1 million token window".) |
| ▲ | theshrike79 2 days ago | parent | next [-] |
| It's like working with humans: |
| 1) define the problem |
| 2) split the problem into small, independently verifiable tasks |
| 3) implement the tasks one by one, verifying with tools |
| With humans, 1) is the spec and 2) is the Jira tickets or whatever. With an LLM, usually 1) is just a markdown file and 2) is a markdown checklist or GitHub issues (which Claude can use with the `gh` cli), and every loop of 3) gets a fresh context, maybe with the spec from step 1 and the relevant task information from 2. |
| I haven't run into context issues in a LONG time, and when I have, it's usually been either intentional (a problem where compacting won't hurt) or an error on my part. |
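| The loop in 1)-3) can be sketched mechanically. A toy sketch in Python: the checklist syntax is the usual markdown `- [ ]` convention, `next_task` and `fresh_context` are invented names, and in practice step 3 would shell out to `claude` or `gh` rather than build a message list by hand: |

```python
import re

def next_task(progress_md):
    """Return the first unchecked '- [ ]' item, i.e. the next task to run."""
    for line in progress_md.splitlines():
        m = re.match(r"-\s\[\s\]\s+(.*)", line.strip())
        if m:
            return m.group(1)
    return None  # all tasks done

def fresh_context(spec_md, task):
    """Build a new, minimal context: just the spec and the one current task."""
    return [
        {"role": "system", "content": spec_md},
        {"role": "user", "content": f"Implement this task, then stop: {task}"},
    ]

progress = "# Plan\n- [x] define schema\n- [ ] write migration\n- [ ] add tests\n"
task = next_task(progress)
print(task)                               # → write migration
print(len(fresh_context("spec...", task)))  # → 2
```

| Each iteration throws the old transcript away; the spec file and the single checklist item are the only state that survives between loops. |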
| ▲ | troupo 2 days ago | parent [-] |
| > every loop of 3 gets a fresh context, maybe the spec from step 1 and the relevant task information from 2 |
| > I haven't ran into context issues in a LONG time |
| Because you've become the reverse centaur :) "a person who is serving as a squishy meat appendage for an uncaring machine." [1] |
| You are very aware of the exact issues I'm talking about, and have trained yourself to do all the mechanical dance moves to avoid them. I do the same dances; that's why I'm pointing out that they are still necessary despite the claims that models X/Y/Z are "next tier". |
| [1] https://doctorow.medium.com/https-pluralistic-net-2025-12-05... |
| ▲ | theshrike79 2 days ago | parent [-] |
| Yes and no. I've worked quite a bit with juniors, offshore consultants, and just in companies where processes are a bit shit. The exact same method that worked for those happened to also work for LLMs; I didn't have to learn anything new or change much in my workflow. |
| "Fix bug in FoobarComponent" is enough of a bug ticket for the 100x developer on your team with experience with that specific product, but bad for AI, juniors, and offshored teams. Thus, giving enough context in each ticket to tell whoever is working on it where to look, what might be the root cause, and a few ideas about how to fix it is kinda second nature to me. |
| Also, my own brain is mostly neurospicy mush, so _I_ need to write the context into the tickets even if I'm the one on them a few weeks from now. Because now-me remembers things; two-weeks-from-now me most likely doesn't. |
| ▲ | troupo 2 days ago | parent [-] |
| The problem with LLMs (similar to people :) ) is that you never really know what works. I've had Claude one-shot "implement <some complex requirement>" with little additional input, and then completely botch even the smallest bug fix with explicit instructions and context. And vice versa :) |
|
|
| |
| ▲ | CuriouslyC 2 days ago | parent | prev [-] |
| I realize your experience has been frustrating. I hope you see that every generation of model and harness is converting more hold-outs. We're still a few years from hard diminishing returns, assuming capital keeps flowing (and that's without any major new architectures, which are likely), so you should be able to see how this is going to play out. It's in your interest to deal with your frustration and figure out how you can leverage the new tools to stay relevant (to the degree that you want to). |
| Regarding the context window: Claude needs thinking turned up for long-context accuracy; it's quite forgetful without thinking. |
| ▲ | th0ma5 2 days ago | parent | next [-] |
| I think it's important for people who want to write a comment like this to understand how much this sounds like you're in a cult. |
| ▲ | CuriouslyC 2 days ago | parent | next [-] |
| Personally, I'm sympathetic to people who don't want to have to use AI, but I dislike it when they attack my use of AI as a skill issue. I'm quite certain the workplace is going to punish people who don't leverage AI, though, and I'm trying to be helpful. |
| ▲ | troupo 2 days ago | parent [-] |
| > but I dislike it when they attack my use of AI as a skill issue. |
| No one attacked your use of AI. I explained my own experience with the "Claude Opus 4.5 is next tier" claim. You barged in, ignored everything I said, and attacked my skills. |
| > the workplace is going to punish people who don't leverage AI though, and I'm trying to be helpful. |
| So what exactly is helpful in your comments? |
| ▲ | CuriouslyC 2 days ago | parent [-] |
| The only thing I disagreed with in your post is your objectively incorrect statement regarding Claude's context behavior. Other than that, I'm just trying to encourage you to make preparations for something that I don't think you're taking seriously enough yet. No need to get all worked up; it'll only reflect on you. |
|
| |
| ▲ | pigeons a day ago | parent | prev [-] |
| It certainly sounds unkind, if not cultish. |
| |
| ▲ | troupo 2 days ago | parent | prev [-] |
| Note how nothing in your comment addresses anything I said, except the last sentence, which basically confirms what I said. This perfectly illustrates the discourse around AI. |
| As for the snide and patronizing "it's in your interest to stay relevant": |
| 1. I use these tools daily. That's why I don't subscribe to willful wide-eyed gullibility. I know exactly what these tools can and cannot do. The vast majority of "AI skeptics" are the same. |
| 2. In a few years, when the world is awash in barely working, incomprehensible AI slop, my skills will be in great demand. Not because I'm an amazing developer (I'm not), but because I have experience separating the wheat from the chaff. |
| ▲ | CuriouslyC 2 days ago | parent [-] |
| The snide and patronizing is your projection. It kinda makes me sad when the discourse is so poisoned that I can't even encourage someone to protect their own future from something that's obviously coming (technical merits aside, purely based on social dynamics). It seems the subject of AI is emotionally charged for you, so I expect friendly/rational discourse is going to be a challenge. I'd say something nice, but since you're primed to see me as patronizing... Fuck you? Is that what you were expecting? |
| ▲ | troupo 2 days ago | parent | next [-] |
| > The snide and patronizing is your projection. |
| It's not me who decided to barge in, assume their opponent doesn't use something or doesn't want to use something, and offer unsolicited advice. |
| > It kinda makes me sad when the discourse is so poisoned that I can't even encourage someone to protect their own future from something that's obviously coming |
| See. Again. You're so in love with your "wisdom" that you can't even see what you sound like: snide, patronizing, condescending. And completely missing the whole point of what was written. You are literally the person who poisons the discourse. |
| Me: "here are the issues I still experience with what people claim is a 'next tier frontier model'" |
| You: "it's in your interest to figure out how to leverage new tools to stay relevant in the future" |
| Me: ... what the hell are you talking about? I'm using these tools daily. Do you have anything constructive to add to the discourse? |
| > so I expect friendly/rational discourse is going to be a challenge. |
| It's only a challenge to you because you keep being in love with your voice and your voice only. Do you have anything to contribute to the actual rational discourse, or are you going to attack my character? |
| > I'd say something nice but since you're primed to see me being patronizing... Fuck you? |
| Ah. The famous friendly/rational discourse of "they attack my use of AI" (no one attacked you), "why don't you invest in learning tools to stay relevant in the future" (I literally use these tools daily, do you have anything useful to say?), and "fuck you" (well, same to you). |
| > That what you were expecting? |
| What I was expecting was responses to what I wrote, not you riding in on a high horse. |
| ▲ | CuriouslyC 2 days ago | parent | next [-] |
| You were the one complaining about how the tools aren't giving you the results you expected. If you're using these tools daily and having a hard time, then either you're working on something very different from the bulk of people using the tools and your problems are legitimate, or you aren't and it's a skill issue. If you want to take politeness as being patronizing, I'm happy to stop bothering. My guess is you're not a special snowflake, and you need to "get good" or you're going to end up on unemployment complaining about how unfair life is. I'd have sympathy, but you don't seem like a pleasant human being to interact with, so have fun! |
| ▲ | troupo 2 days ago | parent [-] |
| > You were the one complaining about how the tools aren't giving you the results you expected. |
| They are not giving me the results people claim they give. That is distinctly different from not giving me the results I want. |
| > If you're using these tools daily and having a hard time, either you're working on something very different from the bulk of people using the tools and your problems are legitimate, or you aren't and it's a skill issue. |
| Indeed. And the rational/friendly discourse that you claim you're having would start with trying to figure that out. Did you? No, you didn't. You immediately assumed your opponent is a clueless idiot who is somehow against AI and is incapable of learning or something. |
| > If you want to take politeness as being patronizing, I'm happy to stop bothering. |
| No. It's not politeness. It's smugness. You literally started your interaction in this thread with a "git gud or else", and even managed to complain later that you "dislike it when they attack your use of AI as a skill issue" while continuously attacking others. |
| > you don't seem like a pleasant human being to interact with |
| Says the person who has contributed nothing to the conversation except his arrogance, smugness, and holier-than-thou attitude, engaged in nothing but personal attacks, complained about non-existent grievances, and, when called out on this behavior, completed his "friendly and rational discourse" with a "fuck you". Well, fuck you, too. Adieu. |
| |
| ▲ | cindyllm 2 days ago | parent | prev [-] | | [dead] |
|
| ▲ | pluralmonad a day ago | parent | prev | next [-] |
| I'm not familiar with any form of intelligence that does not suffer from a bloated context. If you want to try to improve your workflow, a good place to start is using sub-agents so individual task implementations do not fill up your top-level agent's context. I used to have to compact and clear regularly, but since using sub-agents for most direct tasks, I hardly ever do anymore. |
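| What sub-agents buy you can be sketched in a few lines. This is a toy model of the pattern, not Claude Code's actual Task tool; the function names and the stub model are invented for illustration: |

```python
def run_subagent(task, run_model):
    """Run one task in its own throwaway context; only a short result escapes."""
    sub_context = [{"role": "user", "content": task}]
    transcript = run_model(sub_context)     # this context can balloon freely
    return transcript[-1]["content"][:200]  # the parent only ever sees this

def parent_loop(tasks, run_model):
    """The top-level context stays one entry per task, whatever each task cost."""
    parent_context = []
    for t in tasks:
        parent_context.append({"task": t, "result": run_subagent(t, run_model)})
    return parent_context

# Stub model for illustration: returns the transcript plus a short final answer.
def stub_model(ctx):
    return ctx + [{"role": "assistant", "content": "done: " + ctx[0]["content"]}]

results = parent_loop(["fix parser", "add tests"], stub_model)
print(results[0]["result"])  # → done: fix parser
print(len(results))          # → 2
```

| The detailed transcripts are discarded with each sub-agent, which is why the parent rarely needs compacting: it accumulates summaries, not tool output. |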
| |
| ▲ | troupo a day ago | parent [-] |
| 1. It's a workaround for context limitations. |
| 2. It's the same workaround we've been doing forever. |
| 3. It's indistinguishable from the "clear context and re-feed the entire world of relevant info from scratch" we've had forever, just slightly more automated. |
| That's why I don't understand all the "it's next tier" etc. It's all the same issues with all the same workarounds. |
|
|
| ▲ | iwontberude 2 days ago | parent | prev [-] |
| I use Sonnet and Opus all the time, and the differences are almost negligible. |