Someone1234 4 days ago

Before this they supposedly had a longer context window than ChatGPT, but I have workloads that abuse the heck out of context windows (100-120K tokens). ChatGPT genuinely seems to have a 32K context window, in the sense that it legitimately remembers/can utilize everything within that window.

Claude previously had "200K" context windows, but during testing it wouldn't even reach a full 32K before hitting a wall or forgetting earlier parts of the context. They also have extremely short prompt limits relative to the other services around, making it hard to utilize their supposedly larger context windows (which is suspicious).

I guess my point is that with Anthropic specifically, I don't trust their claims, because that has been my personal experience. It would be nice if this "1M" context window now allows you to actually use 200K though, but it remains to be seen if it can even do that. As I said, with Anthropic you need to verify everything they claim.
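For anyone who wants to verify this themselves, here is a rough sketch of the kind of recall probe I mean: bury a known fact at different depths in filler text and see whether the model can still retrieve it. Everything here is an assumption for illustration: `ask_model` is a hypothetical callable you would wire up to whatever service you are testing, word counts are only a crude stand-in for tokens (no tokenizer is used), and `forgetful_model` is a fake model that only sees the tail of the prompt so the script runs offline.

```python
import random


def build_probe(total_words, depth, needle, seed=0):
    """Return filler text of total_words words with `needle` inserted
    at fractional depth (0.0 = start of context, 1.0 = end)."""
    rng = random.Random(seed)
    vocab = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]
    filler = [rng.choice(vocab) for _ in range(total_words)]
    filler.insert(int(depth * total_words), needle)
    return " ".join(filler)


def recall_by_depth(ask_model, total_words, depths, secret="zebra-4471"):
    """Map each depth to True/False: did the model return the secret?

    `ask_model` is a hypothetical callable (prompt string -> reply string).
    """
    needle = f"The secret code is {secret}."
    question = "\n\nWhat is the secret code? Reply with the code only."
    results = {}
    for depth in depths:
        prompt = build_probe(total_words, depth, needle) + question
        results[depth] = secret in ask_model(prompt)
    return results


def forgetful_model(prompt, window_words=50):
    """Stand-in model that only 'sees' the last `window_words` words."""
    visible = " ".join(prompt.split()[-window_words:])
    return "zebra-4471" if "zebra-4471" in visible else "I don't know"


if __name__ == "__main__":
    print(recall_by_depth(forgetful_model, 200, [0.0, 0.5, 1.0]))
```

Against a real service you would raise `total_words` toward the advertised limit and look for the depth at which recall starts failing. A single needle is a weak test, so treat a pass as a lower bound rather than proof the whole window is usable.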

Etheryte 3 days ago | parent | next [-]

Strong agree, Claude is very quick to forget things like "don't do this", "never do this" or things it tried that were wrong. It will happily keep looping even in very short conversations, completely defeating the purpose of using it. It's easy to game the numbers, but it falls apart in the real world.

joquarky 3 days ago | parent [-]

I've found it better to use antonyms than negations in most situations.

typpilol 3 days ago | parent [-]

Same here. Always tell them the way you want it done.

For example:

Instead of "don't modify the tests"

It should be: analyze the test output and fix the bug in the source code. The test is built correctly.

Not the best but you get the idea.

The one problem with this is if you don't know how to do something properly. Like if you're just writing in your prompt "generate 90% test coverage", you give it a lot more leeway to do whatever it wants.

And that's how you end up with the source code being modified to fit the test, or vice versa.
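Since prompt wording alone is not a guarantee, you can also check afterwards whether the tests were touched. A minimal sketch, assuming you already have the list of changed file paths from your version control tool. The function name `touched_tests` and the naming conventions it checks (`tests/` or `test/` directories, `test_` prefix, `_test` suffix) are my own assumptions, not anything a particular tool provides.

```python
from pathlib import PurePosixPath


def touched_tests(changed_paths, test_dirs=("tests", "test")):
    """Return the changed paths that sit under a test directory
    or follow a common test-file naming pattern."""
    hits = []
    for p in changed_paths:
        path = PurePosixPath(p)
        # Any parent directory named like a test directory counts.
        in_test_dir = any(part in test_dirs for part in path.parts[:-1])
        # File names such as test_app.py or util_test.go count too.
        looks_like_test = (
            path.name.startswith("test_") or path.stem.endswith("_test")
        )
        if in_test_dir or looks_like_test:
            hits.append(p)
    return hits


if __name__ == "__main__":
    changed = ["src/app.py", "tests/test_app.py", "pkg/util_test.go"]
    print(touched_tests(changed))
```

If the list comes back non-empty when the instruction was to fix the source, reject the change and re-prompt.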

wahnfrieden 3 days ago | parent | prev [-]

ChatGPT Pro has a longer window but I’ve read conflicting reports on what it actually uses