cbg0 3 hours ago
One of the things I'm always looking at with newly released models is long-context performance, and based on the system card it seems like they've cracked it.
metadat 3 hours ago
Data source: https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89... (search for “graphwalk”). If true, the SWE-bench performance looks like a major upgrade.
himata4113 3 hours ago
This seems similar to gpt-pro: they just have a very large attention window, which is why it's so expensive to run. The true attention window of most models is 8192 tokens.
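For intuition on the cost claim, here's a rough back-of-the-envelope sketch of why a large dense-attention window is expensive: the score matrix is seq_len x seq_len per head, so compute and (naively materialized) memory grow quadratically with the window. The model dimensions below are illustrative assumptions, not any particular model's actual config:

    # Back-of-the-envelope cost of dense self-attention per layer.
    # d_model and n_heads are illustrative assumptions, not a real model's config.
    def attention_cost(seq_len: int, d_model: int = 4096, n_heads: int = 32) -> dict:
        # QK^T and scores@V together: ~2 * seq_len^2 * d_model multiply-adds per layer.
        flops = 2 * (seq_len ** 2) * d_model
        # fp16 score matrix: seq_len^2 entries per head, 2 bytes each.
        # (Real kernels like FlashAttention avoid materializing this,
        # but the compute still scales quadratically.)
        score_bytes = n_heads * (seq_len ** 2) * 2
        return {"flops": flops, "score_gb": score_bytes / 1e9}

    for n in (8_192, 65_536, 200_000):
        c = attention_cost(n)
        print(f"{n:>7} tokens: {c['flops']:.2e} FLOPs/layer, "
              f"{c['score_gb']:.1f} GB of scores/layer if materialized")

Going from 8K to 200K tokens multiplies the per-layer attention cost by roughly 600x, which is consistent with the pricing observation above.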