taspeotis 2 days ago
comboy 2 days ago
I think the API is fine; likely only the subscription is affected. Not to mention that there are trivial heuristics to differentiate repeated API calls on the same data from potential CLI usage, although that would be true malice. It seemed to me that it was performing better through opencode using the API, but I did not test extensively.
chillacy 2 days ago
If SWE-bench is public, then Anthropic is, at a minimum, probably also looking at its SWE-bench scores when making changes. I'd put more trust in a tracker that runs a private benchmark not known to Anthropic.