HarHarVeryFunny 3 hours ago
I'm curious how you are testing/trying these latest models. Do you have specific test/benchmark tasks that they struggle with, or are you working on a real project and just trying alternatives where another model is not performing well?
koakuma-chan 3 hours ago
I am using Cursor. It has all the major models (OpenAI, Anthropic, Google, etc.). Every time a new model comes out, I test it on a real project (the app that I am working on at work).