croemer 6 days ago
Testing against unspecified other "leading" models allows for shenanigans:

> Qodo tested GPT‑4.1 head-to-head against other leading models [...] they found that GPT‑4.1 produced the better suggestion in 55% of cases

The linked blog post returns a 404: https://www.qodo.ai/blog/benchmarked-gpt-4-1/
gs17 6 days ago
The post seems to be up now, and it appears to compare GPT‑4.1 slightly favorably to Claude 3.7.