Remix.run Logo
sandreas 4 days ago

This is an interesting overview, thank you. Different tasks, different models, all-day-usage and pretty complete (while still opinionated, which I like).

However, checking the results my personal overall winner if I had to pick only ONE probably would be

  deepseek/deepseek-chat-v3-0324
which is a good compromise between fast, cheap and good :-) Only for specific tasks (write a poem...) I would prefer a thinking model.
graham_king_3 3 days ago | parent | next [-]

They released deepseek/deepseek-chat-v3.1 shortly after I did the evals, and that's what I now use 20+ times a day for all my questions. It replaces chat-v3 and r1, depending on whether you enable reasoning or not.

iamnotagenius 4 days ago | parent | prev [-]

[dead]