simianwords 7 days ago
2.5 Flash is particularly cheap and fast. I think 2.5 Pro would have gotten all the answers correct; at least it gets this one correct.
Yokolos 7 days ago
I get a lot of garbage out of 2.5 Pro, Claude Sonnet, and ChatGPT. There's always this "this is how you solve it"; I take a close look and it's clearly broken, I point it out, and it's all "you're right, this is a common issue". Okay, so why do we have to do this song and dance a million times to arrive at the actually correct answer?
kazinator 7 days ago
Why does Flash not get it correct, yet come up with plausible-sounding nonsense? That means it was trained on some texts in the area. What would make 2.5 Pro (or anything else) categorically better would be if it could say "I don't know". There will be things that Claude 3.7 or Gemini Pro will not know, and the interpolations they come up with will not make sense.