th0ma5 a day ago
I personally don't understand why asking these things to do things we know they can't do is supposed to be productive, except maybe for getting around restrictions or fuzzing. I don't see it as an effective benchmark unless it links directly to the ways the models are being improved. Looking at random results that sometimes happen to be valid and concluding that more iterations of randomness will eventually give way to control is a maddening perspective to me, though perhaps I need better language to describe this.
thecr0w a day ago | parent
I think this is a reasonable take. For me, I like to investigate limitations like this in order to understand where the boundaries are. Claude isn't hopelessly bad at analyzing images; it's pixel-perfect corrections that seem to be a limitation. Maybe for some folks it's enough to just read that, but I like to feel I have some solid experiential knowledge about the limitations that I can keep in my head and apply appropriately in the future.