| ▲ | visioninmyblood 10 hours ago | |
I agree: Claude, ChatGPT, and even Gemini do a poor job of detecting and cropping into a region, which are some of the simplest tasks. Qwen is also great at summarization but not at solving simple vision tasks like cropping, segmentation, and detection. Here is an example where we compared Claude, Gemini, ChatGPT, and other frontier models on simple (and complicated) visual tasks: https://chat.vlm.run/showdown#:~:text=Crop%20into%20the%20cl... | ||
| ▲ | colechristensen 10 hours ago | parent [-] | |
The part that was funny to me is that I would respond "is that right?" and it would tell me exactly how it was wrong, then proceed to do it incorrectly again in a very similar but slightly different way. It was like a Monty Python sketch. I might have also been very tired and easily amused. | ||