ceroxylon 2 days ago
Even with search grounding, it scored a 2.5/5 on a basic botanical benchmark. It would take much longer for the average human to do a similar write-up, but they would likely do better than a 50% hallucination rate if they had access to a search engine.
WarmWash 2 days ago
Even multimodal models are still really bad when it comes to vision. The strength is still definitely language.