prodigycorp 3 hours ago
After taking a walk for a bit, I decided you're right. I came to the wrong conclusion. Gemini 3 is incredibly powerful in some other stuff I've run. This probably means my test is a little too niche. The fact that it didn't pass one of my tests doesn't speak to the broader intelligence of the model per se. While I still believe in the importance of a personalized suite of benchmarks, my Python one needs to be down-weighted or supplanted. My bad to the Google team for the cursory brush-off.
chermi an hour ago
Walks are magical. But also this reads partially like you got sent to a reeducation camp lol.