bla3 | 7 days ago
Why do Hunyuan, OpenAI 4o and Qwen get a pass for the octopus test? They don't cover "each tentacle", just some. And midjourney covers 9 of 8 arms with sock puppets.
vunderba | 7 days ago
Good point. I probably need to make the pass ratios a bit stricter, especially as the models get better.

> midjourney covers 9 of 8 arms with sock puppets.

Midjourney is shown as a fail, so I'm not sure what your point is. And those don't look remotely like sock puppets; they resemble stockings at best.