vunderba 10 hours ago
I'd be curious how well the inline verification works. An easy test is to have it generate a 9-pointed star, a classic case that many SOTA models have difficulty with. In the past, I've deliberately stuck a vision-language model (VLM) in a REPL, with a loop running against generative models, so it could verify the output and try again because of this exact issue. EDIT: Just tested it in Gemini. Either it didn't use a VLM to actually look at the finished image, or the VLM itself failed. Output:
Result:
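
For reference, the verify-and-retry loop mentioned above could be sketched roughly like this in Python. This is only an illustration, not the setup that was actually used: `Verdict`, `generate_until_verified`, and the `generate` / `verify` callables are hypothetical names, and the real image-model and VLM calls are deliberately left for the caller to plug in.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class Verdict:
    """What the (hypothetical) VLM check reports about one image."""
    ok: bool
    feedback: str = ""


def generate_until_verified(
    prompt: str,
    generate: Callable[[str], Any],         # assumed: wraps an image model
    verify: Callable[[Any, str], Verdict],  # assumed: wraps a VLM check
    max_attempts: int = 3,
) -> Tuple[Optional[Any], int]:
    """Generate, have the verifier look at the result, retry on failure.

    Returns (image, attempts_used); image is None if no attempt passed.
    """
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        # Feed the verifier's complaint back into the next prompt.
        full_prompt = prompt
        if feedback:
            full_prompt = f"{prompt}\nPrevious attempt failed: {feedback}"
        image = generate(full_prompt)
        verdict = verify(image, prompt)
        if verdict.ok:
            return image, attempt
        feedback = verdict.feedback
    return None, max_attempts
```

For the 9-pointed star case, `verify` would ask the VLM to count the points and return something like `Verdict(False, "counted 8 points, expected 9")`. The loop is only as reliable as that check: if the verifier miscounts, a wrong image passes anyway.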