ahmedfromtunis 6 hours ago

Yeah. The new challenge seems easier to solve, since it basically hand-holds the LLMs toward what the result should look like.

I think a more challenging, well, challenge, would be to offer an even more absurd scenario and see how the model handles it.

Example: generate an svg of a pelican and a mongoose eating popcorn inside a pyramid-shaped vehicle flying around Jupiter. Result: https://imgur.com/a/TBGYChc

simonw 5 hours ago

I like the hand-holding because it's a better test of how well models can follow more detailed instructions.

I was inspired by Max Woolf's nano banana test prompts: https://minimaxir.com/2025/11/nano-banana-prompts/

ahmedfromtunis 5 hours ago

That's a valid point, but I'd argue the new test would then be interesting to pair with the original one, not to replace it.

Do you think it would be reasonable to include both in future reviews, at least for the sake of backward compatibility (and comparability)?