Wowfunhappy 6 hours ago
Aww, I don’t like the new pelican benchmark as much. I liked that the old prompt was vague and we could see how the AI interpreted it.
ahmedfromtunis 5 hours ago
Yeah. The new challenge seems easier to solve, since it basically hand-holds the LLMs toward what the result should look like. I think a more challenging, well, challenge would be to offer an even more absurd scenario and see how the model handles it. Example: generate an SVG of a pelican and a mongoose eating popcorn inside a pyramid-shaped vehicle flying around Jupiter. Result: https://imgur.com/a/TBGYChc