Stevvo 4 days ago
The variance is way too high for this test to have any value at all. I ran it 10 times, and each pelican on a bicycle was a better rendition than that; about half of them you could say were perfect.
golly_ned 4 days ago
Compared to the other benchmarks, which are much more gameable, I trust PelicanBikeEval way more.
getnormality 4 days ago
Well, the variance is itself interesting.