shepherdjerred 2 hours ago
> and there’s zero chance any AI lab would train a model for such a ridiculous task.
I'm not sure that's true anymore, considering how popular Simon's blog is.
_puk an hour ago
> So maybe the AI labs have been paying attention after all!
> I think this mainly demonstrates that the pelican on the bicycle has firmly exceeded its limits as a useful benchmark.
As acknowledged in the article.
simonw 36 minutes ago
That bit probably works better in the talk, it was a setup for a joke later on.
nickvec an hour ago
Simon mentions further along in his article that, given Jeff Dean’s post referencing the pelican-riding-a-bike task (and how good current models are at it), it’s no longer a great benchmark to use. Enter the opossum riding an e-scooter!