Remix clone Hacker News

	cedws 5 hours ago
		It’s not a waste of time. As the boundaries of AI are pushed, we increasingly struggle to define what intelligence actually is. It becomes more useful to test what models cannot do than what they can. Random tasks like the pelican test can show how general the intelligence really is, putting aside the obvious flaw that the labs can optimise for such a simple public benchmark.