Remix clone Hacker News

	▲	doodlesdev 3 hours ago
		GPT-4o was also terrible at ARC-AGI, but it's one of the most loved models of the last few years. Honestly, I'm a huge fan of the ARC-AGI series of benchmarks, but I don't believe it corresponds directly to the qualities most people judge when using LLMs.
	▲	nananana9 43 minutes ago
		It was terrible at a lot of things. It was beloved because when you said "I think I'm the reincarnation of Jesus Christ", it would tell you "You know what... I think I believe it! I genuinely think you're the kind of person that appears once every few millennia to reshape the world!"
	▲	mrybczyn an hour ago
		Because ARC-AGI involves de novo reasoning over a restricted and (hopefully) unpretrained territory, in 2D space. Not many people use LLMs as more than a better Wikipedia, Stack Overflow, or autocomplete...