Remix clone Hacker News

	▲	solenoid0937 3 hours ago
		They have not. Every successful pre-training run of late has shown performance gains greater than what the scaling laws predict.
	▲	0x3f 3 hours ago
		Those gains come from architecture, data quality, and so on. Scaling laws relate only to data volume and compute, holding other factors constant.
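
To make the "holding other factors constant" point concrete, here is a rough sketch of a Chinchilla-style parametric loss curve. Nothing in it comes from the thread: the functional form, the function name `predicted_loss`, and the coefficient values are assumptions, with the defaults set to approximately the published Chinchilla fit and used here purely for illustration. The "better data" coefficient is hypothetical.

```python
def predicted_loss(n_params, n_tokens, E=1.69, A=406.4, B=410.7,
                   alpha=0.34, beta=0.28):
    """Parametric scaling-law form: L(N, D) = E + A / N**alpha + B / D**beta.

    The constants are illustrative assumptions. They stand in for everything
    the law holds fixed: architecture, data quality, tokenizer, and so on.
    """
    return E + A / n_params ** alpha + B / n_tokens ** beta


n_params = 1e9  # model size, held fixed in this comparison

# Moving along the curve: same recipe, twice the data.
baseline = predicted_loss(n_params, 2e10)
more_data = predicted_loss(n_params, 4e10)

# Shifting the curve: same N and D, but a hypothetical recipe change
# (for example, cleaner data) modeled as a smaller data coefficient B.
better_recipe = predicted_loss(n_params, 2e10, B=300.0)

print(f"baseline:      {baseline:.3f}")
print(f"2x tokens:     {more_data:.3f}")
print(f"better recipe: {better_recipe:.3f}")
```

In this sketch, the law only describes the first kind of change, where the constants stay put and N or D grows. A gain from a better recipe shows up as a different set of constants, so beating the old curve does not by itself contradict the law.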