sdenton4 an hour ago
It's the statistics equivalent of 'no one needs more than 640KB of RAM'

voxelghost 3 hours ago
After a quick content browse, my understanding is that it's more like this: with a very compressed diff vector applied to a multi-billion-parameter model, the model can be 'retrained' to reason (score) better on a specific topic; math was the example used in the paper.
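The "compressed diff vector" idea described above resembles a low-rank (LoRA-style) update to frozen base weights. A minimal numpy sketch, where all names and sizes are illustrative stand-ins and not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny stand-ins for layers that would have billions of parameters in practice.
d_out, d_in, rank = 64, 64, 4
W = rng.standard_normal((d_out, d_in))  # frozen base weights

# The compressed "diff" is stored as two small factors instead of a full matrix.
A = rng.standard_normal((d_out, rank)) * 0.01
B = rng.standard_normal((rank, d_in)) * 0.01

# Applying the compact delta specializes the model without touching W itself.
W_adapted = W + A @ B

# Storage cost: rank*(d_out + d_in) numbers vs d_out*d_in for a full diff.
compressed_params = rank * (d_out + d_in)
full_params = d_out * d_in
print(compressed_params, full_params)  # 512 4096
```

Even at this toy scale the factored delta is 8x smaller than a full weight diff; at billion-parameter scale the ratio is what makes such "diff vectors" cheap to ship and apply.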

ekuck 3 hours ago
speak for yourself!

est 3 hours ago
reasoning capability might just be some specific combination of mirror neurons. even advanced math usually involves applying patterns found elsewhere to new topics

measurablefunc 3 hours ago
I agree; I don't think gradient descent is going to work in the long run for the kind of luxurious, automated communist utopia the technocrats are promising everyone.