river_otter 10 hours ago

One thing from the podcast that jumped out at me was the statement that in pre-training "you don't have to think closely about the data". The success of pre-training supports the point somewhat, but it seems slightly at odds with Karpathy's remarks about how large a percentage of pre-training data is complete garbage. I'd hope that more work on cleaning the pre-training data would result in stronger and more coherent base models.