ACCount37 3 days ago
Data filtering. Dataset curation. Curriculum learning. All already in use. It's not sexy, it's not a breakthrough, but it does help.
Havoc 3 days ago
> All already in use.

At the big labs that makes sense. Bit more puzzled by why it isn't used in the toy projects. Certainly more complexity, but it seems like it would make a big difference.
famouswaffles 2 days ago
Curriculum learning is not really a thing for these large SOTA LLM training runs (specifically pre-training). We know it would help, but ordering trillions of tokens of data in this way would be a herculean task.