deepsquirrelnet | a day ago
This is pretty cool. I have a similar model that’s 8 days into training on MS MARCO. So far I only have the “cold start” data posted, but I’m planning on posting a full distillation dataset.
jacobgorm | a day ago
What kind of hardware setup would be needed to replicate the paper’s results?