Ifkaluva 4 hours ago
Liquid does amazing work, but I kinda feel like they are overtraining their models. 38T tokens seems like a lot for an 8B model.
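For a sense of scale, here is a rough back-of-the-envelope sketch of the ratio behind that impression. The 38T and 8B figures come from the comment above; the 20 tokens-per-parameter baseline is an assumption, namely the commonly cited "Chinchilla" rule of thumb for compute-optimal training, which is a guide to spending training compute and not a hard ceiling on useful data.

```python
# Figures quoted in the comment above.
training_tokens = 38 * 10**12   # 38T tokens
model_params = 8 * 10**9        # 8B parameters

# Assumption: ~20 tokens per parameter, the commonly cited
# "Chinchilla" compute-optimal rule of thumb (not from the thread).
BASELINE_TOKENS_PER_PARAM = 20

tokens_per_param = training_tokens / model_params
multiple_of_baseline = tokens_per_param / BASELINE_TOKENS_PER_PARAM

print(f"tokens per parameter: {tokens_per_param:,.0f}")
print(f"multiple of the assumed baseline: {multiple_of_baseline:.1f}x")
```

Under that assumption the run sits at thousands of tokens per parameter, which is the gap the comment is reacting to. Whether that counts as "too much" depends on the goal, since small models are often trained well past the compute-optimal point to get a cheaper model at inference time.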
andai 3 hours ago
What's the downside? Don't they stop when they hit diminishing returns?