| ▲ | dominotw 5 hours ago | |||||||||||||
i dont understand the nuances here. what does this mean. 4.8 is trained on same model as previous one then? what does brand new mean. | ||||||||||||||
| ▲ | irthomasthomas 5 hours ago | parent [-] | |||||||||||||
It means for 4.7 they trained a new base model with different architecture, different pre-training data (later knowledge cutoff), and a new tokenizer. Vs finetuning an existing model, which was the case for 4.6, and probably for 4.8. | ||||||||||||||
| ||||||||||||||