numpad0 5 hours ago

I thought finetuning data can't contradict the foundation model, and that anything inconsistent with the standard LLM American-Chinese split personality would be rejected?

zozbot234 3 hours ago | parent [-]

Fine-tuning happens on top of pretraining, so of course it can "forget" pretrained defaults when warranted by the new data it's being fine-tuned on.
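
To make that concrete, here's a toy sketch. It is not an LLM, just a one-weight model y = w * x trained with plain SGD, and every name and number in it is made up for illustration. The "pretraining" set has 1000 examples following y = 2x, the "fine-tuning" set has only 10 examples following the contradicting rule y = -x. Because fine-tuning starts from the pretrained weight and keeps taking gradient steps, the later data wins even though there is far less of it.

```python
def sgd(w, data, lr, epochs):
    """Plain SGD on squared error for the one-weight model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

# Hypothetical "pretraining" set: 1000 examples that all follow y = 2x.
pretrain_data = [(x / 1000, 2 * x / 1000) for x in range(1, 1001)]

# Hypothetical "fine-tuning" set: only 10 examples, following y = -x instead.
finetune_data = [(x / 10, -x / 10) for x in range(1, 11)]

# Pretrain from scratch, then continue training from the pretrained weight.
w_pretrained = sgd(0.0, pretrain_data, lr=0.1, epochs=5)
w_finetuned = sgd(w_pretrained, finetune_data, lr=0.1, epochs=20)

print(f"after pretraining: w = {w_pretrained:.4f}")
print(f"after fine-tuning: w = {w_finetuned:.4f}")
```

The weight ends up near 2 after pretraining and near -1 after fine-tuning. A real model has vastly more parameters, so how much gets overwritten depends on learning rate, number of steps, and how much of the network the new data touches; this only shows that dataset size alone doesn't decide which one takes over.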

numpad0 2 hours ago | parent [-]

But you'd need more data than was used for pretraining for the added knowledge to take precedence over it, no? And if that were the case, you'd practically be contradicting the knowledge in the base model.

I mean ... LLMs are sort of an extreme, living proof of linguistic determinism. Their behavior is dictated almost entirely by disorganized language data, primarily English and Chinese. So I don't think you can just add a language as a native primary language in a quick post-training pass. There's no way that would work.