dwa3592 6 hours ago

Understood, but they could fine-tune base models on their own cultural context and language. Why reinvent the wheel?

numpad0 5 hours ago | parent | next [-]

I thought fine-tuning data can't contradict foundation models, and anything that is inconsistent with the standard LLM American-Chinese split personality would be rejected?

zozbot234 3 hours ago | parent [-]

Fine-tuning happens on top of pretraining, so of course it can "forget" pretrained defaults when warranted by the new data it's being fine-tuned on.

numpad0 2 hours ago | parent [-]

But you have to have more data than was used for pretraining for the added knowledge to take precedence over pretraining, no? If that were the case, you would practically be contradicting the knowledge in the base model.

I mean ... LLMs are sort of an extreme and living proof of linguistic determinism. Their behaviors are dictated almost entirely by disorganized language data, primarily English and Chinese. So you can't just add a language as a native primary language in a quick post-training pass, I think. There's no way that it would work.

DonHopkins 6 hours ago | parent | prev | next [-]

They could apply the Polder Model of consensus decision making with a mixture of experts.

https://en.wikipedia.org/wiki/Polder_model

nehal3m 5 hours ago | parent [-]

Funny, that's what I thought when PewDiePie set up his monster AI rig and what he called a 'council'. Quote:

"PewDiePie has built a custom web UI for self-hosting AI models called "ChatOS" that runs on his custom PC with 2x RTX 4000 Ada cards, along with 8x modded RTX 4090s with 48 GB of VRAM. Running open-source models from Baidu and OpenAI, PewDiePie made a "council" of bots that voted on the best responses, and then built "The Swarm" for data collection that will become the foundation of his own model coming next month."

https://www.tomshardware.com/tech-industry/artificial-intell...

applfanboysbgon 6 hours ago | parent | prev [-]

This gets better short-term results for a fraction of the cost, for sure, but what do you do when China places an export control banning the release of open-weight models? If you don't have your own talent, you're then relegated to using a base model from 2026 or whatever the cutoff date is, forever. That defeats the purpose of a 'sovereign' model made for and by your people.