Remix.run Logo
spwa4 7 hours ago

Exactly, if there's one thing transformers are good at it's translation. One I've found particularly nice: any question ChatGPT can answer in English it can answer in French. I'm assuming Norwegian too. So there's no point.

sgt 7 hours ago | parent | next [-]

There's quite a bit more to culture and language than just being able to have transformers come up with believable language and/or dialect.

sisve 6 hours ago | parent | prev | next [-]

The point is that norway willl have its own LLM. And will not have dependencies to another state or private company. The goal is not to be the best model. But to have a model that include more Norwegian data then other LLM and that it's not screwed against other sources.

dalemhurley 6 hours ago | parent | prev | next [-]

Yes transformers are great at translation as that is their purpose.

LLMs are not great at preserving cultural uniqueness and diversity. Take how “delve” has reentered the lexicon because the human assessors for pre training dialect of English uses “delve” a lot.

There is a lot of benefits to training specifically for a unique culture with unique norms to preserve the culture as we increasingly rely on LLMs.

https://www.scientificamerican.com/article/chatgpt-is-changi...

otabdeveloper4 6 hours ago | parent | prev | next [-]

They're only good at it because they were trained on massive amounts of English and French data.

vidarh 5 hours ago | parent [-]

Not really true.

Both Claude and ChatGPT can translate into minor dialects of Norwegian they will have seen very few works in because very few printed works exist in them.

E.g. I've tested both my local spoken dialect, which is rarely written, and a sociolect used by a 1970's Maoist group consiting of a few hundred people, where most of the printed material consists of novels from a couple of ex-members that became authors.

In the latter case, it claimed to not know, but was able to get a good match from just a description.

I also just had it ape Norwegian orthography from the 1910's by having it look up the rules and translate a text it had first translated from English to modern Norwegian, and it did just fine.

They will have seem some work in these dialects, but mostly it transfer really well to know related languages (English, Dutch, German, Swedish, Danish, roughly form a continuum from least in common to most in common with modern Norwegian; they all share vocabulary and significant parts of grammar with Norwegian), and then a relatively limited exposure to Norwegian itself is sufficient to do fairly well.

They're also really good at "style transfer" of text in the form of tweaking orthography, word order, and minor grammar changes from descriptions and examples.

(incidentally, the latter is one way of getting an LLM to sound a lot less like an LLM)

dzhiurgis 5 hours ago | parent | prev [-]

Model can speak Lithuanian too, but with a Russian accent which is a big taboo for us.