WatchDog 5 hours ago
If you want LLMs to have knowledge of the Norwegian language, wouldn't the most obvious thing to do be to build a good training dataset and make it widely available? Why go to the expense of training your own model, especially when it will be inferior to state-of-the-art models?
black_puppydog 3 hours ago
I task GPT/Claude with researching stuff that pertains to very specific cultural or legal aspects of French politics on a daily basis. Even though French is a far more common language globally than Norwegian, these models still haven't figured out that, no matter which language I speak to them in (German or English, depending on my mood), their web searches need to be done in French to return reasonable results. I have to remind them every time, lest they come back with "uh, didn't find anything relevant, here, take some hallucinations instead." So, given the anglo-centrism of current models, my confidence in American providers giving any shits about non-American users and use cases is pretty low. And it gets lower the smaller the language community is.
a2128 4 hours ago
What incentives does OpenAI have to make sure the AI actually works well with Norwegian beyond capturing a (small) Norwegian market? What incentives do they have to take Norwegian values into consideration, or to preserve Norwegian culture into the future? The matter is also a question of national sovereignty, so simply releasing the data and nicely asking foreign companies to solve the problem for you would be a fool's move.
embedding-shape 4 hours ago
Yeah, I was about to comment that too. Instead of training a new model and new weights exclusively for Norwegian (and expecting/wanting every other small or medium-sized country to do the same), which seems infinitely harder, they could have made high-quality transcriptions and English translations of the stories currently described only in Norwegian, and made it all public. I guess there would still be a worry that it'd be counted as "less important" compared to history, news and culture about other countries.
electroglyph 4 hours ago
absolutely. somebody online was wanting an LLM with Georgian language support, and that's exactly what i suggested: start digitizing Georgian text.
_cs2017_ 3 hours ago
> Why go to the expense...

Answer: idiocy of decision makers, and the desire of those who created the proposal to get resources. I assumed Scandinavia had better decision processes, but apparently I was wrong.