iamcreasy 20 hours ago

> "open training" is something that won't ever happen for large scale models

https://www.swiss-ai.org/apertus

Source: EPFL, ETH Zurich, and the Swiss National Supercomputing Centre (CSCS) have released Apertus, Switzerland's first large-scale open, multilingual language model and a milestone in transparency and diversity for generative AI. Trained on 15 trillion tokens across more than 1,000 languages (40% of the data is non-English), Apertus includes many languages that have so far been underrepresented in LLMs, such as Swiss German and Romansh. Apertus serves as a building block for developers and organizations creating future applications such as chatbots, translation systems, or educational tools. The model is named Apertus, Latin for "open", highlighting its distinctive feature: the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.

mschuster91 18 hours ago

I wasn't aware of that one, thanks.

I should have been clearer in my wording, though - I was referring to commercially useful models.

kittikitti an hour ago

Apertus is available for commercial use under its license. It doesn't produce state-of-the-art (SOTA) results, but for many organizations it greatly reduces the risk of copyright infringement, and even if infringement does occur, there is a direct way to address it. In fact, if you were posting in good faith, I would expect people very concerned about questionable training data to be more aware of Apertus.