simonw 4 days ago
The Open Source Initiative themselves decided last year to relax their standards for AI models: they don't require the training data to be released. https://opensource.org/ai They do continue to require the core freedoms, most importantly "Use the system for any purpose and without having to ask for permission". That's why a lot of the custom licenses (Llama etc.) don't fit the OSI definition.
thewebguyd 4 days ago
> The Open Source Initiative themselves decided last year to relax their standards for AI models: they don't require the training data to be released.

Poor move IMO. Training data should be required to be released for a model to be considered open source. Without it, all I can do is set weights, etc. Without training data I can't truly reproduce the model, inspect the data for biases, audit the model for fairness, or make improvements and redistribute them (a core open source ethos). Keeping the training data closed means it's not truly open.
amelius 4 days ago
I don't agree with that definition. For a given model I want to know what I can/cannot expect from it. To have a better understanding of that, I need to know what it was trained on. For a (somewhat extreme) example, what if I use the model to write children's stories, and suddenly it regurgitates Mein Kampf? That would certainly ruin the day.