SparkyMcUnicorn 4 days ago
Your question is addressed in the opening abstract: https://github.com/swiss-ai/apertus-tech-report/raw/refs/hea...

> Unlike many prior models that release weights without reproducible data pipelines or regard for content-owner rights, Apertus models are pretrained exclusively on openly available data, retroactively respecting robots.txt exclusions and filtering for copyrighted, non-permissive, toxic, and personally identifiable content.