lllllm 4 days ago

benchmarks: we provide plenty in the tech report (over 100 pages) here https://github.com/swiss-ai/apertus-tech-report/blob/main/Ap...

quantizations: available now in MLX https://github.com/ml-explore/mlx-lm (GGUF coming soon; it is not trivial because of the new architecture)

model sizes: many good dense models today still fall in the range between our chosen small and large sizes

dcreater 4 days ago | parent [-]

Thank you! Why are the comparisons to llama3.1 era models?

lllllm 4 days ago | parent [-]

we compared to GPT-OSS-20B, Llama 4, Qwen 3, among many others. Which models do you think are missing, among open weights and fully-open models?

Note that we have a specific focus on multilinguality (over 1000 languages supported), not only on English

kamranjon 4 days ago | parent | next [-]

How did it compare with the Gemma 3 models? I’ve been impressed with Gemma 27b, but I try out local models frequently, and I’m excited to boot up your 70b model on my 128GB MacBook Pro when I get home!

dcreater 4 days ago | parent | prev [-]

Ah, I'm sorry, I missed that. I'm not usually that blind.