dcreater 4 days ago
I want this to succeed and hope it does. But the tea leaves don't look good at the moment:
- model sizes the industry was at 2-3 generations ago (Llama 3.1 era)
- conspicuous lack of benchmark results in the announcements
- not on OpenRouter, no GGUFs as yet
lllllm 4 days ago
benchmarks: we provide plenty in the 100+ page tech report here: https://github.com/swiss-ai/apertus-tech-report/blob/main/Ap...
quantizations: available now in MLX (https://github.com/ml-explore/mlx-lm); GGUF is coming soon, but it is not trivial due to the new architecture
model sizes: many good dense models today still lie in the range between our chosen small and large sizes
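For reference, a minimal sketch of running one of the MLX conversions through the mlx-lm Python API; the repo id below is a hypothetical placeholder, so check the mlx-community organization on Hugging Face for the actual Apertus conversions:

    # Minimal mlx-lm usage sketch; the model id is a placeholder, not a confirmed repo.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/Apertus-8B-Instruct")  # hypothetical repo id
    text = generate(model, tokenizer, prompt="Hello from Apertus", max_tokens=64)
    print(text)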