Yes, this seems like a good way to go. For example, you can already find many quantized versions under https://huggingface.co/models?search=apertus%20mlx and elsewhere.
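
A minimal sketch of the same lookup from Python, assuming the `huggingface_hub` package is installed; it runs the same "apertus mlx" search as the URL above and prints the matching repo ids:

```python
from huggingface_hub import HfApi

api = HfApi()
# Search the Hub for models matching "apertus mlx" (same query as the link above).
for model in api.list_models(search="apertus mlx", limit=20):
    print(model.id)
```

Any of the returned MLX-format repo ids can then be pulled locally, e.g. with `mlx_lm.load(<repo_id>)` if you are using mlx-lm.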