nateb2022 | 3 hours ago
> but people should use llama.cpp instead

MLX is a lot more performant than Ollama and llama.cpp on Apple Silicon, comparing both peak memory usage and tok/s output.

edit: LM Studio benefits from MLX optimizations when running MLX-compatible models.
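For anyone who wants to try MLX directly rather than through LM Studio, a minimal sketch using the mlx-lm package (Apple Silicon only; the model repo name is just an example from the mlx-community quantized conversions):

```python
# pip install mlx-lm   (requires Apple Silicon / macOS)
from mlx_lm import load, generate

# Load an MLX-format quantized model from the Hugging Face hub.
# The exact repo name here is illustrative, not a recommendation.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

# Run a single generation; verbose=True prints tok/s and peak memory,
# which is how you'd reproduce the kind of comparison mentioned above.
text = generate(
    model,
    tokenizer,
    prompt="Explain the difference between MLX and llama.cpp in one paragraph.",
    max_tokens=200,
    verbose=True,
)
print(text)
```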