Remix clone Hacker News


	akawry 8 days ago
		Take a look at ik_llama.cpp (https://github.com/ikawrakow/ik_llama.cpp). Its CPU performance is much better than mainline llama.cpp, and it has more quantization types available.