misiti3780 2 days ago:
What hardware are you running them on? Are you using Ollama?
vunderba 2 days ago:
I'm using the default llama-server that's part of Gerganov's llama.cpp inference framework, running on a headless machine with a 16GB NVIDIA GPU. Ollama is a bit easier to ease into, though, since it comes with a preset model library.
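For anyone curious what "headless" looks like in practice: llama-server exposes an OpenAI-compatible HTTP API, so you can talk to it from any machine on the network. Here's a minimal sketch in Python, assuming a server already running on localhost:8080 (the host, port, and prompt are illustrative, not from the comment above):

    # Query a local llama-server over its OpenAI-compatible
    # /v1/chat/completions endpoint. Host/port are assumptions;
    # adjust to wherever your server is listening.
    import json
    import urllib.request

    def chat(prompt: str, host: str = "localhost", port: int = 8080) -> str:
        payload = {
            # llama-server serves a single loaded model, so no
            # "model" field is needed here.
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.7,
        }
        req = urllib.request.Request(
            f"http://{host}:{port}/v1/chat/completions",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        return body["choices"][0]["message"]["content"]

    if __name__ == "__main__":
        print(chat("Hello from a headless box!"))

Since the API shape matches OpenAI's, most existing client libraries and tools work against it unchanged by just pointing the base URL at your own machine.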