It's not even hard, just slow. You could do it on a single cheap server (compared to a rack full of GPUs): run a CPU LLM inference engine and limit it to a single thread.
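
As a rough sketch, here's what that looks like with llama-cpp-python (just one example of a CPU inference engine; the model path and prompt below are placeholders):

```python
# Minimal sketch: single-threaded CPU inference via llama-cpp-python
# (pip install llama-cpp-python). Any GGUF-quantized model works;
# the path below is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-7b.Q4_K_M.gguf",  # placeholder model file
    n_threads=1,        # token generation pinned to one CPU thread
    n_threads_batch=1,  # prompt processing also pinned to one thread
)

out = llm("Explain why single-threaded inference is slow:", max_tokens=64)
print(out["choices"][0]["text"])
```

The llama.cpp CLI equivalent is just passing `-t 1`. Tokens per second will be painful, but it runs.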