Remix clone Hacker News

	rao-v 6 days ago
		Fabulous stuff! Oh, my request … the vision head on the Gemma models is super slow for CPU inference (and via Vulkan), even via llama.cpp. Any chance your team can figure out a fix? Other ViTs don’t have the same problem.