	▲	HarHarVeryFunny 7 months ago
		What are the limitations on which LLMs (specific transformer variants, etc.) llama.cpp can run? Does it require the input model/weights to be in some self-describing format like ONNX, which supports different model architectures as long as they are built out of specific module/layer types, or does it more narrowly support only transformer models parameterized by depth, width, etc.?