Remix clone Hacker News


	▲	minimaxir 3 hours ago
		We really need a replacement for all-MiniLM-L12-v2 that can create more robust embeddings with the same compute. You can technically do Q4 quantization for larger embedding models but I am not sure if that plays nice with ONNX.
	▲	electroglyph 2 hours ago
		it's a pain in the ass to do properly. what we really need is something like auto-round for ONNX