TrueDuality 5 days ago

We're currently running ~30 Llama 3.1 models, each with a different fine-tuned LoRA adapter for its specific task. There was some initial pain as we refined the prompts, but we've been stable and happy for a while.
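For anyone wondering what that looks like in practice, here's a minimal sketch of the general pattern (not our exact setup), assuming a vLLM-style deployment where one base model serves many LoRA adapters; the adapter name and paths are placeholders:

    # One base model, many task-specific LoRA adapters selected per request.
    # Adapter names/paths are illustrative placeholders.
    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    llm = LLM(
        model="meta-llama/Llama-3.1-8B-Instruct",
        enable_lora=True,
        max_loras=32,  # room for ~30 task adapters in flight
    )
    sampling = SamplingParams(temperature=0.2, max_tokens=256)

    # Each request names the adapter for its task, so one deployment
    # covers all of the fine-tunes.
    outputs = llm.generate(
        ["Summarize the following incident report: ..."],
        sampling,
        lora_request=LoRARequest("incident-summarizer", 1, "/adapters/incident-summarizer"),
    )
    print(outputs[0].outputs[0].text)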

Since the Qwen3 0.6B model came out we've been training those as well. We can't quite compare apples-to-apples, since we now have a better, deeper training dataset built from the pathological and exceptional cases that came out of our production environment. Right now those look to be at about parity with our existing stack on quality, and they're quite a bit faster.
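If you're curious what a fine-tune like that involves, here's a minimal sketch assuming Hugging Face PEFT + TRL; the dataset path, hyperparameters, and output directory are placeholders, not our actual training regimen:

    # LoRA supervised fine-tune of Qwen3 0.6B on hard cases exported
    # from production. File name and hyperparameters are placeholders.
    from datasets import load_dataset
    from peft import LoraConfig
    from trl import SFTConfig, SFTTrainer

    # Expects a "text" (or chat "messages") column in the JSONL.
    dataset = load_dataset("json", data_files="hard_cases.jsonl", split="train")

    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )

    trainer = SFTTrainer(
        model="Qwen/Qwen3-0.6B",      # small base model under evaluation
        train_dataset=dataset,
        peft_config=lora_config,      # only the adapter weights are trained
        args=SFTConfig(output_dir="qwen3-0.6b-task-lora", num_train_epochs=3),
    )
    trainer.train()
    trainer.save_model("qwen3-0.6b-task-lora")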

I'm going to try to run through one of our training regimens with this model and see how it compares. We're not quite running models this small yet, but it wouldn't surprise me if we could.