Remix clone Hacker News

	marssaxman 7 hours ago
		I used vLLM and qwen3-coder-next to batch-process a couple million documents recently. No token quota, no rate limits, just 100% GPU utilization until the job was done.
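
		For readers who want to try something similar, here is a minimal sketch of offline batch inference with vLLM's `LLM.generate`. It is not the commenter's actual script. The prompt template, the chunk size, the sampling settings, and the helper names `chunked`, `build_prompt`, and `process_documents` are all assumptions made for illustration, and the model identifier is left as a parameter because the comment does not give the exact name used to load it.

```python
from typing import Iterable, Iterator, List


def chunked(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_prompt(document: str) -> str:
    # Hypothetical template: the comment does not say what the task was.
    return f"Summarize the following document:\n\n{document}\n\nSummary:"


def process_documents(
    documents: Iterable[str], model_name: str, chunk_size: int = 10_000
) -> Iterator[str]:
    """Yield one generated text per input document, in input order."""
    # Imported lazily so the helpers above work without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_name)
    # Assumed settings: greedy decoding, short outputs.
    params = SamplingParams(temperature=0.0, max_tokens=256)

    # Feed large chunks so vLLM's scheduler can keep the GPU busy;
    # chunking only bounds how many prompts are held in memory at once.
    for chunk in chunked(documents, chunk_size):
        prompts = [build_prompt(doc) for doc in chunk]
        for output in llm.generate(prompts, params):
            yield output.outputs[0].text
```

		Because the model runs locally, throughput is limited by the hardware rather than by an API quota, which matches the experience described above.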