kuil009 4 hours ago

The positioning makes sense, but I’m still somewhat skeptical.

The power, cooling, and TCO constraints on inference are real, especially in air-cooled data centers, so targeting them makes sense.

But the benchmarks shown are narrow, and it’s unclear how well these results generalize across models and mixed production workloads. GPUs are less efficient here, but their flexibility still matters.