Remix clone Hacker News

	▲	cadamsdotcom 2 hours ago
		Transformers scale poorly with context window size and parameter count, which means it’s really impressive when those N’s are small! I’m but a pundit in this area, so I don’t know much. But one wonders if there’s a future in burning larger models into FPGAs: whether big enough FPGAs exist (or can be built), and whether locating specialized compute right next to the memory it needs can speed things up. It would likely need a lot of algorithm parallelism work that’d translate back to CPUs/GPUs.
	▲	T-A an hour ago
		Related: https://www.spheron.network/blog/etched-ai-sohu-vs-nvidia-tr...