baalimago 5 hours ago

Yeah, it's just a semantic pet peeve. Let me ask you this: what is a plain "Language Model", if this is a "Large Language Model"? Conversely, if a 1.5B model is "Large", then what are the recent 1T-param models? "Superlarge"?

In my own very humble opinion, it becomes "Large" when it outgrows non-specialized hardware. So currently, a model that requires more than 32GB of VRAM is large (as that's roughly where high-end gaming GPUs cut off).
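
Back-of-envelope (a rough sketch, counting weights only; it ignores KV cache, activation memory, and framework overhead, and the precisions are just the usual ones):

    # Back-of-envelope: how many parameters fit in a given VRAM budget,
    # counting weights only (no KV cache, activations, or overhead).
    BYTES_PER_PARAM = {"fp32": 4, "fp16/bf16": 2, "int8": 1, "int4": 0.5}

    vram_gib = 32  # roughly the high-end gaming GPU cutoff
    for precision, nbytes in BYTES_PER_PARAM.items():
        max_params = vram_gib * 2**30 / nbytes
        print(f"{precision}: ~{max_params / 1e9:.0f}B params in {vram_gib} GiB")

So at fp16 the 32GB cutoff lands around ~17B params, which feels about right for "large".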

And btw, there is no way you can train a language model on a CPU, even with DDR5, unless you're happy to wait a week for a single training run. Give it a go! I know I did; it's an order of magnitude away from being feasible.
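
For a sense of scale, using the common ~6 FLOPs per parameter per token rule of thumb for training (the model size, token count, and throughput numbers here are ballpark assumptions, not benchmarks):

    # Rough training-time estimate via the ~6 * params * tokens rule of thumb.
    # Throughput figures are ballpark assumptions, not measurements.
    params = 124e6   # a GPT-2-small-sized model (assumed for illustration)
    tokens = 10e9    # a modest training run (assumed for illustration)
    total_flops = 6 * params * tokens

    sustained_flops_per_s = {
        "desktop CPU (~0.5 TFLOP/s sustained)": 0.5e12,
        "high-end gaming GPU (~50 TFLOP/s fp16)": 50e12,
    }
    for device, rate in sustained_flops_per_s.items():
        days = total_flops / rate / 86400
        print(f"{device}: ~{days:.0f} days")

Even for a small model the CPU estimate comes out in the hundreds of days versus a couple of days on a GPU, which matches my experience.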

joefourier 43 minutes ago | parent

Calling anything "large" in computing is problematic, since hardware keeps improving. GPT-1 was considered an LLM back in 2018 and had only 117M parameters; when did it stop being large?

GPT would have been a better term than LLM, but it unfortunately became too associated with OpenAI. And then, what about non-transformer LLMs? And multimodal LLMs?

Maybe we should just give up, shrug and call it "AI".