baalimago 5 hours ago
Yeah, it's just a semantic pet peeve. Let me ask you this: what is a "Language Model" if this is a "Large Language Model"? Conversely, if a 1.5B model is "Large", then what are the recent 1T-param models? "Superlarge"? In my own very humble opinion, a model becomes "Large" when it no longer fits on non-specialized hardware. So currently, a model that requires more than 32GB of VRAM is large (as that's roughly where high-end gaming GPUs cut off). And btw, there is no way you can train a language model on a CPU, even with DDR5, unless you're willing to wait a whole week for a single training cycle. Give it a go! I know I did, and it's an order of magnitude away from being feasible.
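For a rough sense of where a 32GB cut-off lands, here is a back-of-the-envelope sketch. It counts memory for the weights only and ignores activations, optimizer state, and KV cache. The byte widths, the decimal-gigabyte convention, and the helper names `weight_memory_gb` and `is_large` are assumptions made up for illustration, not anyone's official definition.

```python
# Weights-only memory estimate. Real training or inference needs more
# (activations, optimizer state, KV cache), so treat this as a lower bound.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}  # assumed precisions
CONSUMER_VRAM_GB = 32  # the cut-off suggested in the comment above


def weight_memory_gb(n_params: float, dtype: str = "fp16") -> float:
    """Decimal gigabytes needed to hold n_params weights at the given precision."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def is_large(n_params: float, dtype: str = "fp16") -> bool:
    """True if the weights alone exceed the assumed consumer VRAM cut-off."""
    return weight_memory_gb(n_params, dtype) > CONSUMER_VRAM_GB


if __name__ == "__main__":
    for name, n in [("117M", 117e6), ("1.5B", 1.5e9), ("1T", 1e12)]:
        gb = weight_memory_gb(n)
        print(f"{name}: {gb:.2f} GB at fp16, large={is_large(n)}")
```

By this measure a 1.5B model at fp16 sits far below the cut-off and a 1T model far above it, though the verdict for mid-sized models shifts with the precision you pick.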
joefourier 43 minutes ago
Calling anything "large" in computing is problematic since hardware keeps improving. GPT-1 was an LLM in 2018 and had 117M parameters; when did it stop being large? GPT would have been a better term than LLM, but it unfortunately became too associated with OpenAI. And then, what about non-transformer LLMs? And multimodal LLMs? Maybe we should just give up, shrug, and call it "AI".