miohtama 20 hours ago

I am not an expert here, so I want to ask: what's magical about the 405B number?

daveguy 20 hours ago | parent [-]

That's the size of the largest, most capable open source models. Specifically, Llama 3.1 has 405B parameters. DeepSeek's largest model has 671B parameters.

mhitza 20 hours ago | parent [-]

Small corrections. Llama 3.1 is not an Open Source model, but a Llama 3.1 Licensed model. Apparently neither is DeepSeek (https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/LIC...), though I was under the false impression that it was. I never considered using it, so I hadn't checked the license before.

gunalx 18 hours ago | parent | next [-]

Both DeepSeek R1 and V3-0324 are MIT licensed.

Der_Einzige 15 hours ago | parent | prev [-]

You can just ignore the license since the existence of these models is based on piracy at a scale never before seen. Aaron Swartz couldn’t have even imagined violating copyright that hard.

If you live in a glass house, you won’t throw stones. No one in the LLM space wants to be litigious.

It’s an open secret that DeepSeek used a ton of OpenAI continuations both in pretraining and in distillation. That totally violates OpenAI's TOS. No one cares.

LoganDark 14 hours ago | parent [-]

> No one in the LLM space wants to be litigious

Except for OpenAI.