DeepSeek releases open-weights math model with IMO gold medal performance(huggingface.co)
56 points by victorbuilds an hour ago | 16 comments
victorbuilds an hour ago | parent | next [-]

Notable: they open-sourced the weights under Apache 2.0, unlike OpenAI and DeepMind whose IMO gold models are still proprietary.

SilverElfin 44 minutes ago | parent [-]

If they open-source just the weights and not the training code and data, then it's still proprietary.

falcor84 37 minutes ago | parent | next [-]

Isn't that a bit like saying that if I open-source a tool, but not a full compendium of all the code I read that led me to develop it, then it's not really open source?

nextaccountic 12 minutes ago | parent | next [-]

No, it's like saying that if you release only the weights under the Apache license, it's still not open source, even though the license itself is an open-source license.

For something to be open source, its sources need to be released. Sources are whatever is in the preferred form for editing. So the code used for training is obviously source (people can edit the training code to change something about the released weights). The same rationale covers the training data: people can select which data is used for training to change the weights.
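
A minimal sketch of that view, with a toy PyTorch model (made-up shapes and file name, purely illustrative): the released weights are the output of training code plus data, so editing either one changes the artifact.

    import torch

    # Toy stand-in: the "training data" and "training code" are the sources.
    data = torch.randn(100, 8)      # edit the data selection...
    labels = torch.randn(100, 1)
    model = torch.nn.Linear(8, 1)   # ...or the architecture/training loop...
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for _ in range(100):            # ...and you get different weights out.
        opt.zero_grad()
        torch.nn.functional.mse_loss(model(data), labels).backward()
        opt.step()

    # What an "open weights" release actually ships:
    torch.save(model.state_dict(), "weights.pt")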

fragmede 33 minutes ago | parent | prev | next [-]

No. In that case you're providing two things: a binary version of your tool, and the tool's source. That source is available for anyone to inspect and to build their own copy from. However, given just the weights, we don't have the source and can't inspect what alignment went into it. In the case of DeepSeek, we know they had to purposefully make their model consider Tiananmen Square something it shouldn't discuss. But without the source used to create the model, we don't know what else is lurking inside it.

NitpickLawyer 18 minutes ago | parent [-]

> However, given just the weights, we don't have the source

This is incorrect, given the definitions in the license.

> (Apache 2.0) "Source" form shall mean *the preferred form for making modifications*, including but not limited to software source code, documentation source, and configuration files.

(emphasis mine)

In LLMs, the weights are the preferred form for making modifications. Weights are not compiled from something else: you start with the weights (randomly initialised), and at every step of training you adjust the weights. That is not akin to compilation, for many reasons, both theoretical and practical.
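
To make that concrete, a toy sketch continuing the example above (hypothetical file name): you modify the released weights directly by training them further, the same edit operation the creators used, and you never need their original data to do it.

    import torch

    model = torch.nn.Linear(8, 1)
    model.load_state_dict(torch.load("weights.pt"))  # start from the release
    opt = torch.optim.SGD(model.parameters(), lr=0.001)

    my_data, my_labels = torch.randn(20, 8), torch.randn(20, 1)  # your own data
    for _ in range(10):
        opt.zero_grad()
        torch.nn.functional.mse_loss(model(my_data), my_labels).backward()
        opt.step()  # fine-tuning: adjusting the weights, exactly as in training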

In general, licenses do not give you rights over the "know-how" or "processes" by which the licensed work was created. What you get is the ability to inspect, modify, and redistribute the work as you see fit. And most importantly, you modify the work just like the creators modify the work (hence "the preferred form"). Just not with the same data (i.e. you can modify the source of Chrome all you want, just not with the know-how and knowledge of a Google engineer; the license cannot offer that).

This is also covered in the EU AI Act, btw.

> General-purpose AI models released under free and open-source licences should be considered to ensure high levels of transparency and openness if their parameters, including the weights, the information on the model architecture, and the information on model usage are made publicly available. The licence should be considered to be free and open-source also when it allows users to run, copy, distribute, study, change and improve software and data, including models under the condition that the original provider of the model is credited, the identical or comparable terms of distribution are respected.

fragmede 8 minutes ago | parent [-]

> In LLMs, the weights are the preferred form for making modifications.

No, they aren't. We happen to be able to modify the weights, sure, but why would any lab ever train anything from scratch if editing weights were the preferred form?

NitpickLawyer 4 minutes ago | parent [-]

Training is modifying the weights. How you modify them has never been the object of a license.

nurettin 12 minutes ago | parent | prev [-]

Is this a troll? They don't want to reproduce your open-source code; they want to reproduce the weights.

mips_avatar 37 minutes ago | parent | prev | next [-]

Yeah, but you can distill.

amelius 36 minutes ago | parent [-]

Is that the equivalent of decompiling?

c0balt 20 minutes ago | parent [-]

No, that is the equivalent of lossy compression.
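
A hedged sketch of why (toy linear models standing in for teacher and student): the student only ever sees the teacher's outputs on sampled inputs, never its internals or training data, so whatever the teacher does off that input distribution is discarded.

    import torch

    teacher = torch.nn.Linear(8, 1)   # stands in for the released model
    student = torch.nn.Linear(8, 1)   # the model being distilled into
    opt = torch.optim.SGD(student.parameters(), lr=0.01)

    for _ in range(100):
        x = torch.randn(64, 8)        # sampled inputs (prompts)
        with torch.no_grad():
            target = teacher(x)       # only the teacher's *outputs* are visible
        opt.zero_grad()
        torch.nn.functional.mse_loss(student(x), target).backward()
        opt.step()
    # The student approximates the teacher's behaviour on these inputs;
    # anything off-distribution is lost, hence "lossy compression".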

amelius 36 minutes ago | parent | prev [-]

True. But the headline says open weights.

ilmj8426 41 minutes ago | parent | prev | next [-]

It's impressive to see how fast open-weights models are catching up in specialized domains like math and reasoning. I'm curious whether anyone has tested this model on complex logic tasks in coding. Strong math performance sometimes correlates well with debugging and algorithm generation.

yorwba 25 minutes ago | parent | prev | next [-]

Previous discussion: https://news.ycombinator.com/item?id=46072786 (218 points, 3 days ago, 48 comments)

terespuwash 20 minutes ago | parent | prev [-]

Why isn’t OpenAI’s gold medal-winning model not available to the public yet?