literalAardvark 2 days ago

They're not blenders.

This is clear from the fact that you can distill the logic ability of a 700B-parameter model into a 14B model and retain almost all of it.

You just lose knowledge, which can be provided externally, and which is the actual "pirated" part.

The logic is _learned_.
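
For context, the distillation being described here usually means training the small model to match the big model's output distribution, not copying its weights. A minimal sketch of the standard soft-label loss in PyTorch (names and the temperature value are illustrative, not any specific lab's pipeline):

    import torch.nn.functional as F

    def distill_loss(student_logits, teacher_logits, T=2.0):
        # Soften both distributions with temperature T, then push the
        # student toward the teacher's output distribution via KL.
        soft_teacher = F.softmax(teacher_logits / T, dim=-1)
        log_student = F.log_softmax(student_logits / T, dim=-1)
        # The T*T factor keeps the gradient scale comparable across T
        # (as in Hinton et al. 2015).
        return F.kl_div(log_student, soft_teacher,
                        reduction="batchmean") * T * T

Note that only the teacher's input-output behaviour on the distillation data is transferred this way, which is why what survives distillation depends heavily on that data.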

encyclopedism 2 days ago | parent | next [-]

It hasn't learned any LOGIC. It has 'learned' patterns from the input.

theshrike79 a day ago | parent [-]

What is logic other than applying patterns?

encyclopedism a day ago | parent [-]

The definition is broad, but for now this will do: logic is the study of correct reasoning.

vidarh 16 hours ago | parent [-]

How is that different from applying patterns?

bayindirh 2 days ago | parent | prev [-]

Are there any recent publications about it so I can refresh myself on the matter?

D-Machine 2 days ago | parent [-]

You won't find any trustworthy papers on the topic because GP is simply wrong here.

That models can be distilled has no bearing whatsoever on whether a model has learned actual knowledge or understanding ("logic"). Models have always learned sparse (or approximately sparse) and redundant weights, which is exactly what makes distillation and compression possible; underneath, they are all still doing manifold-fitting.
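
To make "manifold-fitting" concrete, here is a toy sketch (all of the architecture and data choices are arbitrary): a small MLP fit to noisy samples of a circle learns the shape of the data and nothing more.

    import torch

    torch.manual_seed(0)
    t = torch.rand(256, 1) * 6.28
    # Noisy samples of a 1-D manifold (a circle) embedded in 2-D.
    data = torch.cat([t.cos(), t.sin()], dim=1) + 0.05 * torch.randn(256, 2)

    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(2000):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(net(t), data)
        loss.backward()
        opt.step()
    print(loss.item())  # near the noise floor: the curve is fit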

The resulting embeddings from such fitting reflect semantics and semantic patterns. For LLMs trained on the internet, the semantic patterns learned are linguistic, which are not just strictly logical, but also reflect emotional, connotational, conventional, and frequent patterns, all of which can be illogical or just wrong. While linguistic semantic patterns are correlated with logical patterns in some cases, this is simply not true in general.