hnav 19 hours ago

Superior architectures will leak pretty quickly via engineers. Withholding your best models doesn't work unless you have no competition.

ninjagoo 18 hours ago | parent | next [-]

> Superior architectures will leak pretty quickly via engineers.

I agree with the outcome of your premise (i.e., openness), but for different reasons:

First, isn't it the case that these bleeding-edge 'newfangled' LLMs are basically variations on the same core ideas from "Attention Is All You Need" (2017) [1]? Different scale, but still the same basic architecture. Even the "MoE" (mixture-of-experts) innovation keeps the Transformer attention stack and only replaces or augments the dense feed-forward/MLP part with routed expert blocks.
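To make the MoE point concrete, here is a rough pure-Python sketch of swapping one dense feed-forward block for routed experts. It assumes top-k softmax gating, which is one common routing choice and not necessarily what any particular lab ships. All names (`dense_ffn`, `moe_ffn`, `router_W`) and sizes are made up for illustration, and attention, load balancing, and training are left out entirely.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a short list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dense_ffn(x, W_in, W_out):
    # Plain Transformer-style MLP: project up, ReLU, project back down.
    hidden = [max(0.0, h) for h in matvec(W_in, x)]
    return matvec(W_out, hidden)

def moe_ffn(x, router_W, experts, top_k=2):
    # Assumed routing scheme: score every expert, keep the top_k,
    # and mix their outputs with softmax weights over the kept scores.
    scores = matvec(router_W, x)  # one logit per expert
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, i in zip(gates, chosen):
        W_in, W_out = experts[i]
        y = dense_ffn(x, W_in, W_out)  # each expert is itself a small dense FFN
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_model, d_hidden, n_experts = 4, 8, 4  # toy sizes, purely illustrative
experts = [
    (rand_matrix(d_hidden, d_model, rng), rand_matrix(d_model, d_hidden, rng))
    for _ in range(n_experts)
]
router_W = rand_matrix(n_experts, d_model, rng)

token = [0.5, -1.0, 0.25, 2.0]  # stand-in for one token's hidden state
out, chosen = moe_ffn(token, router_W, experts, top_k=2)
print("experts used:", chosen)
print("output:", out)
```

The block's input and output shapes match the dense FFN it replaces, which is why the attention layers around it don't need to change. Only `top_k` of the `n_experts` blocks run per token.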

And I would argue that engineers aren't the ones working on new architectures. That would be researchers, working on:

  State-space models/Mamba (CMU/Princeton ecosystem),
  Diffusion Language Models (Inception Labs),
  Long-convolution architectures/Hyena (Stanford etc.),
  RWKV/Recurrent LLMs (open-source community),
  Memory-augmented architectures (Google Research/DeepMind?),
  World models/spatial intelligence (LeCun/Fei-Fei Li/DeepMind),
  Symbolic/neurosymbolic alternatives,
  Thousand brains (Numenta).

That research is still open, so the outcome you propose (openness) is likely to come to pass. Researchers/scientists gotta publish, otherwise it's not science (to quote LeCun [2]).

[1] https://arxiv.org/abs/1706.03762

[2] https://x.com/ylecun/status/1795589846771147018

2001zhaozhao 19 hours ago | parent | prev [-]

> Withholding your best models doesn't work unless you have no competition.

It could also work if you DO have competition but your compute capacity is overbooked anyway, so releasing the better model doesn't actually make you that much more money (other than by raising prices for the same amount of compute, which would give limited gains).

This is pretty much the situation Anthropic is in today.

hnav 18 hours ago | parent [-]

That just means that Anthropic is fucked unless they get more capacity.