geremiiah 9 hours ago

Their representation is simpler, just a transformer. That means you can plug in all the theory and tools developed specifically for transformers, and most importantly you can scale the model more easily. But more than that, I think it shows there was no magic to AlphaFold. The details of the architecture and training method didn't matter much; all that was needed was training a big enough model on a large enough dataset. Indeed, many people who have experimented with AlphaFold have found it behaves much like LLMs: it performs well on inputs close to the training dataset but doesn't generalize well at all.
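To make "just a transformer" concrete, here is a minimal sketch of the kind of plain encoder the comment has in mind: amino-acid tokens in, per-residue 3-D coordinates out, with nothing domain-specific in between. The class name, vocabulary size, hyperparameters, positional embedding, and coordinate head are illustrative assumptions, not the actual model from the paper.

```python
# Minimal sketch: a vanilla transformer encoder over residue tokens.
# All sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class PlainFoldingTransformer(nn.Module):
    def __init__(self, vocab_size=21, d_model=512, nhead=8,
                 num_layers=12, max_len=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)   # one token per residue
        self.pos = nn.Embedding(max_len, d_model)        # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead,
            dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.coord_head = nn.Linear(d_model, 3)          # x, y, z per residue

    def forward(self, tokens):                           # tokens: (batch, len)
        pos = torch.arange(tokens.size(1), device=tokens.device)
        h = self.embed(tokens) + self.pos(pos)
        h = self.encoder(h)                              # standard attention stack
        return self.coord_head(h)                        # (batch, len, 3)

model = PlainFoldingTransformer()
seq = torch.randint(0, 21, (2, 128))                     # two dummy sequences
coords = model(seq)                                      # (2, 128, 3)
```

Because nothing here is protein-specific, all the usual transformer machinery (scaling laws, efficient attention kernels, standard training recipes) applies directly, which is the point the comment is making.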

johncolanduoni 2 hours ago

Except their dataset is mostly the output of AlphaFold, which had to use the much smaller dataset of proteins analyzed by crystallography as its input. This is really an exercise in model distillation - a worthy endeavor, but it's not as if they could have taken their architecture and the dataset AlphaFold had and expected to get the same results. If that were the case, that's what they would have done, because it would have been much more impressive.
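As a rough illustration of the distillation framing, here is a sketch of the standard student-teacher recipe: a frozen teacher (standing in for AlphaFold) labels inputs, and a smaller student regresses onto those labels. The toy models, the stand-in features, and the MSE loss are all assumptions for illustration, not the actual training setup.

```python
# Minimal sketch of model distillation: the teacher's predictions
# become the student's training targets. Everything here is a toy.
import torch
import torch.nn as nn

teacher = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 3))
student = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 3))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(200):
    feats = torch.randn(32, 64)              # stand-in residue features
    with torch.no_grad():
        target = teacher(feats)              # frozen teacher labels the batch
    loss = nn.functional.mse_loss(student(feats), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The key asymmetry the comment points at: the teacher can generate labels for essentially unlimited inputs, so the student's "dataset" can be far larger than the crystallography data the teacher itself was trained on.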

aDyslecticCrow 6 hours ago

I think "simplicity is good" is the wrong conclusion to draw. Simplicity is simply good scientific methodology.

Doing too many things at once makes methods hard to adopt and makes conclusions harder to draw. So we look for simple methods that show a measurable gain, so we can adapt them to future approaches.

It's a cycle between complexity and simplicity. When a new simple, scalable approach beats the previous state of the art, that just means we've discovered a new local maximum to climb.

visarga 8 hours ago

> But more than that, I think it shows there was no magic to AlphaFold. The details of the architecture and training method didn't matter much; all that was needed was training a big enough model on a large enough dataset.

People often like to say that we just need one or two more algorithmic breakthroughs for AGI. But in reality it's the dataset and environment-based learning that matter. Almost any model would do if you collected the data. It's not inside the model; it's outside the model where we need to work.