stephenpontes 10 hours ago

I remember first hearing about protein folding with the Folding@Home project (https://foldingathome.org) back when I had a spare media server and energy was cheap (free) in my college dorm. I'm not knowledgeable on this, but have we come a long way in making protein folding tractable on today's hardware, or does that only apply to certain types of problems?

It seems like the Folding@Home project is still around!

roughly 9 hours ago | parent | next [-]

As I understand it, Folding@Home was a physics-based simulation solver, whereas AlphaFold and its progeny (including this) are statistical methods. The statistical methods are much, much cheaper computationally, but they rely on known protein folds and can't generate strong predictions for proteins that don't have some similarity to proteins in their training set.

In other words, it's a different approach that trades versatility for speed, but the trade-off is significant enough to make it viable to generate folds for pretty much any protein you're interested in - it moves folding from something that's almost computationally infeasible for most projects to something you can just do for any protein as part of a normal workflow.
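To make the "normal workflow" point concrete, here's a minimal sketch - assuming the public ESMFold endpoint at api.esmatlas.com is still up (the URL and plain-text PDB response are the assumptions here); a local AlphaFold or ESMFold install would fill the same role:

    import requests

    # Toy sequence for illustration, not a real protein of interest.
    sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

    # POST the sequence, get a predicted structure back in PDB format.
    # Seconds of wall time, versus CPU-years for a physics simulation.
    resp = requests.post(
        "https://api.esmatlas.com/foldSequence/v1/pdb/",
        data=sequence,
        timeout=600,
    )
    resp.raise_for_status()

    with open("prediction.pdb", "w") as f:
        f.write(resp.text)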

cowsandmilk 4 hours ago | parent | next [-]

1. I would be hesitant to say Folding@Home isn't statistics-based; it uses Markov state models, which are very much grounded in statistics (a toy sketch of the idea follows after point 2). And its current force fields are parameterized via machine learning (https://pubs.acs.org/doi/10.1021/acs.jctc.0c00355).

2. The biggest difference between Folding@Home and AlphaFold is that Folding@Home tries to generate the full folding trajectory, while AlphaFold does only protein structure prediction, looking to match the folded crystal structure. Folding@Home can do things like investigate how a mutation may make a protein take longer to fold, or be more or less stable in its folded state; AlphaFold doesn't try to do that.
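As promised above, a toy sketch of the Markov-state-model idea - the states and transition counts below are invented, but the machinery (discretize a trajectory into states, count transitions, read equilibrium populations off the transition matrix) is the statistical core:

    import numpy as np

    # A folding trajectory discretized into 3 conformational states:
    # 0 = unfolded, 1 = intermediate, 2 = folded. Counts are made up.
    traj = [0, 0, 1, 1, 2, 2, 2, 1, 2, 2, 0, 1, 2, 2, 2]

    counts = np.zeros((3, 3))
    for a, b in zip(traj[:-1], traj[1:]):
        counts[a, b] += 1

    # Row-normalize transition counts into a transition probability matrix.
    T = counts / counts.sum(axis=1, keepdims=True)

    # The stationary distribution (left eigenvector with eigenvalue 1)
    # estimates equilibrium populations, e.g. how stable the folded state is.
    w, v = np.linalg.eig(T.T)
    pi = np.real(v[:, np.argmax(np.real(w))])
    pi /= pi.sum()
    print("equilibrium populations:", pi.round(3))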

roughly 2 hours ago | parent [-]

You're right, that's true - I'd glossed over the Folding@Home methodology a bit. I think the core distinction is still that Folding@Home tries to divine the fold via simulation, while AlphaFold plays closer to a GPT-style predictor relying on training data.

I actually really like AlphaFold because of that. The core recognition - that an amino acid string's relationship to the structure and function of a protein is akin to the cross-interactions of words in a paragraph with the overall meaning of the excerpt - is one of those beautiful revelations that come along only so often, and it produced exactly the kind of leap AlphaFold was for the field. The technique has a lot of limitations, but it's the kind of field cross-pollination that generates the most interesting new developments.
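Since that analogy is essentially attention, here's a minimal self-attention pass over a toy amino-acid sequence - random weights, purely illustrative, and nothing like AlphaFold's actual Evoformer:

    import numpy as np

    rng = np.random.default_rng(0)

    # Embed each residue of a short peptide; every position then attends
    # to every other, the same pairwise mixing GPT applies to words.
    seq = "MKVLA"
    d = 8
    emb = {aa: rng.normal(size=d) for aa in set(seq)}
    X = np.stack([emb[aa] for aa in seq])          # (length, d)

    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    Q, K, V = X @ Wq, X @ Wk, X @ Wv

    scores = Q @ K.T / np.sqrt(d)                  # residue-residue affinities
    attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    out = attn @ V                                 # each residue updated by all others

    print(attn.round(2))  # row i: how much residue i "looks at" each residue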

_joel 9 hours ago | parent | prev | next [-]

Yep, that and SETI@Home. I loved the eye candy, even if I didn't fully know what it meant.

gregsadetsky 9 hours ago | parent | next [-]

That and project RC5 from the same time period..! :-)

https://www.distributed.net/RC5

https://en.wikipedia.org/wiki/RSA_Secret-Key_Challenge

I wonder what kind of performance I would get on an M1 computer today... haha

EDIT: people are still participating in rc5-72...?? https://stats.distributed.net/projects.php?project_id=8

seydor 9 hours ago | parent | prev [-]

How come we don't have AI@Home?

throwup238 9 hours ago | parent | next [-]

The network bandwidth between nodes is a bigger limitation than compute. The newest Nvidia cards now come with 400 Gbit buses to communicate between them, even on a single motherboard.

Compared to SETI or Folding@Home workloads, this would be glacially slow for AI models.
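Back-of-envelope (the home-uplink figure is an assumption):

    # Interconnect gap between a datacenter GPU fabric and a volunteer
    # machine on home broadband. Link speeds are rough assumptions.
    nvlink_gbps = 400        # per-card fabric bandwidth mentioned above
    home_uplink_gbps = 0.03  # ~30 Mbit/s residential upload, assumed
    print(f"~{nvlink_gbps / home_uplink_gbps:,.0f}x slower")  # ~13,333x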

fourthark 8 hours ago | parent [-]

Seems like training would be a better match, where you need tons of compute but don’t care about latency.

ronsor 36 minutes ago | parent [-]

No, the problem is that with training, you do care about latency, and you need a crap-ton of bandwidth too! Think of the all_gather; think of the gradients! Inference is actually easier to distribute.
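To put rough numbers on it - naive data parallelism syncs every gradient every step (the model size and link speed below are assumptions):

    # Per-step gradient traffic for an assumed 7B-parameter model
    # with fp16 (2-byte) gradients over an assumed 30 Mbit/s uplink.
    params = 7e9
    grad_bytes = params * 2                      # 14 GB of gradients per step
    uplink_bytes_per_s = 30e6 / 8
    print(grad_bytes / uplink_bytes_per_s / 60)  # ~62 minutes per step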

jffry 6 hours ago | parent | prev | next [-]

Apparently, per an F@H blog post [1], it's still useful to know the dynamics of how a protein folded, in addition to the final folded shape - and ML-folded proteins are a rich target for simulations that validate the structures and probe how the proteins work.

[1] https://foldingathome.org/2024/05/02/alphafold-opens-new-opp...

EasyMark 6 hours ago | parent | prev | next [-]

They're still going and have made some great discoveries over the years.

https://foldingathome.org/papers-results/?lng=en

ge96 8 hours ago | parent | prev | next [-]

I contributed a lot on there too; used my 3080 Ti FE as a small heater in the winter.

EasyMark 6 hours ago | parent [-]

lol I still run it in the winter, but I feel bad running it in the summer, so I don't run it when the A/C is on or when heating isn't needed. I figure some contribution is infinitely more than zero contribution.

nkjoep 10 hours ago | parent | prev [-]

Team F@H forever!