paulddraper 3 days ago:
Reductive. Doesn’t explain DeepSeek.
FergusArgyll 3 days ago:
The DeepSeek story was way overblown. Read the gpt-oss paper: the final training run is not the only expense. You also have multiple experimental training runs as well as failed ones. Plus, they were behind SOTA even then.