maxloh 12 hours ago

Is the training cost really that high, though? The Allen Institute (a non-profit) just released the Molmo 2 and Olmo 3 models. They trained these from scratch using public datasets, and they are performance-competitive with Gemini in several benchmarks [0] [1]. AMD was also able to successfully train an older version of OLMo on their hardware using the published code, data, and recipe [2]. If a non-profit and a chip vendor (training for marketing purposes) can do this, it clearly doesn't require "burning 10 years of cash flow" or a Google-scale TPU farm.

[0]: https://allenai.org/blog/molmo2
turtlesdown11 11 hours ago

No, of course the training costs aren't that high. Ten years of Apple's future free cash flow comes to more than a trillion dollars (they generate over $100B per year). Obviously, the training costs are trivial compared to that figure.
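A quick back-of-envelope check of that comparison (the $100B/year figure is from the comment above; the per-run training cost is an assumed illustrative figure, not a reported number):

    # Apple free cash flow vs. a frontier training run (rough sketch)
    fcf_per_year = 100e9             # USD/year, from the comment above
    years = 10
    total_fcf = fcf_per_year * years # ~$1T over a decade

    training_cost = 1e9              # USD, assumed upper-end single run
    print(f"Run as share of 10y FCF: {training_cost / total_fcf:.4%}")
    # -> 0.1000%, i.e. trivial next to "10 years of cash flow"

Even with a generous $1B-per-run assumption, a single training run is on the order of a tenth of a percent of that cash flow.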
| ||||||||||||||||||||||||||||||||||||||||||||||||||
lostmsu 9 hours ago

No, it doesn't beat Gemini in any benchmark. It beats Gemma, which isn't SoTA even among open models of that size; that would be Nemotron 3 or GPT-OSS 20B.
PunchyHamster 6 hours ago

My prediction is that they might switch once the AI craze simmers down to a more reasonable level.