embedding-shape 3 hours ago
> Framework is ready. Now we need someone to actually train the model.

If Microslop aren't gonna train the model themselves to prove their own thesis, why would others? They've had 2 years (I think?) to prove BitNet in at least some way. Are you really saying they haven't tried so far? Personally, that makes it slightly worrisome to just take what they say at face value. Why wouldn't they train and publish a model themselves if this actually led to worthwhile results?
throwaw12 2 hours ago
Because this is Microsoft: experimenting and failing is not encouraged, while taking less risky bets and getting promoted is. Also, no customer asked them for a 1-bit model, so the PM didn't prioritize it. But that doesn't mean the idea is worthless. You could have said the same about Transformers: Google released it but didn't move forward, and it turned out to be a great idea.
GorbachevyChase 2 hours ago
The most benign answer would be that they don’t want to further support an emerging competitor to OpenAI, which they have significant business ties to. I think the more likely answer, which you hinted at, is that the utility of the model falls apart as scale increases. They see the approach as a dead end, so they are throwing the scraps out to the stray dogs.
observationist an hour ago
So is it finally time for a Beowulf cluster to do something amazing?
gregman1 2 hours ago
Cannot agree more!