bfung 3 days ago

Yep, the Attention mechanism in the Transformer arch is pretty good.
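For reference, the core of that mechanism is scaled dot-product attention from "Attention Is All You Need". Here's a minimal NumPy sketch (single head, no masking or learned projections; the shapes are illustrative, not from any particular model):

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)               # (seq_q, seq_k) similarities
        scores -= scores.max(axis=-1, keepdims=True)  # for numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
        return weights @ V                            # weighted sum of values

    # Toy example: 3 query positions, 4 key/value positions, d_k = 8
    rng = np.random.default_rng(0)
    Q = rng.normal(size=(3, 8))
    K = rng.normal(size=(4, 8))
    V = rng.normal(size=(4, 8))
    print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)

The trick is that every position gets to pull in information from every other position in one shot, weighted by relevance, which is what made the Transformer such a jump over recurrent architectures.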

Probably need another breakthrough of similar magnitude in model engineering before this class of neural network gets a step-function improvement.

Moar data ain't gonna help. The human brain is the proof: it doesn't need the internet's worth of data to become good (nor all that much energy).