xpct 5 hours ago

That would depend on what gets leaked, as I'm not so sure that the weights by themselves would be enough to replicate the architecture. I imagine some part of the secret sauce will remain in the architecture, and the tensor dimensions may not be enough to decode it.

I'm sure that if proprietary models continue to be a big thing, the way they are stored and loaded onto hardware will be obfuscated quite a bit.

anonzzzies 4 hours ago

But you can see this is not true (yet): competitors and Chinese labs are less than 6 months behind, either via leaks or by just stumbling on the same improvements with time and effort.