Remix.run Logo
joaogui1 10 hours ago

3.6 is model number, 35B is total number of parameters, A3B means that only 3B parameters are activated, which has some implications for serving (either in you you shard the model, or you can keep the total params on RAM and only road to VRAM what you need to compute the current token, which will make it slower, but at least it runs)