If you don’t care about output quality, why not quantize to 1-bit and use only 1 active expert? It will be completely useless, but it will be faster!