YetAnotherNick 4 hours ago

It depends on whether you are using tensor parallelism or pipeline parallelism. In the second case, you don't need any sharing.
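
To make the distinction concrete, here is a rough NumPy sketch. It assumes "sharing" means devices exchanging partial results inside a single layer; that reading is my assumption, not something stated above. The "devices" are simulated with array shards, and the function names (`tensor_parallel_layer`, `pipeline_parallel`) are made up for illustration. In the tensor-parallel case every layer ends with a sum across shards (a stand-in for an all-reduce), while in the pipeline case each stage owns whole layers and only hands its output to the next stage.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))   # a small batch of activations
w1 = rng.standard_normal((8, 8))  # weights of layer 1
w2 = rng.standard_normal((8, 8))  # weights of layer 2


def tensor_parallel_layer(x, w, n_devices=2):
    """One layer split across simulated devices along the contraction axis."""
    x_shards = np.split(x, n_devices, axis=1)  # each device sees a slice of the input columns
    w_shards = np.split(w, n_devices, axis=0)  # and the matching slice of the weight rows
    partials = [xs @ ws for xs, ws in zip(x_shards, w_shards)]
    # Every device only has a partial sum, so the results must be combined.
    # On real hardware this step would be an all-reduce between devices.
    return sum(partials)


def pipeline_parallel(x, stages):
    """Each simulated stage holds whole layers and forwards its output."""
    for w in stages:
        x = x @ w  # no cross-device reduction inside a layer
    return x


reference = x @ w1 @ w2
tp_out = tensor_parallel_layer(tensor_parallel_layer(x, w1), w2)
pp_out = pipeline_parallel(x, [w1, w2])
```

Both variants compute the same result as the unsplit model. The difference is where communication happens: inside every layer for the tensor-parallel version, and only at stage boundaries (passing activations along) for the pipeline version.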