zmmmmm 5 hours ago

In the end there's not much point in having more memory than you can compute on in a reasonable time. So I think the useful amount probably tops out in the 128GB range, where you can still run a 70B model and get a useful token rate out of it.
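To put rough numbers on that, here is a back-of-envelope sketch in Python. It assumes a dense model whose decoding is memory-bandwidth bound, so every generated token has to stream all the weights once, and it ignores KV cache and compute limits. The function names, the 400 GB/s bandwidth figure, and the bit widths are illustrative assumptions, not measurements of any particular machine.

```python
def model_bytes(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in bytes (weights only, no KV cache)."""
    return params_billion * 1e9 * bits_per_weight / 8


def decode_tokens_per_sec(params_billion: float,
                          bits_per_weight: float,
                          bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed if each token reads every weight once."""
    return bandwidth_gb_s * 1e9 / model_bytes(params_billion, bits_per_weight)


if __name__ == "__main__":
    BANDWIDTH_GB_S = 400.0  # assumed memory bandwidth, purely illustrative

    # (parameters in billions, bits per weight) -- hypothetical configurations
    for params_b, bits in [(70, 8), (70, 4), (400, 4)]:
        size_gb = model_bytes(params_b, bits) / 1e9
        rate = decode_tokens_per_sec(params_b, bits, BANDWIDTH_GB_S)
        print(f"{params_b}B @ {bits}-bit: {size_gb:.0f} GB, <= {rate:.1f} tok/s")
```

Under these assumed numbers, a 70B model at 8 bits is about 70 GB of weights and tops out near 5.7 tokens/s, while a model big enough to need 200 GB would be capped around 2 tokens/s at the same bandwidth. That is the sense in which extra memory stops paying off.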