cyanydeez 3 hours ago

Not at the VRAM sizes that control how much context you can load; also, GPUs aren't as efficient as direct inference.

wmf 25 minutes ago

OK, B70.