wmf 2 hours ago

That just sounds like a 3090.

cyanydeez 15 minutes ago

Not at the VRAM sizes that determine how much context you can load; also, GPUs aren't as efficient as direct inference.