suprjami 5 days ago

Depending on what you want to do, you already can.

llama.cpp and other inference servers work fine on the kernel driver.
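For reference, a minimal sketch of what running on the kernel driver can look like in practice: building llama.cpp with its Vulkan backend, which sits on top of the upstream kernel driver and Mesa rather than a vendor compute runtime. The model path here is a placeholder, and `-ngl 99` simply asks to offload all layers to the GPU:

```shell
# Sketch: build llama.cpp with the Vulkan backend and serve a model.
# Assumes a working Vulkan driver (e.g. Mesa RADV on the in-kernel amdgpu driver).
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j

# model.gguf is a placeholder path; -ngl 99 offloads all layers to the GPU.
./build/bin/llama-server -m model.gguf -ngl 99 --port 8080
```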

yencabulator 5 days ago | parent [-]

Where "fine" unfortunately still means "don't push it too hard on a busy desktop system, or your graphical session might crash". Make sure to keep enough RAM free, or you start seeing GPU resets; the stack can't cope with transient errors :-(