suprjami 5 days ago:
Depending on what you want to do, you already can. llama.cpp and other inference servers work fine on the kernel driver.
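For what it's worth, here is a minimal sketch of what that looks like through the llama-cpp-python bindings (the Python wrapper around llama.cpp). This assumes a build with a GPU backend such as Vulkan or ROCm/HIP sitting on the in-kernel driver; the model path and settings below are just placeholders.

```python
# Minimal llama.cpp inference sketch via the llama-cpp-python bindings.
# Assumes llama.cpp was compiled with a GPU backend (e.g. Vulkan or ROCm/HIP)
# running on top of the in-kernel driver; the model path is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload every layer to the GPU
    n_ctx=4096,       # context window size
)

out = llm("Explain what a GPU reset is in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```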
yencabulator 5 days ago:
Where "fine" unfortunately still means "don't push it too hard on a busy desktop system or your graphical session might crash". Make sure to keep enough RAM free or you start seeing GPU resets, the stack can't cope with transient errors :-( |