genpfault 6 hours ago
Nice! Getting ~39 tok/s @ ~60% GPU util (~170W out of 303W per nvtop). System info:
llama.cpp command-line:
halcyonblue 5 hours ago
What am I missing here? I thought this model needs 46GB of unified memory for a 4-bit quant. The Radeon RX 7900 XTX has 24GB of memory, right? Hoping to get some insight, thanks in advance!
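For what it's worth, here is a rough back-of-envelope for the numbers in the question. It is only a sketch under assumptions: that the 46GB figure is the total weight size, that all layers are roughly the same size, and that the runtime keeps whatever does not fit in system RAM (llama.cpp can offload only part of the layers to the GPU via `-ngl` / `--n-gpu-layers`). The layer count and the reserve for KV cache and buffers are hypothetical placeholders, not values from this thread.

```python
def vram_fit_fraction(model_gb: float, vram_gb: float, reserve_gb: float = 0.0) -> float:
    """Fraction of the model weights that fits in VRAM.

    reserve_gb is VRAM set aside for KV cache, compute buffers, etc.
    (a hypothetical allowance, tune it for your setup).
    """
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(usable_gb / model_gb, 1.0)


def layers_on_gpu(n_layers: int, model_gb: float, vram_gb: float,
                  reserve_gb: float = 0.0) -> int:
    """Estimate how many layers could be offloaded, assuming equal-sized layers."""
    return int(n_layers * vram_fit_fraction(model_gb, vram_gb, reserve_gb))


if __name__ == "__main__":
    # 46 GB and 24 GB come from the question above; 48 layers and a 2 GB
    # reserve are made-up values for illustration only.
    print(f"fraction that fits: {vram_fit_fraction(46, 24):.2f}")
    print(f"layers on GPU: {layers_on_gpu(48, 46, 24, reserve_gb=2)}")
```

Under those assumptions only about half of the weights would sit in VRAM, with the rest served from system RAM. Whether that matches the setup in the parent comment can't be told from the thread.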