iMil 3 hours ago

Good call. I really hesitated between the X570 and the X99. Are you using P2P?
cybertim 2 hours ago

  $ nvidia-smi topo -p2p r
        GPU0  GPU1
  GPU0  X     CNS
  GPU1  CNS   X

I guess not (CNS means "chipset not supported"). I use llama.cpp with:

  --spec-draft-n-max 3 --spec-type draft-mtp --split-mode tensor --tensor-split 1,1

and my generation speed is between 60 and 80 tk/s. I will test this uncensored model, and with ngram added as well, this weekend.

Btw, I also set my power limit to 220 W per card (with nvidia-smi). That will cost you around 1 tk/s but save you a LOT of power and heat :)
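
For anyone wanting to try the power cap, here is a minimal sketch of how it might be applied to both cards. `set_power_limit` is a hypothetical helper name, the GPU indices 0 and 1 are an assumption based on the two-card topology output above, and `DRY_RUN` is just an illustrative switch that prints the commands instead of running them. `-i` and `-pl` are the nvidia-smi options for selecting a GPU and setting its power limit in watts.

```shell
# Hypothetical helper: cap each listed GPU at the same power limit.
# Usage: set_power_limit <watts> <gpu-index>...
set_power_limit() {
    limit_w="$1"
    shift
    for gpu in "$@"; do
        # With DRY_RUN set, the command is echoed instead of executed.
        # Changing the power limit normally requires root, hence sudo.
        ${DRY_RUN:+echo} sudo nvidia-smi -i "$gpu" -pl "$limit_w"
    done
}
```

Calling `DRY_RUN=1 set_power_limit 220 0 1` only prints the two commands, while `set_power_limit 220 0 1` runs them. As far as I know the limit does not survive a reboot, so it would need to be reapplied at startup.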