EnPissant 2 days ago
I don't mean to be a jerk, but 2-bit quant, reducing experts from 10 to 4, who knows if the test is running long enough for the SSD to thermal throttle, and still only getting 5.5 tokens/s does not sound useful to me.

simonw 2 days ago
It's a lot more useful than being entirely unable to try out the model.