Catloafdev 7 hours ago
Nobody runs unquantized; there's literally no reason to. Q8 would be the largest anyone actually runs on consumer hardware for inference.
bityard 3 hours ago
Halving the precision of the weights is not a free lunch...
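
To make the "not a free lunch" point concrete, here is a rough sketch of the rounding error an 8-bit round trip introduces. It assumes a simple symmetric per-tensor scheme and made-up Gaussian weights; real Q8 formats differ in details, and `quantize_int8` and `dequantize` are hypothetical helper names, not any library's API.

```python
import random

def quantize_int8(weights):
    """Symmetric per-tensor quantization to signed 8-bit integers (illustrative only)."""
    # One scale for the whole tensor: the largest magnitude maps to 127.
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the 8-bit integers back to floats."""
    return [v * scale for v in q]

# Hypothetical weights: small Gaussian values, chosen only for the demo.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight is off by at most half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.6g}  max_abs_error={max_err:.6g}")
```

The error per weight is bounded by half the step size, so it is small but never zero. Whether that matters for output quality depends on the model and the task, which this sketch does not measure.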