JPLeRouzic 9 hours ago
Has anyone started to implement this technique in Llama.cpp or a similar inference tool?
dnhkng 9 hours ago
There was some work done on this a while back, during the FrankenMerge craze of '23. I am working with TurboDerp to integrate this into the Exllama v3 format.