ipsum2 4 days ago
GGUF is easy to implement, but you'd probably get better performance on mobile with tflite, thanks to its custom XNNPACK kernels. Performance is pretty critical on low-power devices.
HenryNdubuaku 4 days ago (reply to ipsum2)
We are writing our own backend. When we tested, tflite (now called LiteRT) was not faster than GGML, and GGML is already well supported. But we are moving away completely anyway.