wolttam an hour ago
> inspired to write this post by antirez’s recent project DwarfStar 4, which is a version of llama.cpp that’s been stripped down to run only DeepSeek-V4-Flash

This is not true; it is its own project. Indebted to llama.cpp, sure, but not a stripped-down version.
antirez an hour ago | parent | next [-]
Yep, the code overlap is minimal: a few kernels, plus some quantization code for the quantizer it implements. DwarfStar 4 is not a fork of llama.cpp, but without llama.cpp the project would be a lot more lacking, since I was able to check every detail that mattered in seconds. But it is not a stripped-down llama.cpp. This does not in any way diminish how much llama.cpp matters, not just for this project but for all the projects that followed and are following. It's not just a matter of code: it's the path to follow, the quant formats, the lessons, the optimized kernels you can study to learn the patterns.
embedding-shape an hour ago | parent | prev | next [-]
The truth seems to sit somewhere in between: DwarfStar 4 seems to exist mainly because of llama.cpp, and the authors were clearly very inspired by llama.cpp's code, in some places even copying pieces from it literally, all with proper attribution. I'm not saying this is bad; it seems fine to me:

> ds4.c does not link against GGML, but it exists thanks to the path opened by the llama.cpp project and the kernels, quantization formats, GGUF ecosystem, and hard-won engineering knowledge developed there. We are thankful and indebted to llama.cpp and its contributors. Their implementation, kernels, tests, and design choices were an essential reference while building this DeepSeek V4 Flash-specific inference path. Some source-level pieces are retained or adapted here under the MIT license: GGUF quant layouts and tables, CPU quant/dot logic, and certain kernels. For this reason, and because we are genuinely grateful, we keep the GGML authors copyright notice in our LICENSE file.

- https://github.com/antirez/ds4#acknowledgements-to-llamacpp-...

Been a lot of fun to play around with it since https://news.ycombinator.com/item?id=48142885 (~2 days ago); managed to take generation from 47.85 t/s to 57.07 t/s so far :)
an hour ago | parent | prev [-]
[deleted]