Philpax 2 hours ago
Single-file deployment was an intentional design goal on my part. While most image models were (and are) single-file, LLMs distributed as safetensors (at least at the time) were not, and I wanted to enforce that at a structural level. I also didn't want to mandate a JSON reader for executors (e.g. llama.cpp), which the safetensors (ST) approach would have required. The bigger issue at the time, if I recall, was that ST couldn't support the new and upcoming quants that GGML had, and having our own file format gave us flexibility that ST couldn't.