mzl 4 hours ago
While it is possible to self-host small models, it is not easy to host them at high throughput. Many small-model use cases involve large batches of work (processing large volumes of documents, agentic workflows, ...), and in those cases using a provider with high tps numbers makes sense. Still, I agree that self-hosting is probably part of the decrease.