| ▲ | pants2 4 hours ago | |
Not similar. DeepInfra[1] has DS4 Pro pricing at $1.30/$2.60, which is 3X the DeepSeek[2] (Chinese) hosting at $0.435/$0.87. DeepInfra is also very slow at 37 t/s and uses an FP4 quant[3], so intelligence will be slightly degraded. Meanwhile, you could use Grok 4.3 for the same price, which is smarter and 5X faster[4]. 1. https://deepinfra.com/pricing 2. https://api-docs.deepseek.com/quick_start/pricing 3. https://artificialanalysis.ai/models/deepseek-v4-pro/provide... | ||
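For anyone who wants to check the "3X" figure, here is a quick sketch using only the prices quoted above. The per-million-token unit and the input/output ordering of each pair are assumptions (the comment does not state them), and `price_ratios` is just a hypothetical helper name. The ratio itself does not depend on the unit.

```python
# Prices as quoted in the comment. Assumed to be USD per million tokens,
# ordered (input, output); the comment does not state the unit.
deepinfra = (1.30, 2.60)
deepseek = (0.435, 0.87)

def price_ratios(a, b):
    """Return (input_ratio, output_ratio) of provider a over provider b."""
    return tuple(x / y for x, y in zip(a, b))

ratios = price_ratios(deepinfra, deepseek)
print([round(r, 2) for r in ratios])
```

Both ratios come out just under 3, so "3X" is a fair rounding.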
| ▲ | wirybeige 2 hours ago | parent [-] | |
DS4 Pro/Flash were post-trained with QAT, so they are already quantized to FP4 for the most part. That's why the downloaded weights are much smaller than they would be at FP8 or FP16. For example, Flash is a 284B model, but its weights are only ~160GB. Of course, maybe DeepInfra went even further, but there is no proof of that. | ||
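A rough back-of-the-envelope check of the size argument, as a sketch only. It assumes 1 GB = 10^9 bytes and ignores quantization scale factors, metadata, and any layers kept at higher precision; `weights_gb` is a hypothetical helper, and the 284B and ~160GB figures are the ones quoted above.

```python
PARAMS = 284e9        # Flash parameter count, as stated in the comment
DOWNLOAD_GB = 160     # approximate download size, as stated in the comment

def weights_gb(params, bits_per_param):
    """Raw weight size in GB (10^9 bytes) at a uniform bit width."""
    return params * bits_per_param / 8 / 1e9

for name, bits in [("FP4", 4), ("FP8", 8), ("FP16", 16)]:
    print(f"{name}: {weights_gb(PARAMS, bits):.0f} GB")

# Average bits per parameter implied by the quoted download size.
effective_bits = DOWNLOAD_GB * 1e9 * 8 / PARAMS
print(f"effective: {effective_bits:.2f} bits/param")
```

Under these assumptions, uniform FP4 would be about 142 GB and FP8 about 284 GB, so a ~160GB download works out to roughly 4.5 bits per parameter. That fits "FP4 for the most part" with some overhead or higher-precision tensors, and is well below what FP8 would require.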