▲ | themanmaran 10 days ago
Yes, I'll add that to the writeup! You're right, I initially excluded it because it was really dependent on the providers, so there was lots of variance, especially with the Qwen models. High-level results were:
- Qwen 32b => $0.33/1000 pages => 53s/page
- Qwen 72b => $0.71/1000 pages => 51s/page
- Llama 90b => $8.50/1000 pages => 44s/page
- Llama 11b => $0.21/1000 pages => 08s/page
- Gemma 27b => $0.25/1000 pages => 22s/page
- Mistral => $1.00/1000 pages => 03s/page
▲ | dylan604 10 days ago
One of these things is not like the others. $8.50/1000?? Any chance that's a typo? Otherwise, for someone who has no experience with LLM pricing models, why is Llama 90b so expensive?
▲ | esafak 10 days ago
A 2d plot would be great |
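
Until there is a plot, here is a rough sketch of what the two axes would show, using only the figures quoted upthread. It picks out the models that no other model beats on both cost and latency, and computes how far the Llama 90b price sits from the cheapest entry. The names `RESULTS`, `pareto_front`, and `cost_ratio` are made up for this example, and reading "08s" and "03s" as 8 and 3 seconds is an assumption.

```python
# Figures copied from the thread: (USD per 1000 pages, seconds per page).
# Assumption: "08s" and "03s" in the original comment mean 8 and 3 seconds.
RESULTS = {
    "Qwen 32b": (0.33, 53),
    "Qwen 72b": (0.71, 51),
    "Llama 90b": (8.50, 44),
    "Llama 11b": (0.21, 8),
    "Gemma 27b": (0.25, 22),
    "Mistral": (1.00, 3),
}


def pareto_front(results):
    """Return the models that no other model matches or beats on both axes."""
    front = []
    for name, (cost, secs) in results.items():
        dominated = any(
            c <= cost and s <= secs and (c, s) != (cost, secs)
            for other, (c, s) in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


def cost_ratio(results, name, baseline):
    """How many times more expensive `name` is than `baseline` per 1000 pages."""
    return results[name][0] / results[baseline][0]


if __name__ == "__main__":
    print("Not dominated on cost and latency:", pareto_front(RESULTS))
    ratio = cost_ratio(RESULTS, "Llama 90b", "Llama 11b")
    print(f"Llama 90b vs Llama 11b cost: {ratio:.1f}x")
```

On these numbers only Llama 11b and Mistral come out as not dominated, and Llama 90b is roughly 40x the per-page cost of Llama 11b. Given the provider variance mentioned upthread, treat this as a snapshot, not a ranking.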