materielle 2 hours ago
I'm about to leave a shallow comment, but I am a bit skeptical of the supposed drop in inference costs. If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop? So the fact that the publicly available information is conflicting is probably a sign that, at the very least, the numbers aren't amazing. Yes, I know there's no evidence and this is lazy reasoning. But there's probably a bit of truth to this line of thought.
Tuna-Fish 2 hours ago
Why on earth would AI labs be bragging about how little the product they sell actually costs them to make? You don't want to do anything that reduces its perceived value to the user, because that might make them less willing to pay for it.

Also, inference costs are bound to go way down with more optimized architectures. GPUs are fundamentally not great at inference. No platform where the weights are streamed from a large pool of memory is. If model development ever settles down, there will be massive step changes in cost/token, energy/token and tokens/second, as models are etched into silicon à la https://chatjimmy.ai/
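
To put rough numbers on the weight-streaming point, here is a minimal back-of-the-envelope sketch. It assumes a dense model whose weights are all read from memory once per decode step, and it ignores KV-cache traffic and compute time, so it only gives an upper bound. The function name and every figure in it are hypothetical illustrations, not measurements of any real chip or model.

```python
def max_decode_tokens_per_second(param_count, bytes_per_param,
                                 mem_bandwidth_bytes_per_s, batch_size=1):
    """Upper bound on decode throughput when each step streams all weights once.

    Assumptions (illustrative only): dense model, no KV-cache traffic,
    compute is never the bottleneck.
    """
    weight_bytes = param_count * bytes_per_param
    # Each decode step has to move every weight byte across the memory bus.
    steps_per_second = mem_bandwidth_bytes_per_s / weight_bytes
    # One step yields one token per sequence in the batch.
    return steps_per_second * batch_size


if __name__ == "__main__":
    # Hypothetical: 70e9 parameters at 2 bytes each, 2e12 bytes/s of bandwidth.
    single = max_decode_tokens_per_second(70e9, 2, 2e12)
    batched = max_decode_tokens_per_second(70e9, 2, 2e12, batch_size=32)
    print(f"batch 1:  {single:.1f} tokens/s upper bound")
    print(f"batch 32: {batched:.1f} tokens/s upper bound")
```

Under these assumptions the bound for a single stream depends only on bandwidth divided by model size, which is why batching (or keeping the weights next to the compute) is where the step changes would have to come from.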
whatshisface 2 hours ago
Inference has traditionally been far less expensive than training. One public example is the fact that hobbyists can run StableDiffusion ($600k training costs[1]) on their personal computers. Speaking to your point, inference being dramatically less costly than training would not be seen as a delta from the norm. Providing inference at anything near operational cost (as a utility would) would be the delta from the norm, if it were true.