pokstad 9 hours ago
This problem sounds like an excellent opportunity. We need a race to the bottom for hosting LLMs to democratize the tech and lower costs. I cheer on anyone who figures this out.
mememememememo an hour ago
This is classic queuing theory, rate limits, etc. I don't have an answer, but I would look there.