dofm 10 hours ago
Well, many of us who shared hardware also ran monitoring to make sure the share was fair; there used to be a whole industry for that sort of quota stuff. You can presumably hard-limit LLMs the same way: total quotas, burst quotas, etc. (Suddenly getting a very fun flashback to the environment in which someone first explained Markov chains to me: MediaMOO, a text-based chat environment with configurable limits on the number of CPU "ticks" you were allowed to spend doing things.)
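
For what it's worth, here is a rough sketch of what "total plus burst" limiting could look like for LLM usage. It is only an illustration under my own assumptions: the `TokenQuota` class, its field names, and the idea of charging per LLM token are all hypothetical, not how any particular provider or MediaMOO actually did it. The burst part is a plain token bucket; the total part is a running counter with a hard cap.

```python
from dataclasses import dataclass, field


@dataclass
class TokenQuota:
    """Hypothetical per-user limiter: a hard lifetime total plus a refilling burst bucket."""

    total_limit: int          # hard cap on tokens ever spent (assumed policy)
    burst_capacity: int       # most tokens that can be spent in one burst
    refill_per_second: float  # rate at which the burst bucket refills
    spent: int = 0            # running total charged so far
    last_seen: float = 0.0    # timestamp of the previous request, in seconds
    bucket: float = field(init=False)

    def __post_init__(self) -> None:
        # Start with a full burst bucket.
        self.bucket = float(self.burst_capacity)

    def try_spend(self, tokens: int, now: float) -> bool:
        """Charge `tokens` at time `now`; return False (charging nothing) if over quota."""
        # Refill the bucket for the time elapsed, never beyond its capacity.
        elapsed = max(0.0, now - self.last_seen)
        self.bucket = min(
            float(self.burst_capacity),
            self.bucket + elapsed * self.refill_per_second,
        )
        self.last_seen = now

        # Reject if either the burst limit or the total limit would be exceeded.
        if tokens > self.bucket or self.spent + tokens > self.total_limit:
            return False

        self.bucket -= tokens
        self.spent += tokens
        return True


# Example: 1000 tokens in total, bursts of up to 100, refilling 10 per second.
quota = TokenQuota(total_limit=1000, burst_capacity=100, refill_per_second=10)
first = quota.try_spend(80, now=0.0)   # fits in the full bucket
second = quota.try_spend(50, now=1.0)  # bucket has only refilled to 30
third = quota.try_spend(50, now=4.0)   # bucket has refilled to 60 by now
```

Time is passed in explicitly rather than read from a clock, which keeps the sketch easy to test; a real limiter would presumably use a monotonic clock and persist the counters somewhere.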