mogili1 9 hours ago
A rate limit is essentially a token limit.
ibejoeb 8 hours ago
It depends on how it's implemented. If it's a fixed window, then your absolute ceiling is the tokens allowed per window times the number of windows in a month. If it's a function of other usage, like a timeshare, you're still paying some price for a month and you get what you get, without paying more per token. There's an intrinsic limit anyway, based on how many tokens the model can process on that GPU in a month, even if you're the only user.
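To put rough numbers on the two ceilings described above, here is a minimal sketch. Every figure in it (1M tokens per window, 5-hour windows, 100 tokens/s, a 30-day month) is a made-up assumption for illustration, not any provider's actual limit, and the function names are hypothetical.

```python
def fixed_window_ceiling(tokens_per_window: int, window_hours: int,
                         days_in_month: int = 30) -> int:
    """Most tokens you can use in a month if every fixed window is maxed out."""
    # Only complete windows are counted (floor division).
    windows_per_month = (days_in_month * 24) // window_hours
    return tokens_per_window * windows_per_month


def hardware_ceiling(tokens_per_second: int, days_in_month: int = 30) -> int:
    """Time x capacity: tokens one GPU could process running nonstop all month."""
    seconds_per_month = days_in_month * 24 * 60 * 60
    return tokens_per_second * seconds_per_month


# Hypothetical plan: 1,000,000 tokens per 5-hour window.
plan_limit = fixed_window_ceiling(tokens_per_window=1_000_000, window_hours=5)

# Hypothetical hardware: a sustained 100 tokens/s with a single user.
gpu_limit = hardware_ceiling(tokens_per_second=100)

# The effective monthly cap is whichever ceiling is lower.
effective_limit = min(plan_limit, gpu_limit)

print(plan_limit)       # 144000000
print(gpu_limit)        # 259200000
print(effective_limit)  # 144000000
```

With these assumed numbers the plan's window limit binds before the hardware does; with a more generous window the GPU's throughput would become the cap instead.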
delusional 5 hours ago
Time × capacity is also a limit. There's always a limit.