| ▲ | stingraycharles a day ago |
| I don’t think it will ever make sense; you can buy so much cloud-based usage for this kind of money. From my perspective, the biggest problem is that I’m just not going to be using it 24/7, which means I’m not getting nearly as much value out of the hardware as the cloud vendors get out of theirs. Last but not least, if I want to run queries against open-source models, I prefer a provider like Groq or Cerebras, as it’s extremely convenient to get query results nearly instantly. |
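(A hedged back-of-envelope on "so much cloud-based usage": the per-token price below is an assumed blended rate, not a quote, and the $19,000 figure is the Apple Store hardware price mentioned downthread.)

    # Back-of-envelope: how many hosted tokens the hardware budget buys.
    # Both figures are assumptions for illustration, not real quotes.
    hardware_cost_usd = 19_000      # ~price of the maxed-out machine (downthread)
    usd_per_million_tokens = 3.0    # assumed blended rate for a hosted open model
    tokens = hardware_cost_usd / usd_per_million_tokens * 1_000_000
    print(f"~{tokens / 1e9:.1f}B tokens of hosted usage")  # ~6.3B tokens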
|
| ▲ | websiteapi a day ago | parent | next [-] |
| my issue is that once it's in your workflow, you get pretty latency sensitive. imagine those record-it-all apps working well: eventually you'd become pretty reliant on them. I don't necessarily want to be at the whims of the cloud |
| |
| ▲ | stingraycharles a day ago | parent [-] | | Aren’t those “record it all” applications implemented with RAG, with snippets injected into the context based on embedding similarity? Obviously you’re not going to inject everything into the context window every time. |
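(For context, a minimal sketch of the retrieval step such an app would plausibly use; `embed` is a toy stand-in for a real embedding model, and all names here are illustrative, not from any particular product.)

    import numpy as np

    def embed(text: str) -> np.ndarray:
        # Toy stand-in: real systems call an embedding model here. Hashing
        # character trigrams into a fixed-size vector just makes this runnable.
        vec = np.zeros(256)
        for i in range(len(text) - 2):
            vec[hash(text[i:i + 3]) % 256] += 1.0
        norm = np.linalg.norm(vec)
        return vec / norm if norm else vec

    def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
        # Cosine similarity between the query and every stored chunk; only
        # the top-k chunks get injected into the context window, never the
        # whole recording log.
        q = embed(query)
        return sorted(chunks, key=lambda c: -float(embed(c) @ q))[:k]

    log = ["meeting: decided to ship Friday",
           "lunch order: two burritos",
           "bug triage: login timeout on mobile"]
    print(top_k("when are we shipping?", log, k=1))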
|
|
| ▲ | lordswork a day ago | parent | prev | next [-] |
| As long as you're willing to wait up to an hour for your GPU to get scheduled when you do want to use it. |
| |
| ▲ | stingraycharles a day ago | parent [-] | | I don’t understand what you’re saying. What’s preventing you from using e.g. OpenRouter to run a query against Kimi-K2 via whatever provider? | | |
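(Concretely, a minimal sketch of such a query. OpenRouter exposes an OpenAI-compatible API; the "moonshotai/kimi-k2" model slug is an assumption worth checking against openrouter.ai/models.)

    import os
    import requests

    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "moonshotai/kimi-k2",  # assumed slug; verify on openrouter.ai
            "messages": [{"role": "user", "content": "Hello, Kimi"}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])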
| ▲ | hu3 a day ago | parent | next [-] | | and you'll get faster responses this way | |
| ▲ | bgwalter a day ago | parent | prev [-] | | Because you have Cloudflare (MITM 1), OpenRouter (MITM 2), and finally the "AI" provider, all of whom can read, store, analyze, and resell your queries. EDIT: Thanks for downvoting what is literally one of the most important reasons for people to use local models. Denying and censoring reality does not prevent the bubble from bursting. | | |
| ▲ | irthomasthomas a day ago | parent [-] | | you can use chutes.ai's TEE (Trusted Execution Environment) offering, and Kimi K2 is running there at about 100 tokens/s right now |
|
|
|
|
| ▲ | givinguflac a day ago | parent | prev [-] |
| I think you’re missing the whole point, which is avoiding cloud compute entirely. |
| |
| ▲ | stingraycharles a day ago | parent [-] | | Because of privacy reasons? Yeah, I’m not going to spend a small fortune just to be able to use these types of models privately. | | |
| ▲ | givinguflac a day ago | parent [-] | | There are plenty of examples and reasons to do so besides privacy: because one can, because it’s cool, for research, for fine-tuning, etc. I never mentioned privacy. Your use case is not everyone’s. | | |
| ▲ | wyre a day ago | parent [-] | | You can still do all of those things by renting AI server compute, though? I think privacy and the cool factor are the only real reasons it would be rational for someone to spend (*checks the Apple Store*) $19,000 on computer hardware... | | |
| ▲ | givinguflac 8 hours ago | parent [-] | | Why do you look at this as a consumer? Have you never heard of businesses spending money on hardware??? |
|
|
|
|