harel, 10 hours ago:
Actually yes. For example, I run local models for ingested documents, summaries, etc. The local models are fine, and there is no need for me to pay for tokens; performance is adequate for that purpose as well. There are many other cases where I run at scale, time is flexible so things can move more slowly, and I'd rather keep it all in-house. I'm not even getting into areas where data cannot leave the premises for legal reasons. Right now I'm mostly limited by GPUs. But if that world of local models on Apple silicon is so "good", there is room to expand it to other fruits...
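For anyone curious what that kind of local summarization pipeline can look like, here's a rough sketch using the ollama Python client against a locally pulled model. The model name, file path, and the system prompt are just placeholders for illustration, not my actual setup:

    import ollama  # assumes a running local Ollama server and the ollama Python package

    def summarize(text: str, model: str = "llama3.1") -> str:
        # Ask the local model for a short summary of one ingested document.
        response = ollama.chat(
            model=model,
            messages=[
                {"role": "system", "content": "Summarize the following document in a few sentences."},
                {"role": "user", "content": text},
            ],
        )
        return response["message"]["content"]

    if __name__ == "__main__":
        # Hypothetical path to one ingested document.
        with open("ingested/report.txt") as f:
            print(summarize(f.read()))

Nothing fancy, but for batch summaries where latency doesn't matter, something like this running overnight on local hardware covers the use case without any tokens leaving the building.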