| ▲ | aspenmartin 3 hours ago | |
Batching helps with efficiency, but you can’t fit Opus into anything less than hundreds of thousands of dollars of equipment. Local models are more than a useful middle ground; they are essential and will never go away. I was just addressing the OP’s question about why he observed the difference he did: one is an API call to the world’s most advanced compute infrastructure, and the other is running on a $500 CPU. There are lots of uses for small, medium, and large models; they all have important places! | ||