Aurornis 5 hours ago
Good points. The easy experimentation factor is helpful for development, though I would gently encourage everyone to migrate to the first-party APIs for pricing at scale. OpenRouter is also a good place to find free LLM access, with a catch: you should expect that any inputs and outputs are going into someone's training database. Clearly anyone who can pay should be using paid models with privacy protections, but the free models have been great for learning and experimenting, especially for younger people learning API programming and LLMs who may not have access to a credit card or funds.
derefr 4 hours ago
> You should expect that any inputs and outputs are going into someone's training database.

True enough, in theory; but what exactly do you imagine would be a useful enough signal in the OpenRouter request+response stream that any company would want that data as training material?

Even a single OpenRouter-API-key-identified subscriber's traffic may consist of a mixture of traffic from multiple different sessions, under potentially multiple different end users. (If the subscriber is doing security correctly, their OpenRouter key lives on a gateway rather than in a frontend app, so the only IP address, UA, etc. that OpenRouter sees is that of the gateway itself.)

The traffic stream may also invoke multiple models and provide multiple different system prompts for those models. Although these are marked in the traffic (i.e. conveyed as part of each request), they make the resulting data much less useful in aggregate than if it were all training data for one model with one system prompt.

Plus, there are no RLHF signals in OpenRouter data. Even if OpenRouter wanted to build a general model-neutral framework for collecting RLHF-type data, it can't force subscriber apps to do the UI-level work necessary to collect it (i.e. the things ChatGPT/Claude do, with "thumbs-down" buttons, A/B tested responses, etc.). Analysis would have to rely on pure transcript-level user sentiment extraction.
bix6 4 hours ago
It’s interesting how much focus there is on opting out of training. Sometimes I worry there is an intentional focus on that so people don’t think about the other ways the company might be profiting off our data. Like, I pay for Anthropic and they don’t train on that, but are they selling my “anonymized” usage data in some other way?
tasuki 3 hours ago
> Clearly anyone who can pay should be using paid models with privacy protections

Clearly, anyone who needs privacy should be using models with privacy protections. Some people build open source and the models will get the code anyway.
derac 4 hours ago
I recommend NVIDIA NIM for completely free dev access for young people.