Remix.run Logo
atleastoptimal 4 hours ago

There is a possibility this may not end at simply nerfing the model. The idea of manipulating the behavior of a model depending on the prompt given to it can extend to

1. Detecting if employees from competing companies are using it and sabatoge their work, even not LLM-training related

2. Direct users to outcomes that would justify higher compute spend. Deliberately coding a project to 95% completion but designed to be losing a critical step right before one's weekly rate limit is expended

3. Reduce the quality of writing when a person is writing an essay where the argument is against the interests of the model company, or steering the user using the model for brainstorming in a direction which causes them to waste time or abandon their train of reasoning

etc. etc. The possibilities are enormous. Many people use AI daily for their job, personal advice, companionship. A model company that steers the behavior of the model towards a deliberate outcome could develop a controlling interest in human behavior and productivity at large, even with subtle influence would compound enormously over its millions of users.

dimitri-vs 25 minutes ago | parent | next [-]

Anthropic: were commiting to being ad free.

Also Anthropic: if you use our models in any way that might negatively impact our revenue we'll sabotage you.

Can I pick the ads please?

matheusmoreira an hour ago | parent | prev [-]

This is terrifying.