NitpickLawyer an hour ago
That's not what people mean when they talk about censoring. They mean that models are trained not to touch some subjects, and that can spill over into legit tasks, often with humorous results (early on, there were many instances of models refusing to answer "how do you kill a process" because of overbearing refusal training).

Uncensoring a model also doesn't necessarily improve generic use cases. In fact it can lead to lower accuracy on generic tasks overall. But your goal with uncensoring is getting the model to engage with those specific subjects; you don't necessarily care about "generic use cases". That's why I mentioned that having the ability to do this at inference time is better than using ready-made uncensored models: those usually focus on some use cases that you may or may not be interested in (porn being one of the most sought after in local communities).

Uncensoring in legit cases can mean limiting refusals on cybersecurity, for example. There are legit reasons for researchers to have that capability when running models locally. Having the models uncensored on that specific vector can reduce refusals and make the models usable for both defense and offense (say, in a loop, to improve both). If your models can only do defense (and sometimes even refuse that, because censoring can leak into related topics as well), you're at a disadvantage.
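For context on what "doing this at inference time" can look like: one common technique ("abliteration") estimates a refusal direction in the model's activation space and projects it out of hidden states during the forward pass. This is a minimal NumPy sketch of just the projection step, with made-up toy data; the direction estimation and model hooking are omitted, and the variable names are illustrative, not from any particular library.

```python
import numpy as np

def ablate_direction(hidden: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component along `direction` from each hidden-state row.

    hidden:    (n_tokens, d_model) activations
    direction: (d_model,) estimated refusal direction (any scale)
    """
    d = direction / np.linalg.norm(direction)          # unit-normalize
    return hidden - np.outer(hidden @ d, d)            # subtract projection per row

# Toy demo: activations with a deliberately strong refusal component added in.
rng = np.random.default_rng(0)
refusal_dir = rng.normal(size=8)
unit = refusal_dir / np.linalg.norm(refusal_dir)
hidden = rng.normal(size=(4, 8)) + 3.0 * unit

cleaned = ablate_direction(hidden, refusal_dir)
# After ablation, each row has numerically zero component along the direction.
print(np.allclose(cleaned @ unit, 0.0))  # → True
```

In practice this projection is applied inside a forward hook on each transformer layer at inference time, which is why it composes with any local model rather than requiring a retrained "uncensored" checkpoint.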
gpugreg an hour ago
> Uncensoring a model also doesn't necessarily improve generic use cases.

While the following is not a generic use case, I have a funny anecdote about how censorship is holding back flagship models. I was asking an uncensored version of Qwen3.6 how a CLI option of llama.cpp worked, and to my horror and amazement, it rudely went and decompiled the binary to figure it out. It felt like the computer equivalent of asking a vet why my dog looks sick, only for them to cut it open to check.

Flagship models usually don't do that without some convincing, but it sure is effective. We will need much better sandboxes when less restricted models become more common. I can already see them hammering out 0-days when prompted to do some task that usually requires root.
zozbot234 an hour ago
> There are legit reasons for researchers to have that capability when running the models locally.

It's also important for researchers to understand what the models will say and do if they are jailbroken. Uncensoring the model locally gives you a natural way to study that.
andai an hour ago
Anthropic mentioned explicitly making an effort to make Opus 4.7 worse at cybersecurity tasks, because the last few generations have been getting too good at them. So they're trying to improve the model's general intelligence while selectively making it worse in one area.