Catloafdev 17 hours ago

And how exactly do you propose making it "difficult enough"?

reissbaker 17 hours ago | parent | next [-]

The same way Anthropic is making it difficult to compete with them. They intentionally train the model (via PEFT, as called out in the model card) to be dumber when attempting things Anthropic doesn't want. In this case that means competing with them, but you could apply the same training process to other domains, such as actually malicious use cases.

fc417fc802 16 hours ago | parent | prev [-]

The same way pursuing a bachelor's degree in order to achieve a nefarious end goal does. Refuse to handhold the user on risky topics, and refuse outright to answer if given an explicit scenario that appears harmful. Provide only textbook-level technical explanations for such topics, the same ones any STEM student has ready access to.