|
| ▲ | girvo 6 hours ago | parent | next [-] |
| Depends on the model. Step (from StepFun) will happily yap about Tiananmen to you if you're running it locally. Quite a lot of these models have "safety" (lol) filters in front of them, rather than the refusals being heavily encoded into the weights. |
|
| ▲ | satvikpendem 6 hours ago | parent | prev | next [-] |
| Like the sibling said, you can fine-tune if the rejections are in the weights, but most often they're actually in the API harness itself; download Qwen or DeepSeek, run it locally, ask about certain dates and squares, and it will happily tell you. |
|
| ▲ | atemerev 9 hours ago | parent | prev [-] |
| Well, the weights are open. De-CCP-ing them is a trivial task, about 40 minutes on modern hardware. So it can be done for about $50. |
| |
| ▲ | bjelkeman-again 6 hours ago | parent [-] | | Any good reference for how? | | |
| ▲ | ls612 6 hours ago | parent | next [-] | | https://github.com/p-e-w/heretic | | |
| ▲ | atemerev 6 hours ago | parent [-] | | Heretic is a general abliteration framework, mostly used to remove safety alignment, not CCP alignment. Yes, you can feed it China-specific prompts, but you'll need a dataset first (one is available in deccp). Also, Heretic as-is does not work for GLM 5.2 (at least as of 3 days ago when I tested it). You'll need some hybrid approaches. |
| |
| ▲ | atemerev 6 hours ago | parent | prev [-] | | https://github.com/AUGMXNT/deccp - one example for Qwen models. For GLM 5.2, abliteration/realignment works somewhat differently, but with Claude's help, you can finish the job. I am planning to release a steering patch for GLM 5.2 that eliminates pro-CCP alignment in the next few days. |
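For anyone wondering what "abliteration" means mechanically, here is a rough toy sketch of the core idea as it is usually described: estimate a "refusal direction" as the difference between mean activations on refused and answered prompts, then project that direction out. This is not Heretic's or deccp's actual code or API; the function names and the tiny 3-dimensional "activations" are made up for illustration, and real tools apply this to a model's hidden states or weights rather than to hand-written lists.

```python
import math


def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(refused_acts, answered_acts):
    """Unit vector pointing from the 'answered' mean to the 'refused' mean.

    Hypothetical helper: both arguments stand in for activations that a
    real pipeline would collect from a model on two prompt sets.
    """
    refused_mean = mean_vector(refused_acts)
    answered_mean = mean_vector(answered_acts)
    diff = [r - a for r, a in zip(refused_mean, answered_mean)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0:
        raise ValueError("the two prompt sets have identical mean activations")
    return [x / norm for x in diff]


def ablate(vec, direction):
    """Remove the component of vec that lies along the unit vector direction."""
    dot = sum(v * d for v, d in zip(vec, direction))
    return [v - dot * d for v, d in zip(vec, direction)]


# Toy data (assumed, not measured): the two sets differ only along the first axis.
refused = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
answered = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

direction = refusal_direction(refused, answered)
cleaned = ablate([5.0, 2.0, -1.0], direction)

print(direction)  # [1.0, 0.0, 0.0]
print(cleaned)    # [0.0, 2.0, -1.0]
```

In the toy data the refusal direction comes out as the first axis, so ablation zeroes that component and leaves the rest of the vector alone. The dataset of China-specific prompts mentioned above would play the role of the `refused` set here.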
|
|