baq 9 hours ago

…as long as you don’t ask them about certain dates or squares.

Also, I wouldn’t expect Mythos-class models to be allowed to be openly released by the CCP. Thinking otherwise is pure naivety.

girvo 6 hours ago | parent | next [-]

Depends on the model. Step (from StepFun) will happily yap about Tiananmen to you, if you're running it locally.

Quite a lot of these models have "safety" (lol) filters in front of them, rather than it being heavily encoded into the weights.

satvikpendem 6 hours ago | parent | prev | next [-]

Like the sibling said, you can fine-tune if the rejections are in the weights, but most often they're actually in the API harness itself. Download Qwen or DeepSeek, run it locally, ask about certain dates and squares, and it will happily tell you.

atemerev 9 hours ago | parent | prev [-]

Well, the weights are open. De-CCP-ing them is a trivial task, about 40 minutes on modern hardware. So it can be done for about $50.

bjelkeman-again 6 hours ago | parent [-]

Any good reference for how?

ls612 6 hours ago | parent | next [-]

https://github.com/p-e-w/heretic

atemerev 6 hours ago | parent [-]

Heretic is a general abliteration framework, mostly used to remove safety alignment, not CCP alignment. Yes, you can feed it China-specific prompts, but you'll need a dataset first (one is available in deccp).

Also, Heretic as-is does not work for GLM 5.2 (at least as of 3 days ago, when I tested it). You'll need a hybrid approach.
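
For anyone unfamiliar with the term, here is a minimal sketch of the general idea usually meant by "abliteration" (directional ablation): estimate a direction in activation space as the difference of mean activations between two prompt sets, then project that direction out of a weight matrix. It assumes toy random arrays in place of real model activations, and the names `refusal_direction` and `ablate_direction` are hypothetical. This is not Heretic's or deccp's actual code, just an illustration of the linear algebra.

```python
import numpy as np


def refusal_direction(refused_acts, answered_acts):
    """Unit vector along the difference of the two mean activations.

    Both inputs have shape (n_prompts, d_model). Hypothetical helper.
    """
    diff = refused_acts.mean(axis=0) - answered_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def ablate_direction(weight, direction):
    """Remove the component along `direction` from everything `weight` writes.

    weight has shape (d_model, d_in); its outputs live in activation space.
    """
    d = direction / np.linalg.norm(direction)
    return weight - np.outer(d, d) @ weight


# Toy data (assumption): "refused" activations are shifted along axis 3.
rng = np.random.default_rng(0)
d_model, d_in = 16, 8
true_dir = np.zeros(d_model)
true_dir[3] = 1.0

answered = rng.normal(size=(200, d_model))
refused = rng.normal(size=(200, d_model)) + 4.0 * true_dir

direction = refusal_direction(refused, answered)

W = rng.normal(size=(d_model, d_in))
W_ablated = ablate_direction(W, direction)

# After ablation, W_ablated can no longer write anything along `direction`.
print(np.abs(direction @ W_ablated).max())
```

In a real model the activations would come from forward passes over two prompt datasets, and the projection would be applied to each matrix that writes into the residual stream. Which layers and matrices to touch depends on the architecture, which is presumably why one tool does not work on every model.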

atemerev 6 hours ago | parent | prev [-]

https://github.com/AUGMXNT/deccp - one example for Qwen models. For GLM 5.2, abliteration/realignment works somewhat differently, but with Claude's help, you can finish the job.

In the next few days, I am planning to release a steering patch for GLM 5.2 that eliminates the pro-CCP alignment.