Remix.run Logo
catlifeonmars 11 hours ago

I am curious, does this mean that you can escape the chat template “early” by providing an end token in the user input, or is there also an escape mechanism (or token filtering mechanism) applied to user input to avoid this sort of injection attack?

reactordev 10 hours ago | parent [-]

Neither, it’s just not providing the base chat template that the model expects between the im tags. This isn’t a hack and it’s not particularly useful information. Abliteration is what he really wanted

catlifeonmars 10 hours ago | parent [-]

I am merely curious what happens when you throw random <im…> tags in the input. I understand that’s orthogonal to abliteration.

reactordev 9 hours ago | parent [-]

Depends on the model. Some just go into “immediate mode” and just do whatever you ask, others operate fine but have trouble with tasks/tools. While others will go down a quant that was basically neglected since inception and you get garbage back. Random chars or endless loops.