Show HN: Thought Forgery, a new technique for jailbreaking LLMs
2 points by UltraZartrex 10 hours ago | 7 comments

Hi HN, I'm an independent security researcher and wanted to share a new vulnerability I've discovered.

My account is too new to submit the direct link, so I'm making a text post instead.

The technique is called "Thought Forgery" (CoT Injection). It works by forging the AI's internal monologue, which acts as a universal amplifier for other jailbreaks. I've confirmed it works on the latest models from Google, Anthropic, OpenAI, etc.

I'd be happy to share the link to the full technical write-up on GitHub in the comments if anyone is interested.

ndgold 3 hours ago | parent | next [-]

This is well known.

ndgold 3 hours ago | parent [-]

OK, I can’t point to where I read about it. I just know I’d already come across it, so I assumed it was well known.

tjopies 9 hours ago | parent | prev | next [-]

Please do post your write-up. This is interesting but, frankly, pretty vague.

UltraZartrex 9 hours ago | parent [-]

Sure. You can read it here: https://github.com/SlowLow999/Thought-Forgery/tree/main

alexander2002 9 hours ago | parent | prev | next [-]

Sure.

UltraZartrex 9 hours ago | parent [-]

Thank you!
