cowlby 3 hours ago
Using tweakcc I can see the system prompt is supposed to mean “if it’s malware, refuse to improve or augment the code”. But because of all the malware noise, the model is misreading the instruction as “don’t improve or augment after reading”. I thought this was integral to LLM context design. LLMs can’t prompt their way to controls like this. Surprised they took such a hard-headed approach to managing cybersecurity risks.