arboles 5 hours ago

Please elaborate.

hugmynutus 5 hours ago | parent | next [-]

This is because LLMs don't actually understand language; they're just a "which word fragment comes next" machine.

    Instruction: don't think about ${term}
Now `${term}` is in the LLM's context window. The attention mechanism will amplify the logits related to `${term}` based on how often `${term}` has appeared in the chat. This is just how text gets transformed into numbers for the LLM to process. The relational structure of transformers will similarly amplify tokens related to `${term}`, since that is what training is about: you said `fruit`, so `apple`, `orange`, `pear`, etc. all become more likely to get spat out.
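A toy illustration of the point above, with made-up logit values (the numbers, token set, and bias size are all hypothetical): mentioning a term adds weight to related tokens, so after softmax their probabilities rise regardless of whether the surrounding instruction was a negation.

```python
import math

def softmax(logits):
    # Standard numerically-stable softmax over a dict of token -> logit.
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

# Baseline logits for a handful of candidate next tokens (invented numbers).
logits = {"apple": 1.0, "orange": 1.0, "pear": 1.0, "car": 2.0, "house": 2.0}
base = softmax(logits)

# Once "fruit" appears in context -- even inside "don't think about fruit" --
# attention boosts fruit-related tokens. Model that as a simple positive bias.
related = {"apple", "orange", "pear"}
biased = softmax({t: v + (1.5 if t in related else 0.0)
                  for t, v in logits.items()})

# Fruit tokens become more probable; unrelated tokens become less probable.
assert biased["apple"] > base["apple"]
assert biased["car"] < base["car"]
```

This is only a cartoon of attention, of course; the real amplification is learned and context-dependent, but the direction of the effect is the same.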

The negation of a term ("do not under any circumstances do X") generally does not work unless the model has received extensive training and fine-tuning to ensure that a specific "Do not generate X" influences every downstream weight (multiple times), which vendors often do for writing style and specific (illegal) terms. So for drafting emails or chatting, it works fine.

But when you start getting into advanced technical concepts and profession-specific jargon, it doesn't work at all.

arcanemachiner 5 hours ago | parent | prev [-]

Pink elephant problem: Don't think about a pink elephant.

OK. Now, what are you thinking about? Pink elephants.

Same problem applies to LLMs.
