nomel a day ago

Could you describe what this looks like in practice? Say I don't want it to use a certain concept or function. What would "positive reinforcement" look like to exclude something?

oxguy3 a day ago | parent [-]

Instead of saying "don't use libxyz", say "use only native functions". Instead of "don't use recursion", say "only use loops for iteration".

nomel a day ago | parent | next [-]

This doesn't really answer my question, which is more about specific exclusions.

Both of the answers show the same problem: if you limit your prompts to positive reinforcement, you're only allowed to "include" regions of a "solution space", which can only constrain the LLM to those small regions. With negative reinforcement, you just cut out a bit of the solution space, leaving the rest available. If you don't already know the exact answer, then leaving the LLM free to use solutions that you may not even be aware of seems like it would always be better.

Specifically:

"use only native functions" for "don't use libxyz" isn't really different than "rewrite libxyz since you aren't allowed to use any alternative library". I think this may be a bad example since it massively constrains the LLM, preventing it from using an alternative library that you're not aware of.

"only use loops for iteration" for "don't use recursion" is reasonable, but I think this falls into the category of "you already know the answer". For example, say you just wanted to avoid a single function for whatever reason (maybe it has a known bug or something). The only way to do this "positively" would be to already know the function to use: "use function x"!
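
To make the "solution space" point concrete, here is a rough sketch using plain Python sets. It is only an illustration of the argument, not a model of how an LLM actually behaves; the library names other than libxyz (`libabc`, `libnew`) and both helper functions are made up for the example.

```python
# Hypothetical solution space: every approach the model could pick.
# Only "libxyz" comes from the thread; the other names are invented.
solution_space = {"libxyz", "libabc", "libnew", "native functions"}


def positive_constraint(space, allowed):
    """'Use only X': keep just the options that were explicitly named."""
    return space & allowed


def negative_constraint(space, banned):
    """'Don't use X': remove the named options, keep everything else."""
    return space - banned


# "use only native functions"
positive = positive_constraint(solution_space, {"native functions"})

# "don't use libxyz"
negative = negative_constraint(solution_space, {"libxyz"})

print(sorted(positive))
print(sorted(negative))
```

Under these assumptions the positive phrasing leaves one option, while the negative phrasing still leaves `libabc` and `libnew` on the table, including any I didn't know to list.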

Maybe I misunderstand.

bdangubic a day ago | parent | prev [-]

I 100% stopped telling them what not to do. I think even if “AGI” is reached, telling them “don’t” won’t work.

nomel a day ago | parent [-]

I have the most success when I provide good context, as in what I'm trying to achieve, in the most high-level way possible, then guide things from there. In other words, avoid XY problems [1].

[1] https://xyproblem.info