rabf 2 days ago

Positive reinforcement works better than negative reinforcement. If you read the prompt guidance from the companies themselves in their developer documentation, it often makes this point. It is more effective to tell them what to do rather than what not to do.

sally_glance 2 days ago | parent | next [-]

This matches my experience. You mostly want to not even mention negative things because if you write something like "don't duplicate existing functionality" you now have "duplicate" in the context...

What works for me is having a second agent or session to review the changes with the reversed constraint, i.e. "check if any of these changes duplicate existing functionality". Not ideal because now everything needs multiple steps or subagents, but I have a hunch that this is one of the deeper technical limitations of current LLM architecture.
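A minimal sketch of this two-pass pattern. Note that `call_llm` is a hypothetical stand-in for whatever model client you actually use; the point is only the structure: keep the negative word ("duplicate") out of the generation prompt, and state it as a positive check in a separate review pass.

```python
# Sketch of the "reversed constraint" review pass described above.
# call_llm is a stub standing in for a real model client (hypothetical).

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would call your model provider here.
    return f"[model response to: {prompt[:40]}...]"

def generate_then_review(task: str) -> dict:
    """First pass writes the change with a clean prompt; second pass
    checks it with the constraint stated positively ("check for
    duplication") instead of polluting the first prompt with "don't"."""
    change = call_llm(f"Implement the following: {task}")
    review = call_llm(
        "Check if any of these changes duplicate existing functionality:\n"
        + change
    )
    return {"change": change, "review": review}

result = generate_then_review("add a retry helper to the HTTP client")
```

The generation prompt never mentions duplication at all; only the second, independent pass does.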

citizenpaul a day ago | parent [-]

Probably not related, but it reminds me of a book I read where wizards had Additive and Subtractive magic, but not always both. The author clearly gave up on coming up with creative additive-only solutions once the gimmick wore off, and it never comes up again in the book.

Perhaps there is a lesson here.

nomel 2 days ago | parent | prev [-]

Could you describe what this looks like in practice? Say I don't want it to use a certain concept or function. What would "positive reinforcement" look like to exclude something?

oxguy3 2 days ago | parent [-]

Instead of saying "don't use libxyz", say "use only native functions". Instead of "don't use recursion", say "only use loops for iteration".
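To make the pattern concrete, here is an illustrative sketch. The mapping has to be written by a human, since each positive restatement requires already knowing an acceptable alternative; the entries below are just the two examples from this comment.

```python
# Illustrative only: negative constraints rephrased as positive ones.
# The rewrites must be authored by hand; there is no general algorithm.
POSITIVE_REWRITES = {
    "don't use libxyz": "use only native functions",
    "don't use recursion": "only use loops for iteration",
}

def restate_positively(constraint: str) -> str:
    # Fall back to the original wording when no rewrite is known.
    return POSITIVE_REWRITES.get(constraint, constraint)
```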

nomel 2 days ago | parent | next [-]

This doesn't really answer my question, which was more about specific exclusions.

Both of the answers show the same problem: if you limit your prompts to positive reinforcement, you're only allowed to "include" regions of a "solution space", which can only constrain the LLM to those small regions. With negative reinforcement, you just cut out a bit of the solution space, leaving the rest available. If you don't already know the exact answer, then leaving the LLM free to use solutions that you may not even be aware of seems like it would always be better.

Specifically:

"use only native functions" for "don't use libxyz" isn't really different than "rewrite libxyz since you aren't allowed to use any alternative library". I think this may be a bad example since it massively constrains the llm, preventing it from using alternative library that you're not aware of.

"only use loops for iteration" for "done use recursion" is reasonable, but I think this falls into the category of "you already know the answer". For example, say you just wanted to avoid a single function for whatever reason (maybe it has a known bug or something), the only way to this "positively" would be to already know the function to use, "use function x"!

Maybe I misunderstand.

bdangubic 2 days ago | parent | prev [-]

I 100% stopped telling them what not to do. I think even if “AGI” is reached telling them “don’t” won’t work

nomel 2 days ago | parent [-]

I have the most success when I provide good context, as in what I'm trying to achieve, in the most high level way possible, then guide things from there. In other words, avoid XY problems [1].

[1] https://xyproblem.info