Ugh, I'm past the edit window, but I meant RLHF, aka "Reinforcement Learning from Human Feedback". I'm not sure how I messed that up not once but twice!
After the first mess-up, the context was poisoned :)