zozbot234 3 days ago

RLHF (and its evolution, RLAIF) is actually used for more than setting "values and preferences". It's what makes AI models engage in recognizable behavior, as opposed to simply continuing a given text. It's how the "Chat" part of "ChatGPT" can be made to work in the first place.

cs702 3 days ago

Yes. I updated my comment to reflect as much. Thank you.