▲ | zozbot234 3 days ago | |
RLHF (and its evolution, RLAIF) is actually used for more than setting "values and preferences". It's what makes AI models engage in recognizable behavior, as opposed to simply continuing a given text. It's how the "Chat" part of "ChatGPT" can be made to work in the first place. | ||
▲ | cs702 3 days ago | parent [-] | |
Yes. I updated my comment to reflect as much. Thank you. |