og_kalu 2 days ago

I don't understand what you're getting at here, or why you've used tweets from a random person to make your point.

Yes, these guys have all commented on the effects of post-training on the models.

"We want people to know that they’re interacting with a language model and not a person." This is literally a goal of post-training for all these companies. Even when they are training it to have a character, it mustn't sound like a person. It's no surprise they don't sound as natural as their base counterparts.

https://www.anthropic.com/research/claude-character

>You're wrong that its forced and immutable and a consequence of RLHF and the companies say its so.

I never said it was immutable. I said it was a consequence of post-training and it is. All the base models speak more naturally with much less effort.

>You're especially wrong that RLHF is undesirable

I don't understand what point you're trying to make here. I didn't say it was undesirable. I said it was heavily affecting how natural the models sounded.

>It's also nigh-trivial to get the completion model back

Try getting GPT-4o to write a story with villains that doesn't end with everyone singing Kumbaya and you'll see how much post-training affects the outputs of these models.