Remix clone Hacker News

	▲	keskival 3 days ago
		"I’m not sure, because OpenAI doesn’t deign to share gpt-4-base, nor to allow queries of gpt-4o in completion mode." I would guess GPT-4o isn't first pre-trained and then instruct-tuned, but trained directly with refined instruction-following material. This material probably contains way fewer chess games.
	▲	toxik 3 days ago
		Why do you think that? InstructGPT was predominantly trained as a next-token predictor on whatever soup of data OpenAI curated at the time. The alignment signal (both the RL part and the supervised prompt/answer pairs) is a tiny bit of the gradient.