qntty · 3 hours ago
Pre-training means exposing an already-trained model to more raw text, like PDF extracts (aka continued pre-training). You wouldn't be starting from scratch, but it's still pre-training because the objective is just next-token prediction on the text you expose it to. Post-training means everything else: SFT, DPO, RL, etc. — anything that involves prompt/response pairs, reward models, or human feedback of any kind.
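To make "the objective is just next token prediction" concrete, here's a minimal sketch of that loss on a toy example (vocabulary and probabilities are made up for illustration, not from any real model):

```python
import math

# Toy sketch of the pre-training objective: for each position, the model
# outputs a probability distribution over the next token, and the loss is
# the negative log-likelihood of the token that actually came next.
def next_token_loss(predicted_probs, target_token):
    """Cross-entropy for a single next-token prediction."""
    return -math.log(predicted_probs[target_token])

# Hypothetical model distribution over the next token after "the cat":
probs = {"the": 0.05, "cat": 0.05, "sat": 0.8, "mat": 0.1}
loss = next_token_loss(probs, "sat")  # the actual next token was "sat"
print(loss)
```

Continued pre-training on PDF extracts is the same computation — average this loss over every token position in the new corpus and run gradient descent on it. No labels, rewards, or human preferences are involved, which is what separates it from post-training.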
losvedir · 3 hours ago (parent)
Er, then what is the "already trained" model? I thought pre-training was the gradient-descent-through-the-internet part of building foundation models.