lllllm | 3 days ago
The pretraining (so 99% of training) is fully global, covering over 1000 languages without special weighting. The posttraining (see Section 4 of the paper) also included as many languages as we could get, and did upweight some languages. The posttraining can easily be customized to other target languages.
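To make the upweighting idea concrete, here is a minimal sketch of how a post-training mixture could boost the sampling probability of chosen target languages. This is purely illustrative: the function name `sample_posttraining_mix`, the `lang_weights` factors, and the data layout are assumptions, not the pipeline actually used in the paper.

```python
import random
from collections import defaultdict

# Hypothetical sketch: re-weighting a post-training data mixture by language.
# `examples` is a list of dicts carrying a "lang" tag; `lang_weights` maps a
# language code to an upweighting factor (1.0 = keep its natural share).

def sample_posttraining_mix(examples, lang_weights, n_samples, seed=0):
    """Draw a post-training mixture in which some languages are upweighted."""
    rng = random.Random(seed)
    # Group examples by language so each language's pool can be weighted.
    by_lang = defaultdict(list)
    for ex in examples:
        by_lang[ex["lang"]].append(ex)
    # Effective weight of a language = pool size * upweight factor.
    langs = list(by_lang)
    weights = [len(by_lang[l]) * lang_weights.get(l, 1.0) for l in langs]
    # Sample a language first (with replacement), then a uniform example
    # from that language's pool.
    chosen_langs = rng.choices(langs, weights=weights, k=n_samples)
    return [rng.choice(by_lang[l]) for l in chosen_langs]

# Usage: upweight two target languages 3x relative to their natural share.
mix = sample_posttraining_mix(
    examples=[{"lang": "sw", "text": "..."}, {"lang": "en", "text": "..."}],
    lang_weights={"sw": 3.0, "rw": 3.0},
    n_samples=10,
)
```

Swapping in a different `lang_weights` dict is all it would take to retarget the mixture at another set of languages, which is the sense in which the posttraining is easy to customize.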