Remix clone Hacker News

	▲	blackbear_ 3 hours ago
		The GPT3 paper is a good starting point:

		Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165

		I also enjoyed the papers for DeepSeek and GLM for an overview of all the tricks you need to make these things work:

		DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models https://arxiv.org/abs/2512.02556

		GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models https://arxiv.org/abs/2508.06471