blackqueeriroh | 4 days ago
Simple answer: there are two separate processes here, training and inference. As you discuss, training happens over a long period in a (mostly) hands-off fashion once it starts. But inference? That’s a separate process that uses the trained model to generate responses, and it’s a runtime process: send a prompt, inference runs, a response comes back. That’s a whole separate software stack, and one that is constantly being updated to improve performance. It’s in the inference process that these issues arose.
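
To make the distinction concrete, here’s a minimal sketch of one inference round trip as seen from the client side. The endpoint URL, payload fields, and response shape are all hypothetical, not any particular vendor’s API:

    import requests

    # One inference round trip: the serving stack loads the already-trained
    # (frozen) model weights and runs a forward pass; no training happens here.
    resp = requests.post(
        "https://example.com/v1/generate",  # hypothetical endpoint
        json={"prompt": "Why is the sky blue?", "max_tokens": 128},
        timeout=30,
    )
    print(resp.json()["text"])  # assumed response field name

Everything behind that endpoint (batching, caching, kernels, sampling code) can change between two requests even though the model weights themselves haven’t been retrained.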