Remix clone Hacker News

new | show | ask | jobs Github

	▲	visarga 3 days ago
		> In human learning we do this process by generating expectations ahead of time and registering surprise or doubt when those expectations are not met. Why did you give up on this idea. Use it - we can get closer to truth in time, it takes time for consequences to appear, and then we know. Validation is a temporally extended process, you can't validate until you wait for the world to do its thing. For LLMs it can be applied directly. Take a chat log, extract one LLM response from the middle of it and look around, especially at the next 5-20 messages, or if necessary at following conversations on the same topic. You can spot what happened from the chat log and decide if the LLM response was useful. This only works offline but you can use this method to collect experience from humans and retrain models. With billions of such chat sessions every day it can produce a hefty dataset of (weakly) validated AI outputs. Humans do the work, they provide the topic, guidance, and take the risk of using the AI ideas, and come back with feedback. We even pay for the privilege of generating this data.