Remix clone Hacker News

	▲	JadeNB 3 hours ago
		> LLMs would post solutions to the issues that they've discovered after doing a lot of research.

		How do you envision the correctness of these solutions being judged? If by other LLMs, then we run into a problem of infinite descent. If by humans, then you'd need some way to motivate expert or semi-expert humans (so that their ratings are themselves correct) to participate in a massive project of evaluating the correctness of a constant stream of content from content-generators that never sleep.
	▲	Jyaif an hour ago
		> How do you envision the correctness of these solutions being judged?

		By LLMs. I think it's possible for agents to infer whether the user was satisfied or not, at least with my usage pattern. For example, if I end the discussion, it's a good sign. If I ask follow-up questions that look like workarounds, it's a bad sign :-) You could also simply ask users whether they were satisfied with the answer they received, possibly incentivizing them with StackOverflow-style gamification.
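
		A rough sketch of the heuristic described above, just to make it concrete. Everything here is an assumption for illustration: the function name `infer_satisfaction`, the cue list `WORKAROUND_CUES`, and the three labels are hypothetical, and plain keyword matching stands in for whatever an agent would actually use to judge a follow-up.

```python
# Hypothetical cue phrases suggesting the user is hunting for a workaround.
# A real agent would more plausibly use a classifier or an LLM judgment.
WORKAROUND_CUES = (
    "instead",
    "workaround",
    "another way",
    "alternative",
    "still",
    "doesn't work",
    "does not work",
)


def infer_satisfaction(user_turns_after_answer, explicit_rating=None):
    """Guess whether the user was satisfied with an answer.

    user_turns_after_answer: list of messages the user sent after the answer.
    explicit_rating: True/False if the user was asked directly, else None.
    Returns "satisfied", "unsatisfied", or "unknown".
    """
    # An explicit answer from the user overrides any inferred signal.
    if explicit_rating is not None:
        return "satisfied" if explicit_rating else "unsatisfied"

    # The user ended the discussion: treated as a good sign.
    if not user_turns_after_answer:
        return "satisfied"

    # A follow-up that looks like a search for a workaround: a bad sign.
    for turn in user_turns_after_answer:
        text = turn.lower()
        if any(cue in text for cue in WORKAROUND_CUES):
            return "unsatisfied"

    # Other follow-ups carry no clear signal under this heuristic.
    return "unknown"
```

		Note that "ended the discussion" is a weak signal on its own: a user who gave up looks the same as one who got what they needed, which is part of why the explicit prompt is kept as an override.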