Remix clone Hacker News


	▲	pocketarc a day ago
		I use this approach for a ticket-based customer support agent. There are a bunch of boolean checks that the LLM's response must pass before it is allowed through. Some are hard fails; others, like you brought up, just apply a weighted ding to the response's final score. Failures are fed back to the LLM so it can regenerate with that feedback taken into account. People are much happier with it than I could have imagined, though it's definitely not cheap (but the added cost is an acceptable tradeoff).
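A rough sketch of what that loop could look like, not the actual implementation described above. Everything here is an assumption made for illustration: the `Check` dataclass, the `evaluate` and `respond` helpers, the 0.7 threshold, the three-attempt cap, and the `generate` callable standing in for whatever LLM call is used.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Check:
    """One boolean check on a draft reply (hypothetical structure)."""
    name: str
    passed: Callable[[str], bool]  # returns True when the draft is fine
    feedback: str                  # text sent back to the LLM on failure
    hard: bool = False             # hard fail: blocks the reply outright
    weight: float = 0.0            # soft fail: amount subtracted from the score


def evaluate(draft: str, checks: List[Check],
             threshold: float) -> Tuple[bool, float, List[str]]:
    """Run every check and return (allowed, score, feedback notes)."""
    score = 1.0
    hard_failed = False
    notes: List[str] = []
    for check in checks:
        if check.passed(draft):
            continue
        notes.append(f"{check.name}: {check.feedback}")
        if check.hard:
            hard_failed = True
        else:
            score -= check.weight
    allowed = not hard_failed and score >= threshold
    return allowed, score, notes


def respond(ticket: str,
            generate: Callable[[str, List[str]], str],
            checks: List[Check],
            threshold: float = 0.7,
            max_attempts: int = 3) -> Optional[str]:
    """Regenerate with failure feedback until a draft passes or attempts run out."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        draft = generate(ticket, feedback)
        allowed, _score, feedback = evaluate(draft, checks, threshold)
        if allowed:
            return draft
    return None  # nothing passed; the caller could escalate to a human
```

Each extra attempt is another model call, which is where the added cost comes from. Capping attempts and returning `None` is just one way to bound it.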