kasey_junk 5 days ago

Coding agents have shown how: you filter the output against something that can tell the LLM when it's hallucinating.

The hard part is identifying those filter functions outside of the code domain.
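(Editor's note: the thread never shows code, but the filter-loop idea kasey_junk describes can be sketched in a few lines. Everything below is illustrative: `agent_loop` and the `generate` callback are hypothetical names, and the "filter" here is just Python's own parser standing in for a compiler or type checker.)

```python
import ast

def passes_filter(candidate: str) -> tuple[bool, str]:
    """Ground-truth check: does the candidate even parse as valid Python?
    A real agent would go further: type-check, compile, run the tests."""
    try:
        ast.parse(candidate)
        return True, ""
    except SyntaxError as exc:
        return False, str(exc)

def agent_loop(generate, prompt: str, max_tries: int = 3):
    """Regenerate until the filter passes, feeding the error back in.
    `generate` stands in for a hypothetical LLM call."""
    feedback = ""
    for _ in range(max_tries):
        candidate = generate(prompt + feedback)
        ok, err = passes_filter(candidate)
        if ok:
            return candidate
        feedback = f"\n# previous attempt failed: {err}"
    return None
```

The point of the sketch is the shape of the loop, not the checker: the LLM's output is never trusted directly, only after an external oracle (parser, compiler, test suite) accepts it.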

dotancohen 5 days ago | parent [-]

It's called RAG (retrieval-augmented generation), and it's getting very well developed for some niche use cases such as legal, medical, etc. I've been personally working on one for mental health, and please don't let anybody tell you that they're using an LLM as a mental health counselor. I've been working on it for a year and a half, and if we get it production-ready in the next year and a half I will be surprised. Keeping up with the field, I don't think anybody else is any closer than we are.
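(Editor's note: for readers unfamiliar with the term, here is a toy sketch of what RAG does mechanically: retrieve passages from a vetted corpus and constrain the model to answer from them. Nothing here comes from dotancohen's system; the word-overlap retriever is a deliberate simplification, as real systems use embeddings and a vector index.)

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank vetted documents by word overlap with the query."""
    q = set(query.lower().split())
    return sorted(
        corpus,
        key=lambda doc: len(q & set(doc.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Ground the model: instruct it to answer only from retrieved passages,
    so its claims can be traced back to the vetted corpus."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using ONLY these sources:\n{context}\n\nQuestion: {query}"
```

The grounding is softer than a compiler's yes/no, which is exactly the distinction tptacek raises below: retrieval narrows what the model can plausibly claim, but nothing mechanically rejects an answer that strays from the sources.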

tptacek 2 days ago | parent [-]

Wait, can you say more about how RAG solves this problem? What Kasey is referring to is something like compiling statically-typed code: there's a ground truth an agent is connected to there --- it can at least confidently assert "this code actually compiles" (and thus can't be using an entirely hallucinated API). I don't see how RAG accomplishes something similar, but I don't think much about RAG.