mindcrime 3 hours ago
> I don't quite get it why they can't take another LLM and vet the output of the first with the second one.

Yes, this technique and its variations[1][2] "work", but they're still not 100% perfect. And it's not as widely used as it might be because, among other reasons:

a. it takes longer to implement

b. it costs more (more tokens spread across multiple LLM calls)

c. higher latency (getting an answer takes longer because multiple LLM calls are involved)

d. the final answer is probabilistically more likely to be correct, but is still not guaranteed to be error-free, so you can never fully escape the need for a Human in the Loop.
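For anyone who wants to see those trade-offs concretely, here is a minimal sketch of the generate-then-vet loop in Python. Everything named here (`generate_and_vet`, `VettedAnswer`, and the `generate` / `verify` callables) is hypothetical and made up for illustration; the callables stand in for real LLM calls, and the demo uses toy stubs rather than any actual model API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VettedAnswer:
    text: str       # the last draft produced
    approved: bool  # True only if the verifier accepted a draft
    llm_calls: int  # total model calls made (a rough proxy for cost and latency)


def generate_and_vet(
    prompt: str,
    generate: Callable[[str], str],      # hypothetical: first LLM, prompt -> draft
    verify: Callable[[str, str], bool],  # hypothetical: second LLM, (prompt, draft) -> verdict
    max_attempts: int = 3,               # assumed to be at least 1
) -> VettedAnswer:
    calls = 0
    draft = ""
    for _ in range(max_attempts):
        draft = generate(prompt)
        calls += 1
        accepted = verify(prompt, draft)
        calls += 1  # every vetted draft costs at least two calls (points b and c)
        if accepted:
            return VettedAnswer(draft, True, calls)
    # The verifier never approved a draft: flag it for human review (point d).
    return VettedAnswer(draft, False, calls)


# Toy stubs in place of real models: the first draft is wrong, the second is right.
drafts = iter(["2 + 2 = 5", "2 + 2 = 4"])
result = generate_and_vet(
    "What is 2 + 2?",
    generate=lambda prompt: next(drafts),
    verify=lambda prompt, draft: draft.endswith("= 4"),
)
print(result)
```

Note that `approved=True` only means the second model agreed with the first. The verifier can be wrong too, which is why the unapproved case still has to go to a person and the approved case is not a guarantee.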