charcircuit 3 hours ago
The issue is that you can't do unsupervised learning if you require humans.
rhdunn 24 minutes ago
Having LLMs grade the answers relies on the grader knowing the answer rather than hallucinating it. You also run into problems if the model refuses to answer, or if it gets stuck in a loop (e.g. when running locally with a heavily quantized model). I'm experimenting with traditional NLP (stanza, spaCy, etc.) to grade the responses against different metrics: is the response in first, second, or third person? Is it written as poetry, prose, or drama? I'm also thinking about using information extraction and synonym detection to handle data queries and the like.
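To make the first/second/third person check concrete, here is a rough, stdlib-only sketch. It is not the approach described above with stanza or spaCy: it just counts surface pronouns, and the names `PRONOUNS`, `count_persons`, and `dominant_person` are hypothetical. A real grader would more likely read the person feature from a tagger's morphological analysis, which also covers verb agreement and languages other than English.

```python
import re
from collections import Counter

# Hypothetical English pronoun lists (an assumption for this sketch).
# Matching surface forms is a crude stand-in for a tagger's Person feature.
PRONOUNS = {
    "first": {"i", "me", "my", "mine", "myself",
              "we", "us", "our", "ours", "ourselves"},
    "second": {"you", "your", "yours", "yourself", "yourselves"},
    "third": {"he", "him", "his", "himself",
              "she", "her", "hers", "herself",
              "it", "its", "itself",
              "they", "them", "their", "theirs", "themselves"},
}


def count_persons(text):
    """Count first-, second-, and third-person pronouns in the text."""
    # Splitting on non-letters turns "I'm" into "i" and "m", so the
    # pronoun inside a contraction is still counted.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for tok in tokens:
        for person, words in PRONOUNS.items():
            if tok in words:
                counts[person] += 1
    return counts


def dominant_person(text):
    """Return "first", "second", or "third", or None if undecidable.

    None covers empty responses, responses with no pronouns, and ties,
    so the caller can treat those as ungradable instead of guessing.
    """
    counts = count_persons(text)
    if not counts:
        return None
    (top, n), *rest = counts.most_common()
    if rest and rest[0][1] == n:
        return None  # tie between the two most frequent persons
    return top
```

Returning `None` rather than a default label keeps refusals and degenerate outputs visible to whatever aggregates the scores, which matters given the refusal and looping problems mentioned above.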