▲ | jmogly 3 days ago |
To me the problem is that if a piece of information is not well represented in the training data, the LLM will always tend toward bad token predictions for anything related to that information. I think the next big thing in LLMs could be figuring out how to tell whether a token was just a "fill in" or a guess versus a well-predicted token. That way you could have some sort of governor that kills a response if it is getting too guessy, or at least surfaces some other indication that the generated tokens are likely hallucinated. Maybe there is a way to do it based on the geometry of how the neural net activated for a token, or some other, more statistics-based approach; I'm not an expert.
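One crude, statistics-based version of this idea is to look at the model's own per-token distribution during generation and flag steps where the chosen token got little probability mass or the distribution was high-entropy. Below is a minimal sketch using the Hugging Face transformers generate API; the model name and both thresholds are placeholders, and token-level uncertainty is at best a rough proxy for "this span is hallucinated", not a reliable detector.

```python
# Rough sketch of a "guessiness" check: inspect the model's own per-token
# distribution while it generates and flag steps where it was effectively
# guessing (low probability on the chosen token, high entropy overall).
# Model name and thresholds are placeholders, not recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM with generate() works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "The first person to walk on the moon was"
inputs = tok(prompt, return_tensors="pt")

out = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=True,
    output_scores=True,             # keep the per-step token scores
    return_dict_in_generate=True,
)

# Tokens produced after the prompt; one score tensor per generated step.
gen_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]

PROB_THRESHOLD = 0.1     # arbitrary: chosen token got < 10% of the mass
ENTROPY_THRESHOLD = 4.0  # arbitrary: distribution was very spread out (nats)

for step, (token_id, scores) in enumerate(zip(gen_tokens, out.scores)):
    probs = torch.softmax(scores[0], dim=-1)
    chosen_p = probs[token_id].item()
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum().item()
    guessy = chosen_p < PROB_THRESHOLD or entropy > ENTROPY_THRESHOLD
    tag = "  <-- guessy?" if guessy else ""
    print(f"{step:2d} {tok.decode(int(token_id))!r:>12} "
          f"p={chosen_p:.3f} H={entropy:.2f}{tag}")
```

A governor could then abort or caveat the response once too many steps trip the flag, though a confidently wrong model will sail right past a check like this.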
▲ | photonthug 2 days ago | parent |
A related topic you might want to look into here is nucleus sampling. It's similar in spirit to temperature but works differently; it's been surprising to me that people don't talk about it more often, and that lots of systems don't expose the knobs for it.
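For anyone who hasn't run into it: nucleus (top-p) sampling keeps only the smallest set of tokens whose cumulative probability exceeds p, renormalizes over that set, and samples from it, so the candidate pool shrinks when the model is confident and widens when it isn't. A minimal sketch of just that core step (PyTorch, single distribution, no temperature or other warpers):

```python
# Minimal sketch of nucleus (top-p) sampling over one next-token
# distribution. Real inference stacks fuse this with temperature,
# top-k, repetition penalties, etc.; this is only the core idea.
import torch

def nucleus_sample(logits: torch.Tensor, top_p: float = 0.9) -> int:
    """Sample a token id from the smallest set of tokens whose
    cumulative probability mass exceeds top_p."""
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_ids = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)

    # Keep every token up to and including the first one that pushes
    # the cumulative mass past top_p; drop the rest of the tail.
    cutoff = int(torch.searchsorted(cumulative, top_p).item()) + 1
    kept_probs = sorted_probs[:cutoff]
    kept_ids = sorted_ids[:cutoff]

    # Renormalize over the kept "nucleus" and sample from it.
    kept_probs = kept_probs / kept_probs.sum()
    choice = torch.multinomial(kept_probs, num_samples=1)
    return int(kept_ids[choice])

# Toy usage: a fake 6-token vocabulary.
logits = torch.tensor([2.0, 1.5, 0.5, -1.0, -2.0, -3.0])
print(nucleus_sample(logits, top_p=0.9))
```

Most inference stacks expose this as a top_p parameter alongside temperature, but plenty of hosted APIs and chat UIs don't surface it at all, which I assume is part of why it gets discussed less.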