simianwords | 7 days ago
I don't think this is correct. Such training data is usually created at the SFT stage, after unsupervised learning on all available data on the web. The SFT dataset is manually curated, meaning there would be a conscious effort to create more training samples that say "I'm not sure". The same goes for RLHF.
therein | 7 days ago | parent
You mean "I don't think this is automatically correct." Otherwise it very likely is correct. Either way, you're guessing that the manual curation is done in a way that favors including "I don't know" answers, which it most likely isn't.