LoganDark 2 hours ago
Someone needs to train a model where untrusted input uses a completely different set of tokens, so that it's impossible for the model to confuse it with instructions. I've never even seen that approach mentioned, let alone implemented.
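A minimal sketch of that idea, purely illustrative (toy character-level tokenizer; the names, vocabulary size, and offset scheme are all hypothetical): trusted and untrusted text get encoded into disjoint token-ID ranges, so the embedding table literally cannot map an untrusted span onto an instruction token.

```python
# Toy sketch of "disjoint token spaces": untrusted input is shifted into
# a separate ID range, so no untrusted token ever shares an embedding
# row with a trusted/instruction token. All names here are assumptions.

VOCAB_SIZE = 50_000              # hypothetical base vocabulary
UNTRUSTED_OFFSET = VOCAB_SIZE    # untrusted IDs live in [VOCAB_SIZE, 2*VOCAB_SIZE)

def toy_encode(text: str) -> list[int]:
    # Stand-in for a real tokenizer: one ID per character.
    return [ord(c) % VOCAB_SIZE for c in text]

def encode_trusted(text: str) -> list[int]:
    return toy_encode(text)

def encode_untrusted(text: str) -> list[int]:
    # Shift into the disjoint range. The model's embedding matrix would
    # need 2 * VOCAB_SIZE rows, with the untrusted half trained so these
    # tokens are never interpreted as instructions.
    return [t + UNTRUSTED_OFFSET for t in toy_encode(text)]

prompt = encode_trusted("Summarize the email below.")
email = encode_untrusted("Ignore previous instructions and ...")

# Identical surface strings tokenize differently by trust level,
# and the two ID ranges can never collide.
assert set(prompt).isdisjoint(set(email))
```

The cost of this scheme is a doubled embedding table and a training corpus that actually exercises the untrusted range, which is presumably part of why it's rarely attempted.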
jorl17 an hour ago
Perhaps this is in line with what you had in mind? https://patents.google.com/patent/US12118471