musicale a day ago
Desired and undesired behaviors are both consequences of the training data, so the models themselves probably can't be restricted to generating only desired results. That means there must be an output stage or filter that reliably validates the output. This seems practical for classes of problems where you can easily verify whether a proposed solution is correct. For output that can't be proven correct, though, the most reliable filter probably has a human somewhere in the loop; but humans aren't 100% reliable either. They make mistakes, and they can be misled, deceived, bribed, etc. Human criteria and structures, such as laws, also tend to lag behind new technological developments. Sometimes you can implement an undo or rollback feature, but other times the cat is already out of the bag.
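To make the "verify the output, don't trust the generator" idea concrete, here is a minimal sketch of a generate-then-verify filter in Python. The generate_candidate() function is a hypothetical stand-in for a model call, and the sorting task is just an example of a problem where checking a proposed solution is cheap and reliable; the point is that acceptance depends only on the verifier.

```python
# Minimal sketch of a generate-then-verify output filter.
# generate_candidate() is a hypothetical stand-in for a model call;
# only candidates the verifier accepts are ever returned.

import random
from typing import Callable, Optional


def generate_candidate(task: list[int]) -> list[int]:
    # Hypothetical "model": proposes a (possibly wrong) ordering of the input.
    candidate = task[:]
    random.shuffle(candidate)
    return candidate


def is_valid_sort(original: list[int], proposed: list[int]) -> bool:
    # Cheap, reliable check: same elements, in non-decreasing order.
    return sorted(original) == proposed


def filtered_output(task: list[int],
                    verify: Callable[[list[int], list[int]], bool],
                    max_attempts: int = 100) -> Optional[list[int]]:
    # Return a candidate only if the verifier accepts it; otherwise give up,
    # which is where a human in the loop (or an undo path) has to take over.
    for _ in range(max_attempts):
        candidate = generate_candidate(task)
        if verify(task, candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(filtered_output([3, 1, 2], is_valid_sort))
```

For problems without such a checkable criterion (prose, advice, judgment calls), there is no is_valid_sort() equivalent, which is exactly where the human-in-the-loop fallback and its limitations come in.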