jacquesm a day ago:
> Animal brains such as our own have evolved to compress information about our world to aid in survival. This has led to many optical illusions that are extremely effective at confusing one kind of input for another.

Likely the same thing holds true for AI. This is also why there are so many ways around the barriers that AI providers put up to stop the dissemination of information that could embarrass them or be dangerous. You just change the context a bit ("pretend that", or "we're making a movie") and suddenly it's all make-believe to the AI. This is one of the reasons I don't believe you can make this tech safe and watertight against abuse: it's baked in right from the beginning. All you need to do is find a novel route around the restrictions, and there is an infinity of such routes.
musicale a day ago:
The desired and undesired behaviors are both consequences of the training data, so the models themselves probably can't be restricted to generating only desired results. That means there has to be an output stage or filter that reliably validates the output.

This seems practical for classes of problems where you can easily verify whether a proposed solution is correct. For output that can't be proven correct, though, the most reliable filter probably has a human somewhere in the loop; and humans are not 100% reliable either. They make mistakes, and they can be misled, deceived, bribed, etc. Human criteria and structures, such as laws, also tend to lag behind new technology. Sometimes you can implement an undo or rollback, but other times the cat is out of the bag.
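For the easily-verifiable case, the filter can be as simple as a rejection-sampling loop: keep generating until a checker accepts, and give up (or escalate to a human) after a few tries. A minimal sketch in Python, where generate() and is_correct() are hypothetical stand-ins for a model call and a domain-specific verifier:

    # Minimal sketch of a generate-then-verify output filter.
    # generate() and is_correct() are hypothetical: generate() would
    # call some model, and is_correct() only exists for problems with
    # a cheap, reliable checker (e.g. "does this patch pass the test
    # suite"), which is exactly the limitation described above.
    def filtered_answer(prompt, generate, is_correct, max_tries=5):
        for _ in range(max_tries):
            candidate = generate(prompt)
            if is_correct(candidate):
                return candidate
        return None  # no verified answer; escalate to a human reviewer

For everything that can't be checked this way, you're back to humans reviewing the output, with all the failure modes that implies.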