mysterydip a day ago
But it's not self-preservation. If it had instead been trained on a data set full of fiction where the same scenario occurred but the protagonist said "oh well, guess I deserve it," then that's what the LLM would autocomplete.
coke12 a day ago
How could you possibly know what an LLM would do in that situation? The whole point is that they exhibit occasionally surprising emergent behaviors, which is why people are testing them like this in the first place.