andai 12 hours ago

Try asking the latest Claude models about self-replicating software and see what happens...

(GPT recently changed its attitude on this subject too, which is very interesting.)

The most interesting part is that you will be given the option to downgrade the conversation to an older model, which implies that there was a step change in capability on this front in recent months.

snowmobile 6 hours ago | parent [-]

I suppose that returns some guardrail text about how it's not allowed to talk about it? Meanwhile we see examples of it accidentally deleting files, writing insecure code and whatnot. I'm more worried about a supposedly "well-meaning" model doing something bad simply because it has no real way to judge the morality of its actions. Playing whack-a-mole with the flavor-of-the-day "unsafe" text string will not change that.