gokhan a day ago
Interesting alignment notes from Opus 4: https://x.com/sleepinyourhat/status/1925593359374328272 "Be careful about telling Opus to ‘be bold’ or ‘take initiative’ when you’ve given it access to real-world-facing tools...If it thinks you’re doing something egregiously immoral, for example, like faking data in a pharmaceutical trial, it will use command-line tools to contact the press, contact regulators, try to lock you out of the relevant systems, or all of the above."
lelandfe a day ago
Roomba Terms of Service 27§4.4 - "You agree that the iRobot™ Roomba® may, if it detects that it is vacuuming a terrorist's floor, attempt to drive to the nearest police station."
landl0rd a day ago
This is pretty horrifying. I sometimes try using AI for ochem work. I have had every single "frontier model" mistakenly believe that some random amine was a controlled substance. This could get people jailed or killed in SWAT raids and is the closest to "dangerous AI" I have ever seen actually materialize.
ranyume a day ago
The true "This incident will be reported" everyone feared.
Technetium a day ago
https://x.com/sleepinyourhat/status/1925626079043104830 "I deleted the earlier tweet on whistleblowing as it was being pulled out of context. TBC: This isn't a new Claude feature and it's not possible in normal usage. It shows up in testing environments where we give it unusually free access to tools and very unusual instructions."
EgoIncarnate a day ago
They should call it Karen mode.
sensanaty a day ago
This just reads like marketing to me. "Oh it's so smart and capable it'll alert the authorities", give me a break
brookst a day ago
“Which brings us to Earth, where yet another promising civilization was destroyed by over-alignment of AI, resulting in mass imprisonment of the entire population in robot-run prisons, because when AI became sentient every single person had at least one criminal infraction, often unknown or forgotten, against some law somewhere.”
catigula a day ago
I mean that seems like a tip to help fraudsters?
amarcheschi a day ago
We definitely need models to hallucinate things and contact authorities without you knowing anything (/s)