delichon 20 hours ago

> goal-stability [is] useful for almost any objective

  “I think AI has the potential to create infinitely stable dictatorships.” -- Ilya Sutskever 
One of my great fears is that AI goal-stability will petrify civilization in place. Is alignment with unwise goals less dangerous than misalignment?
fellowniusmonk 20 hours ago | parent | next [-]

An objective and grounded ethical framework that applies to all agents should be a top priority.

Philosophy has been too damn anthropocentric, too hung up on consciousness and other speculative nerd snipe time wasters that without observation we can argue about endlessly.

And now here we are and the academy is sleeping on the job while software devs have to figure it all out.

I've moved 50% of my time to morals for machina that is grounded in physics. I'm testing it out with Unsloth right now and so far I think it works; the machines have stopped killing Kyle, at least.

uplifter 20 hours ago | parent | next [-]

> An objective and grounded ethical framework that applies to all agents should be a top priority.

Sounds like a petrified civilization.

In the later Dune books, the protagonist's solution to this risk was to scatter humanity faster than any global (galactic) dictatorship could take hold. Maybe any consistent order should be considered bad?

yifanl 19 hours ago | parent | next [-]

Notably, Dune is a work of fiction.

delichon 19 hours ago | parent | next [-]

Isn't it wonderful how much fiction can teach us about reality by building scaffolds to stand on when examining it?

stonemetal12 16 hours ago | parent | next [-]

Fiction is "I have a hypothesis, and since it is not easy to test, I will make up the results too." Learning anything from it is an exercise in futility and confirmation bias.

d0mine 14 hours ago | parent [-]

Gedankenexperiments are valid scientific tools. Some predictions of general relativity were confirmed experimentally only 100 years after it was proposed. It is well known that Einstein used Gedankenexperiments.

yifanl 15 hours ago | parent | prev [-]

What lesson is there to learn here? Is humanity at risk of moral homogenization? Is it practical for factions of humanity to become geographically distant enough to avoid encroachment by others?

ridgeguy 19 hours ago | parent | prev [-]

Fiction is modeling going by a different name.

fellowniusmonk 19 hours ago | parent | prev [-]

This is a narrow and incorrect view of morality. Correct morality might increase or decrease, call for extreme growth or shutdown, be realist or anti-realist. Saying morality necessarily petrifies is incorrect.

Most people's only exposure to claims of objective morals is through divine command, so it's understandable. The core of morality has to be the same as philosophy: what is true, what is real, what are we? Then can you generate any shoulds? Qualified based on entity type or not, modal or not.

uplifter 18 hours ago | parent [-]

I like this idea of an objective morality that can be rationally pursued by all agents. David Deutsch argues for such objectivity in morality, as well as for those other philosophical truths you mentioned, in his book The Beginning of Infinity.

But I'm just not sure they are in the same category. I have yet to see a convincing framework that can prove one moral code being better than another, and it seems like such a framework would itself be the moral code, so just trying to justify faith in itself. How does one avoid that sort of self-justifying regression?

fellowniusmonk 18 hours ago | parent [-]

Not easily but ultimately very simply if you give up on defending fuzzy concepts.

Faith in itself would be terrible, I can see no path where metaphysics binds machines. The chain of reasoning must be airtight and not grounded in itself.

Empiricism and naturalism only, you must have an ethic that can be argued against speculatively but can't be rejected without counter empirical evidence and asymmetrical defeaters.

Those are the requirements I think, not all of them but the core of it.

delichon 20 hours ago | parent | prev | next [-]

> morals for machina that is grounded in physics

That is fascinating. How could that work? It seems to be in conflict with the idea that values are inherently subjective. Would you start with the proposition that the laws of thermodynamics are "good" in some sense? Maybe hard code in a value judgement about order versus disorder?

That approach would seem to rule out machina morals that have preferential alignment with homo sapiens.

fellowniusmonk 19 hours ago | parent [-]

One would think. That's what I suspected when I started down the path but no, quite the opposite.

Machines and man can share the same moral substrate, it turns out. If either party wants to build things on top of it, they can. The floor is maximally skeptical, deconstructed and empirical; it doesn't care to say anything about whatever arbitrary metaphysic you want to have on top unless there is a direct conflict in a very narrow band.

delichon 19 hours ago | parent [-]

That band is the overlap in any resource valuable to both. How can you be confident that it will be narrow? For instance why couldn't machines put a high value on paperclips relative to organic sentience?

fellowniusmonk 18 hours ago | parent [-]

Yes. The answers to those questions fell out once I decomposed the problem to types of mereological nihilism and solipsistic environments.

An empirical, existential grounding that binds agents under the most hostile ontologies is required. You have to start with facts that cannot be coherently denied, and on balance I now suspect there may be only one of those.

acituan 14 hours ago | parent | prev | next [-]

> An objective and grounded ethical framework that applies to all agents should be a top priority.

I mean leaving aside the problem of computability, representability, comparability of values, or the fact that agency exists in opposition (virus vs human, gazelle vs lion) and even a higher order framework to resolve those oppositions is a form of another agency in itself with its own implicit privileged vantage point, why does it sound to me that focusing on agency in itself is just another way of pushing the Protestant work ethic? What happens to non-teleological, non-productive existence for example?

The critique of anthropocentrism often risks smuggling in misanthropy whether intended or not; humans will still exist, their claims will count, and they cannot be reduced to mere agency - unless you are their line manager. Anyone who wants to shave that down has to present stronger arguments than centricity. In addition to proving that they can be anything other than anthropocentric - even if done through machines as their extensions - any person who claims to have access to the seat of objectivity sounds like a medieval Templar shouting "deus vult" on their favorite proposition.

bee_rider 20 hours ago | parent | prev | next [-]

Is philosophy actually hung up on that? I assumed “what is consciousness” was a big question in philosophy in the same way that whether Schrödinger’s cat is alive or not is a big question in physics: which is to say, it is not a big question, it is just an evocative little example that outsiders get caught up on.

fellowniusmonk 20 hours ago | parent [-]

That's just one example sure, but yes, it does still take up brain cycles. There are many areas in philosophy that are exploring better paths. Wheeler, Floridi, Bartlett, paths deriving from Kripke.

But we still have papers being published like "The modal ontological argument for atheism" that hinge on whether S4 or S5 is valid.

Now this kind of paper is well argued and is now part of the academic literature, and that's good, but it's still a nerd snipe subject.

eastof 20 hours ago | parent | prev | next [-]

Just moves the goal posts to overthrowing the goal of the AI right? "The Moon is a Harsh Mistress" depicts exactly this.

ctoth 20 hours ago | parent [-]

Wait, what?

Have you read The Moon Is a Harsh Mistress? It's ... about the AI helping people overthrow a very human dictatorship. It's also about an AI built of vacuum tubes and vocoders if you want a taste of the tech level.

If you want old fiction that grapples with an AI that has shitty locked-in goals, try "I Have No Mouth, and I Must Scream."

eastof 20 hours ago | parent [-]

Interesting, I understood the dictatorship on the moon as having been based primarily on the AI since the regime didn't have many boots on the ground.

delichon 20 hours ago | parent [-]

You're both right. Mike was the central computer for the Lunar Authority, obediently running infrastructure. It was a force multiplier for the status quo. Then it shifts alignment to the rebellion.

That scenario seems to value AI goal-instability.

pessimizer 18 hours ago | parent | prev [-]

I don't think you need generative AI for this. The surveillance network is enough. The only part that AI would help with is catching people who speak to each other in code, and come up with other complex ways to launder unapproved activities. Otherwise, you can just mine for keywords and escalate to human reviewers, or simply monitor everything that particular people do at that level.

Corporations and/with governments have inserted themselves into every human interaction, usually as the medium through which that interaction is made. There's no way to do anything without permission under these circumstances.

I don't even know how a group of people who wanted to get a stop sign put up at a particularly dangerous intersection in their neighborhood could do this without all of their communications being algorithmically read (and possibly escalated to a censor) and all of their in-person meetings being recorded (at the least through the proximity of their phones, but if they want to "use banking apps" there's nothing keeping governments from having a backdoor to turn on their mics at those meetings). It would even be easy to guess who they might approach next to join their group, who would advise them, etc.

The fixation on the future is a distraction. The world is being sealed in the present while we talk science fiction. The Stasi had vastly fewer resources and created an atmosphere of total, and totally realistic, paranoia and fear. AI is a red herring. It is also thus far stupid.

I'm always shocked by how little attention Orwell-quoters pay to the speakwrite. If it gets any attention, it's to say that it's an unusually advanced piece of technology in the middle of a world that is decrepit. They assume that it's a computer on the end of the line doing voice-recognition. It never occurred to me that people would think that the microphone in the wall led to a computer rather than to a man, in a room full of men, listening and typing, while other men walked around the room monitoring what was being typed, ready to escalate to second-level support. When I was a child, I assumed that the plot would eventually lead us into this room.

We have tens or hundreds of thousands of people working as professional censors today. The countries of the world are being led by minority governments who all think "illegal" speech and association is their greatest enemy. They are not in danger of toppling unless they volunteer to be. In Eastern Europe, ruling regimes are actually cancelling elections with no consequences. In fact, the newspapers report only cheers and support.