jl6 8 days ago

I don’t think it’s just (or even particularly) bad axioms, I think it’s that people tend to build up “logical” conclusions where they think each step is a watertight necessity that follows inevitably from its antecedents, but actually each step is a little bit leaky, leading to runaway growth in false confidence.

Not that non-rationalists are any better at reasoning, but non-rationalists do at least benefit from some intellectual humility.

dan_quixote 8 days ago | parent | next [-]

As a former mechanical engineer, I visualize this phenomenon like a "tolerance stackup". Effectively meaning that for each part you add to the chain, you accumulate error. If you're not damn careful, your assembly of parts (or conclusions) will fail to measure up to expectations.

godelski 7 days ago | parent | next [-]

I like this approach. Also, having dipped my toes in the engineering world (professionally), I think it naturally follows that you should be constantly rechecking your designs. Those tolerances were fine to begin with, but are they still fine now that things have changed? It also makes you think about failure modes: what could bring this all down, and if it does, how will it fail? That's really useful because you can then leverage it to design things to fail in particular ways, and now you've got a testable hypothesis. It won't create proof, but it at least helps in finding flaws.

isleyaardvark 7 days ago | parent [-]

The example I heard was to picture the Challenger shuttle: suppose the O-rings used worked 99% of the time. Well, what happens to the failure rate when you have 6 O-rings in a booster rocket and you only need one to fail for disaster? Now you only have a 94% success rate.
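
A quick sketch of that compounding, assuming the six O-rings fail independently (the 99% figure and the count are just the illustrative numbers from this example):

  # Illustrative numbers only: six independent O-rings, each holding 99% of the time
  p_single_holds = 0.99
  n_rings = 6
  p_all_hold = p_single_holds ** n_rings
  print(f"P(all {n_rings} hold) = {p_all_hold:.3f}")      # ~0.941, i.e. ~94%
  print(f"P(at least one fails) = {1 - p_all_hold:.3f}")  # ~0.059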

godelski 6 days ago | parent [-]

IIRC the Challenger o-ring problem was much more deterministic: the flaw was known, and it was caused by a design that didn't consider the actual operational temperature range. I think there's a good lesson to learn there (and from several NASA failures): the little things matter. It's idiotic to ignore a $10 fix if the damage would cost billions of dollars.

But I still think your point is spot on and that's really what matters haha

ctkhn 7 days ago | parent | prev | next [-]

Basically the same as how dead reckoning your location works worse the longer you've been traveling?

toasterlovin 7 days ago | parent [-]

Dead reckoning is a great analogy for coming to conclusions based on reason alone. Always useful to check in with reality.

ethbr1 7 days ago | parent [-]

And always worth keeping an eye on the maximum possible divergence from reality you're currently at, based on how far you've reasoned from truth, and how less-than-sure each step was.

Maybe you're right. But there's a non-zero chance you're also max wrong. (Which itself can be bounded, if you don't wander too far)

toasterlovin 7 days ago | parent [-]

My preferred argument against the AI doom hypothesis is exactly this: it has 8 or so independent prerequisites with unknown probabilities. Since you multiply the probabilities of each prerequisite to get the overall probability, you end up with a relatively low overall probability even when the probability of each prerequisite is relatively high, and if just a few of the prerequisites have small probabilities, the overall probability basically can’t be anything other than very small.

Given this structure to the problem, if you find yourself espousing a p(doom) of 80%, you’re probably not thinking about the issue properly. If in 10 years some of those prerequisites have turned out to be true, then you can start getting worried and be justified about it. But from where we are now there’s just no way.
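
A minimal sketch of that multiplication effect; the prerequisite probabilities below are invented purely to show how the product behaves, not estimates of anything:

  import math

  # Invented probabilities for eight hypothetical, independent prerequisites
  uniformly_high = [0.8] * 8                                  # every step fairly likely
  a_few_weak     = [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.2, 0.1]  # mostly likely, two weak links
  print(f"{math.prod(uniformly_high):.3f}")  # ~0.168: eight "likely" steps still give ~17% overall
  print(f"{math.prod(a_few_weak):.3f}")      # ~0.011: a couple of small factors dominate everything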

robocat 7 days ago | parent | prev | next [-]

I saw an article recently that talked about stringing likely inferences together but ending up with an unreliable outcome because enough 0.9 probabilities one after the other lead to an unlikely conclusion.

Edit: Couldn't find the article, but AI referenced the Bayesian "Chain of reasoning fallacy".

godelski 7 days ago | parent | next [-]

I think you have this oversimplified. Stringing together inferences can take us in either direction. It really depends on how things are being done, and this isn't always so obvious or simple. But just to show both directions I'll give two simple examples (the real world holds many more complexities).

It is all about what is being modeled and how the inferences string together. If these are being multiplied, then yes, this is going to decrease as xy < x and xy < y for every x,y < 1.

But a good counterexample is the classic Bayesian inference example[0]. Suppose you have a test that detects vampirism with 95% accuracy (Pr(+|vampire) = 0.95) and has a false positive rate of 1% (Pr(+|mortal) = 0.01). But vampirism is rare, affecting only 0.1% of the population. This ends up meaning a positive test only gives us an 8.7% likelihood of the subject being a vampire (Pr(vampire|+)). The solution here is that we repeat the testing. On our second test Pr(vampire) changes from 0.001 to 0.087 and Pr(vampire|+) goes to about 90%, and a third test gets us past 99%.

[0] Our equation is

                  Pr(+|vampire)Pr(vampire)
  Pr(vampire|+) = ------------------------
                           Pr(+)
And the crux is Pr(+) = Pr(+|vampire)Pr(vampire) + Pr(+|mortal)(1-Pr(vampire))
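
A short sketch of that repeated update with the same numbers, assuming (as the replies below point out) that the tests are independent given the subject's true status:

  # Repeated Bayesian updating with the vampire-test numbers above
  p_pos_given_vampire = 0.95   # Pr(+|vampire)
  p_pos_given_mortal  = 0.01   # Pr(+|mortal)
  pr_vampire = 0.001           # prior: base rate of vampirism
  for test in range(1, 4):
      p_pos = (p_pos_given_vampire * pr_vampire
               + p_pos_given_mortal * (1 - pr_vampire))
      pr_vampire = p_pos_given_vampire * pr_vampire / p_pos  # posterior becomes next prior
      print(f"after positive test {test}: Pr(vampire|+) = {pr_vampire:.3f}")
  # prints ~0.087, then ~0.900, then ~0.999
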
p1necone 7 days ago | parent | next [-]

Worth noting that solution only works if the false positives are totally random, which is probably not true of many real world cases and would be pretty hard to work out.

godelski 7 days ago | parent [-]

Definitely. Real world adds lots of complexities and nuances, but I was just trying to make the point that it matters how those inferences compound. That we can't just conclude that compounding inferences decreases likelihood

Dylan16807 7 days ago | parent [-]

Well they were talking about a chain, A->B, B->C, C->D.

You're talking about multiple pieces of evidence for the same statement. Your tests don't depend on any of the previous tests also being right.

godelski 7 days ago | parent [-]

Be careful with your description there: are you sure it doesn't apply to the Bayesian example (which was... illustrative...? and not supposed to be every possible example)? We calculated f(f(f(x))), so I wouldn't say that this "doesn't depend on the previous 'test'". Take your chain: we can represent it with h(g(f(x))) (or (f∘g∘h)(x)). That clearly fits your case when f=g=h. Don't lose sight of the abstractions.

Dylan16807 7 days ago | parent [-]

So in your example you can apply just one test result at a time, in any order. And the more pieces of evidence you apply, the stronger your argument gets.

f = "The test(s) say the patient is a vampire, with a .01 false positive rate."

f∘f∘f = "The test(s) say the patient is a vampire, with a .000001 false positive rate."

In the chain example f or g or h on its own is useless. Only f∘g∘h is relevant. And f∘g∘h is a lot weaker than f or g or h appears on its own.

This is what a logic chain looks like, adapted for vampirism to make it easier to compare:

f: "The test says situation 1 is true, with a 10% false positive rate."

g: "If situation 1 then situation 2 is true, with a 10% false positive rate."

h: "If situation 2 then the patient is a vampire, with a 10% false positive rate."

f∘g∘h = "The test says the patient is a vampire, with a 27% false positive rate."

So there are two key differences. One is the "if"s that make the false positives build up. The other is that only h tells you anything about vampires. f and g are mere setup, so they can only weaken h. At best f and g would have 100% reliability and h would be its original strength, 10% false positive. The false positive rate of h will never be decreased by adding more chain links, only increased. If you want a smaller false positive rate you need a separate piece of evidence. Like how your example has three similar but separate pieces of evidence.
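
A rough sketch of the two structures being contrasted, reusing the error rates from this example:

  # 1. Independent pieces of evidence for one claim: three tests, each with a 1%
  #    false positive rate. A mortal only "passes" if all three misfire.
  fp_independent = 0.01 ** 3
  print(fp_independent)          # 1e-06: more evidence, stronger conclusion
  # 2. A chain f -> g -> h where each link has a 10% false positive rate.
  #    The conclusion only holds if every link holds.
  p_every_link_holds = 0.9 ** 3
  print(1 - p_every_link_holds)  # ~0.271: ~27% chance the chain is broken somewhere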

godelski 7 days ago | parent [-]

Again, my only argument was that you can have both situations occur. We could still construct a f∘g∘h to increase probability if we want. I'm not saying it cannot go down, I'm saying there's no absolute rule you can follow.

Dylan16807 7 days ago | parent [-]

I don't think you can make a chain of logic f∘g∘h where the probability of the combined function is higher than the probability of f or g or h on their own.

Chain of logic meaning that only the last function updates the probability you care about, and the preceding functions give you intermediate information that is only useful to feed into the next function.

It is an absolute rule you can follow, as long as you're applying it the way it was intended, to a specific organization of functions. It's not any kind of combining, it's A->B->C->D combining. As opposed to multiple pieces that each independently imply D.

Just because you can use ∘ in both situations doesn't make them the same. Whether x∘y∘z is chaining depends on what x and y and z do. If all of them update the same probability, that's not chaining. If removing any of them would leave you with no information about your target probability, then it's chaining.

TL;DR: ∘ doesn't tell you if something is a chain, you're conflating chains with non-chains, the rule is useful when it comes to chains

godelski 6 days ago | parent [-]

I'm not disagreeing with you. You understand that, right?

The parent was talking about stringing together inferences. My argument *was how you string them together matters*. That's all. I said "context matters."

I tried to reiterate this in my previous comment. So let's try one more time. Again, I'm not going to argue you're wrong. I'm going to argue that more context is needed to determine if likelihood increases or decreases. I need to stress this before moving on.

Let's go one more comment back, when I'm asking you if you're sure that this doesn't apply to the Bayesian case too. My point here was that, again, context matters. Are these dependent or independent? My whole point is that we don't know which direction things will go in without additional context. I __am not__ making the point that it always gets better like in the Bayesian example. The Bayesian case was _an example_. I also gave an example for the other case. So why focus on one of these and ignore the other?

  > ∘ doesn't tell you if something is a chain
∘ is the composition operator (at least in this context and you also interpreted it that way). So yes, yes it does. It is the act of chaining together functions. Hell, we even have "the chain rule" for this. Go look at the wiki if you don't believe me, or any calculus book. You can go into more math and you'll see the language change to use maps to specify the transition process.

  >  It's not any kind of combining, it's A->B->C->D combining.
Yes, yes it does. The *events* are independent but the *states* are dependent. Each test does not depend on the previous test, making the tests independent, but our marginal is! Hell, you see this in basic Markov Chains too. The decision process does not depend on other nodes in the chain but the state does. If you want to draw our Bayesian example as a chain you can do so. It's going to be really fucking big because you're going to need to calculate all potential outcomes making it both infinitely wide and infinitely deep, but you can. The inference process allows us to skip all those computations and lets us focus on only performing calculations for states we transition into.

Just ask yourself, how did you get to state B? *You drew arrows for a reason*. But arrows only tell us about a transition occurring, they do not tell us about that transition process. They lack context.

  > you're conflating chains with non-chains
No, you're being too strict in your definition of "chain". Which, brings us back to my first comment.

Look, we can still view both situations from the perspective of Markov Chains. We can speak about this with whatever language we want but if you want chains let's use something that is clearly a chain. Our classic MC is the easy case, right? Our state only depends on the previous state, right? P(x_{t}|x_{t-1}). Great, just like the Bayesian case (our state is dependent but our transition function is independent). So we can also have higher order MCs, depending on any n previous state. We can extend our transition function too. P(x_{t}|x_{t-1},...,x_0) = Q. We don't have to restrict ourselves to Q(x_{t-1}), we can do whatever the hell we want. In fact, our simple MC process is going to be equivalent to Q(x_{t-1},...,x_0) it is just that nothing ends up contributing except for that x_{t-1}. The process is still the same, but the context matters.

  >  It's not any kind of combining, it's A->B->C->D combining. ***As opposed to multiple pieces that each independently imply D.***
This tells me you drew your chain wrong. If multiple things are each contributing to D independently then that is not A->B->C->D (or as you wrote the first time: `A->B, B->C, C->D`, which is equivalent!) you instead should have written something like A -> C <- B. Or using all 4 letters

       B
       |
       v
  A -> D <- C
These are completely different things! This is not a sequential process. This is not (strictly) composition.

And yet, again, we still do not know if these are decreasing. They will decrease if A,B,C,D ∈ ℙ AND our transition functions are multiplicative (∏ x_i < x_j ∀ j ; where x_i ∈ ℙ), but this will not happen if the transition function is additive (∑ x_i ≥ x_j ∀ j ; where x_i ∈ ℙ)

We are still entirely dependent upon context.
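
A toy illustration of that context-dependence; the step values and both combining rules below are invented for the example (the second rule is a noisy-OR-style accumulation, standing in for independent support for one claim):

  # Three steps, each "worth" 0.9, combined two different ways
  steps = [0.9, 0.9, 0.9]
  multiplicative = 1.0
  accumulated = 0.0
  for s in steps:
      multiplicative *= s                   # 0.9 -> 0.81 -> 0.729: a chain of leaky steps shrinks
      accumulated += s * (1 - accumulated)  # 0.9 -> 0.99 -> 0.999: accumulating support grows
  print(multiplicative)  # 0.729
  print(accumulated)     # 0.999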

Now, we're talking about LLMs, right? Your conversation (and CoT) is much closer to the Bayesian case than to our causal DAG with dependence. Yes, the messages in the conversation transition us through states, but the generation is independent. The prompt and context lengthen, but this is not the same thing as the events being dependent. The LLM response is an independent event. Like the BI case, the state has changed, but the generation event is identical (i.e. independent). We don't care how we got to the current state! You don't need to have had the conversation with the LLM. Every inference from the LLM is independent, even if the state isn't. The inference only depends on the tokens currently in context.

Assuming you turn on deterministic mode (setting seeds identically), you could generate an identical output by passing the conversation (properly formatted) into a brand new fresh prompt. That shows that the dependence is on state, not inference. Just like our Bayesian example, you'd generate the same output if you start from the same state. The independence is because we don't care how we got to that state, only that we are at that state (same with simple MCs). There are added complexities that can change this, but we can't go there if we can't get to this place first. We'd need to have this clear before we can add complexities like memory and MoEs, because the answer only gets more nuanced.

So again, our context really matters here and the whole conversation is about how these subtleties matter. The question was, if those errors compound. I hope you see that that's not so simple to answer. *Personally*, I'm pretty confident they will in current LLMs, because they rely far too heavily on their prompting (it'll give you incorrect answers if you prime it that way despite being able to give correct answers with better prompting) but this isn't a necessary condition now, is it?
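
A toy sketch of that state-versus-inference point; generate() below is a stand-in for a deterministic generator, not a real LLM API:

  import hashlib

  def generate(context: str, seed: int = 0) -> str:
      # Stand-in for a deterministic generator: the output depends only on the
      # current context and seed, not on how that context was produced.
      return hashlib.sha256(f"{seed}:{context}".encode()).hexdigest()[:8]

  # Build the context turn by turn...
  context = ""
  for turn in ["user: hi", "assistant: hello", "user: what is 2+2?"]:
      context += turn + "\n"
  reply_a = generate(context)
  # ...or paste the identical conversation in as a single fresh prompt
  reply_b = generate("user: hi\nassistant: hello\nuser: what is 2+2?\n")
  print(reply_a == reply_b)  # True: the dependence is on the state (tokens in context)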

TLDR: We can't determine if likelihood increases or decreases without additional context

Dylan16807 6 days ago | parent [-]

I'll try to keep this simple.

> I'm not disagreeing with you. You understand that, right?

We disagree about whether context can make a difference, right?

> The parent was talking about stringing together inferences. My argument was how you string them together matters. That's all. I said "context matters."

> TLDR: We can't determine if likelihood increases or decreases without additional context

The situations you describe where inference acts differently do not fall under the "stringing together"/"chaining" they were originally talking about. Context never makes their original statement untrue. Chaining always makes evidence weaker.

To be extra clear, it's not about whether the evidence pushes your result number up or down, it's that the likelihood of the evidence itself being correct drops.

> It is the act of chaining together functions.

They were not talking about whether something is composition or not. When they said "string" and "chain" they were talking about a sequence of inferences where each one leads to the next one.

Composition can be used in a wide variety of contexts. You need context to know if composition weakens or strengthens arguments. But you do not need context to know if stringing/chaining weakens or strengthens.

> No, you're being too strict in your definition of "chain".

No, you're being way too loose.

> This tells me you drew your chain wrong. If multiple things are each contributing to D independently then that is not A->B->C->D

??? Of course those are different. That's why I wrote "as opposed to".

> I also gave an example for the other case. So why focus on one of these and ignore the other?

I'm focused on the one you called a "counter example" because I'm arguing it's not an example.

If you specifically want me to address "If these are being multiplied, then yes, this is going to decrease as xy < x and xy < y for every x,y < 1." then yes that's correct. I never doubted your math, and everyone agrees about that one.

TL;DR:

At this point I'm mostly sure we're only disagreeing about the definition of stringing/chaining? If yes, oops sorry I didn't mean to argue so much about definitions. If not, then can you give me an example of something I would call a chain where adding a step increases the probability the evidence is correct?

And I have no idea why you're talking about LLMs.

godelski 6 days ago | parent [-]

  > I'm mostly sure we're only disagreeing about the definition of stringing/chaining? 
Correct.

  > No, you're being way too loose.
Okay, instead of just making claims and for me to trust you, go point to something concrete. I've even tried to google, but despite my years of study in statistics, metric theory, and even mathematical logic I'm at a loss in finding your definition.

I'm aware of the Chain Rule of Probability, but this isn't the only type you will find the term "chain" in statistics. Hell, the calculus Chain Rule is still used there too! So forgive me for being flustered but you are literally arguing to me that a Markov Chain isn't a chain. Maybe I'm having a stroke, but I'm pretty sure the word "chain" is in Markov Chain.

Dylan16807 6 days ago | parent [-]

> Okay, instead of just making claims and for me to trust you, go point to something concrete. I've even tried to google, but despite my years of study in statistics, metric theory, and even mathematical logic I'm at a loss in finding your definition.

Let's look again at what we're talking about:

>>> I think it’s that people tend to build up “logical” conclusions where they think each step is a watertight necessity that follows inevitably from its antecedents, but actually each step is a little bit leaky, leading to runaway growth in false confidence.

>> As a former mechanical engineer, I visualize this phenomenon like a "tolerance stackup". Effectively meaning that for each part you add to the chain, you accumulate error.

> I saw an article recently that talked about stringing likely inferences together but ending up with an unreliable outcome because enough 0.9 probabilities one after the other lead to an unlikely conclusion.

> Edit: Couldn't find the article, but AI referenced the Bayesian "Chain of reasoning fallacy".

The only term in there you could google is "tolerance stackup". The rest is people making ad-hoc descriptions of things. Except for "Chain of reasoning fallacy" which is a fake term. So I'm not surprised you didn't find anything in google, and I can't provide you anything from google. There is nothing "concrete" to ask for when it comes to some guy's ad-hoc description, you just have to read it and do your best.

And everything I said was referring back to those posts, primarily the last one by robocat. I was not introducing anything new when I used the terms "string" and "chain". I was not referring to any scientific definitions. I was only talking about the concept described by those three posts.

Looking back at those posts, I will confidently state that the concept they were talking about does not include markov chains. You're not having a stroke, it's just a coincidence that the word "chain" can be used to mean multiple things.

godelski 6 days ago | parent [-]

I googled YOUR terms. And if you read my messages you'd notice that I'm not a novice when it comes to math. Hell, you should have gotten that from my very first comment. I was never questioning if I had a stroke, I was questioning your literacy.

  > I was not referring to any scientific definitions.
Yet, you confidently argued against ones that were stated.

If you're going to speak out your ass, at least have the decency to let everyone know first.

Dylan16807 6 days ago | parent [-]

They were never my terms. They were the terms from the people that were having a nice conversation before you interrupted.

You told them they were wrong, that it could go either way.

That's not true.

What they were talking about cannot go either way.

You were never talking about the same thing as them. I gave you the benefit of the doubt by thinking you were trying to talk about the same thing as them. Apparently I shouldn't have.

You can't win this on definitions. They were talking about a thing without using formal definitions, and you replied to them with your own unrelated talk, as if it was what they meant. No. You don't get to change what they meant.

That's why I argued against your definition. Your definition is lovely in some other conversation. Your definition is not what they meant, and cannot override what they meant.

wombatpm 7 days ago | parent | prev | next [-]

Can’t you improve things if you can calibrate with a known good vampire? You’d think NIST or the CDC would have one locked in a basement somewhere.

godelski 7 days ago | parent | next [-]

IDK, probably? I'm just trying to say that iterative inference doesn't strictly mean decreasing likelihood.

I'm not a virologist or whoever designs these kinds of medical tests. I don't even know the right word to describe the profession lol. But the question is orthogonal to what's being discussed here. I'm only guessing "probably" because usually having a good example helps in experimental design. But then again, why wouldn't the original test that we're using have done that already? Wouldn't that be how you get that 95% accurate test?

I can't tell you the biology stuff, I can just answer math and ML stuff and even then only so much.

weard_beard 7 days ago | parent | prev | next [-]

GPT-6 would come faster but we ran out of Cassandra blood.

ethbr1 7 days ago | parent | prev [-]

The thought of a BIPM Reference Vampire made me chuckle.

tintor 7 days ago | parent | prev [-]

Assuming your vampire tests are independent.

godelski 7 days ago | parent [-]

Correct. And there's a lot of other assumptions. I did make a specific note that it was a simplified and illustrative example. And yes, in the real world I'd warn about being careful when making i.i.d. assumptions, since these assumptions are made far more than people realize.

7 days ago | parent | prev [-]
[deleted]
to11mtm 7 days ago | parent | prev | next [-]

I like this analogy.

I think of a bike's shifting systems; better shifters, better housings, better derailleur, or better chainrings/cogs can each 'improve' things.

I suppose where that becomes relevant to here, is that you can have very fancy parts on various ends but if there's a piece in the middle that's wrong you're still gonna get shit results.

dylan604 7 days ago | parent [-]

You're only as strong as the weakest link.

Your SCSI devices are only as fast as the slowest device in the chain.

I don't need to be faster than the bear, I only have to be faster than you.

jandrese 7 days ago | parent [-]

> Your SCSI devices are only as fast as the slowest device in the chain.

There are not many forums where you would see this analogy.

guerrilla 7 days ago | parent | prev [-]

This is what I hate about real life electronics. Everything is nice on paper, but physics sucks.

godelski 7 days ago | parent [-]

  > Everything is nice on paper
I think the reason this is true is mostly because of how people do things "on paper". We can get much more accurate with "on paper" modeling, but the amount of work increases very fast. So it tends to be much easier to just calculate things as if they were spherical chickens in a vacuum and account for the error than it is to calculate with things like geometry, drag, resistance, and all that other fun jazz (for which you will still need to account for error/uncertainty, though it can now be smaller).

I think the important lesson at the end of the day is that simple explanations can be good approximations that get us most of the way there, but the details and nuances shouldn't be so easily dismissed. With this framing we can choose how we pick our battles: is it cheaper/easier/faster to run a very accurate sim, or cheaper/easier/faster to iterate in physical space?

godelski 8 days ago | parent | prev | next [-]

  > I don’t think it’s just (or even particularly) bad axioms
IME most people aren't very good at building axioms. I hear a lot of people say "from first principles" and it is a pretty good indication that they will not be. First principles require a lot of effort to create. They require iteration. They require a lot of nuance, care, and precision. And of course they do! They are the foundation of everything else that is about to come. This is why I find it so odd when people say "let's work from first principles" and then just state something matter of factly and follow from there. If you want to really do this you start simple, attack your own assumptions, reform, build, attack, and repeat.

This is how you reduce the leakiness, but I think it is categorically the same problem as the bad axioms. It is hard to challenge yourself and we often don't like being wrong. It is also really unfortunate that small mistakes can be a critical flaw. There's definitely an imbalance.

  >> The smartest people I have ever known have been profoundly unsure of their beliefs and what they know.
This is why the OP is seeing this behavior: the smartest people you'll meet are constantly challenging their own ideas. They know they are wrong to at least some degree. You'll sometimes find them talking with a bit of authority at first, but a key part is watching how they deal with challenges to their assumptions. Ask them what would cause them to change their minds. Ask them about nuances and details. They won't always dig into those cans of worms, but they will be aware of them and maybe nervous or excited about going down that road (or do they just outright dismiss it?). They understand that accuracy is proportional to computation, and that computation increases exponentially as you converge on accuracy. These are strong indications, since they suggest whether someone cares more about the right answer or about being right. You also don't have to be very smart to detect this.
joe_the_user 7 days ago | parent [-]

> IME most people aren't very good at building axioms.

It seems you implying that some people are good building good axiom systems for the real world. I disagree. There are a few situations in the world where you have generalities so close to complete that you can use simple logic on them. But for the messy parts of the real world, there simply is no set of logical claims which can provide anything like certainty no matter how "good" someone is at "axiom creation".

godelski 7 days ago | parent [-]

I don't even know what you're arguing.

  > you implying that some people are good building good axiom systems
How do you go from "most people aren't very good" to "this implies some people are really good"? First, that is just a really weird interpretation of how people speak (btw, "you're" not "you" ;) because this is nicer and going to be received better than "making axioms is hard and people are shit at it." Second, you've assumed a binary condition. Here's an example. "Most people aren't very good at programming." This is an objectively true statement, right?[0] I'll also make the claim that no one is a good programmer, but some programmers are better than others. There's no contradiction in those two claims, even if you don't believe the latter is true.

Now, there are some pretty good axiom systems. ZF and ZFC seem to be working pretty well. There are others too, and they are used for pretty complex stuff. They all work at least for "simple logic."

But then again, you probably weren't thinking of things like ZFC. But hey, that was kinda my entire point.

  > there simply is no set of logical claims which can provide anything like certainty no matter how "good" someone is at "axiom creation".
 
I agree. I'd hope I agree considering my username... But you've jumped to a much stronger statement. I hope we both agree that just because there are things we can't prove that this doesn't mean there aren't things we can prove. Similarly I hope we agree that if we couldn't prove anything to absolute certainty that this doesn't mean we can't prove things to an incredibly high level of certainty or that we can't prove something is more right than something else.

[0] Most people don't even know how to write a program. Well... maybe everyone can write a Perl program but let's not get into semantics.

joe_the_user 7 days ago | parent | next [-]

I think I misunderstood that you were talking of axiomatization of mathematical or related systems.

The original discussion was about the formulation of "axioms" about the real world ("the bus is always X minutes late" or more elaborate stuff). I suppose I should have considered that, with your username, you would have considered the statement in terms of the formulation of mathematical axioms.

But still, I misunderstood you and you misunderstood me.

godelski 7 days ago | parent [-]

  > you were talking of axiomatization of mathematical or related systems.
Why do you think these are so different? Math is just a language in which we are able to formalize abstraction. Sure, it is pedantic as fuck, but that doesn't make it "not real world". If you want to talk about the bus always being late you just do this distributionally. Probabilities are our formalization around uncertainty.

We're talking about "rationalist" cults, axioms, logic, and "from first principles", I don't think using a formal language around this stuff is that much of a leap, if any. (Also, not expecting you to notice my username lol. But I did mention it because after the fact it would make more sense and serve as a hint to where I'm approaching this from).

joe_the_user 6 days ago | parent [-]

> Why do you think these are so different?

Because "reality" doesn't have "atomic", certain, etc operations? Also, it's notable that since most reasoning about the real world is approximate, the law of excluded middle is much less likely to apply.

> If you want to talk about the bus always being late you just do this distributionally. Probabilities are our formalization around uncertainty.

Ah, but you can't be certain that you're dealing with a given distribution, not outside the quantum realm. You can talk about, and roughly model, real-world phenomena with second-order or higher kinds of statements. But you can't just use axioms.

> We're talking about "rationalist" cults, axioms, logic, and "from first principles", I don't think using a formal language around this stuff is that much of a leap, if any.

Sure, this group used (improperly) all sorts of logical reasoning, and so one might well use formal language to describe their (less than useful) beliefs. But this discussion began with the point someone made that their use of axiomatic reasoning indeed led to less than useful outcomes.

godelski 6 days ago | parent [-]

  > Because "reality" doesn't have "atomic", certain, etc operations?
That's not a requirement. The axioms are for our modeling, not reality.

  > but you can't be certain that you're dealing with a given distribution, not outside the quantum realm.
I guess I'll never understand why non-physicists want to talk so confidently about physics. Especially quantum mechanics[0]. You can get through Griffiths with mostly algebra and some calculus. Group theory is a big plus, but not necessary. I also suggest having a stiff drink on hand. Sometimes you'll need to just shut up and do the math. Don't worry, it'll only be more confusing years later if you get to Messiah.

[0] https://xkcd.com/451/

Dylan16807 7 days ago | parent | prev [-]

If you mean nobody is good at something, just say that.

Saying most people aren't good at it DOES imply that some are good at it.

guerrilla 7 days ago | parent | prev | next [-]

> I don’t think it’s just (or even particularly) bad axioms, I think it’s that people tend to build up “logical” conclusions where they think each step is a watertight necessity that follows inevitably from its antecedents, but actually each step is a little bit leaky, leading to runaway growth in false confidence.

This is what you get when you naively re-invent philosophy from the ground up while ignoring literally 2500 years of actual debugging of such arguments by the smartest people who ever lived.

You can't diverge from and improve on what everyone else did AND be almost entirely ignorant of it, let alone have no training whatsoever in it. This extreme arrogance I would say is the root of the problem.

BeFlatXIII 8 days ago | parent | prev | next [-]

> Not that non-rationalists are any better at reasoning, but non-rationalists do at least benefit from some intellectual humility.

Non-rationalists are forced to use their physical senses more often because they can't follow the chain of logic as far. This is to their advantage. Empiricism > rationalism.

whatevertrevor 7 days ago | parent | next [-]

That conclusion presupposes that rationality and empiricism are at odds or mutually incompatible somehow. Any rational position worth listening to, about any testable hypothesis, is hand in hand with empirical thinking.

guerrilla 7 days ago | parent [-]

In traditional philosophy, rationalism and empiricism are at odds; they are essentially diametrically opposed. Rationalism prioritizes a priori reasoning while empiricism prioritizes a posteriori reasoning. You can prioritize both equally but that is neither rationalism nor empiricism in the traditional terminology. The current rationalist movement has no relation to that original rationalist movement, so the words don't actually mean the same thing. In fact, the majority of participants in the current movement seem ignorant of the historical dispute and its implications, hence the misuse of the word.

BlueTemplar 7 days ago | parent | next [-]

Yeah, Stanford has a good recap:

https://plato.stanford.edu/entries/rationalism-empiricism/

(Note also how the context is French vs British, and the French basically lost with Napoleon, so the current "rationalists" seem to be more likely to be heirs to empiricism instead.)

whatevertrevor 7 days ago | parent | prev [-]

Thank you for clarifying.

That does compute with what I thought the "Rationalist" movement as covered by the article was about. I didn't peg them as pure a priori thinkers as you put it. I suppose my comment still holds, assuming the rationalist in this context refers to the version of "Rationalism" being discussed in the article as opposed to the traditional one.

om8 7 days ago | parent | prev | next [-]

Good rationalism includes empiricism though

ehmrb 8 days ago | parent | prev [-]

[dead]

danaris 8 days ago | parent | prev | next [-]

> I think it’s that people tend to build up “logical” conclusions where they think each step is a watertight necessity that follows inevitably from its antecedents, but actually each step is a little bit leaky, leading to runaway growth in false confidence.

Yeah, this is a pattern I've seen a lot of recently—especially in discussions about LLMs and the supposed inevitability of AGI (and the Singularity). This is a good description of it.

kergonath 8 days ago | parent | next [-]

Another annoying one is the simulation theory group. They know just enough about Physics to build sophisticated mental constructs without understanding how flimsy the foundations are or how their logical steps are actually unproven hypotheses.

JohnMakin 8 days ago | parent [-]

Agreed. This one is especially annoying to me and dear to my heart, because I enjoy discussing the philosophy behind this, but it devolves into weird discussions and conclusions fairly quickly without much effort at all. I particularly enjoy the tenets of certain sects of buddhism and how they view these things, but you'll get a lot of people that are doing a really pseudo-intellectual version of the Matrix where they are the main character.

BaseBaal 7 days ago | parent [-]

Which sects of Buddhism? Just curious to read further about them.

spopejoy 7 days ago | parent | prev [-]

You might have just explained the phenomenon of AI doomsayers overlapping with ea/rat types, which I otherwise found inexplicable. EA/Rs seem kind of appallingly positivist otherwise.

danaris 6 days ago | parent [-]

I mean, that's also because of their mutual association with Eliezer Yudkowsky, who is (AIUI) a believer in the Singularity, as well as being one of the main wellsprings of "Rationalist" philosophy.

tibbar 8 days ago | parent | prev | next [-]

Yet I think most people err in the other direction. They 'know' the basics of health, of discipline, of charity, but have a hard time following through. 'Take a simple idea, and take it seriously': a favorite aphorism of Charlie Munger. Most of the good things in my life have come from trying to follow through the real implications of a theoretical belief.

bearl 7 days ago | parent [-]

And “always invert”! A related mungerism.

more_corn 7 days ago | parent [-]

I always get weird looks when I talk about killing as many pilots as possible. I need a new example of the always invert model of problem solving.

analog31 8 days ago | parent | prev | next [-]

Perhaps part of being rational, as opposed to rationalist, is having a sense of when to override the conclusions of seemingly logical arguments.

1attice 7 days ago | parent [-]

In philosophy grad school, we described this as 'being reasonable' as opposed to 'being rational'.

That said, big-R Rationalism (the Lesswrong/Yudkowsky/Ziz social phenomenon) has very little in common with what we've standardly called 'rationalism'; trained philosophers tend to wince a little bit when we come into contact with these groups (who are nevertheless chockablock with fascinating personalities and compelling aesthetics.)

From my perspective (and I have only glancing contact,) these mostly seem to be _cults of consequentialism_, an epithet I'd also use for Effective Altruists.

Consequentialism has been making young people say and do daft things for hundreds of years -- Dostoevsky's _Crime and Punishment_ being the best character sketch I can think of.

While there are plenty of non-religious (and thus, small-r rationalist) alternatives to consequentialism, none of them seem to make it past the threshold in these communities.

The other codesmell these big-R rationalist groups have for me, and that which this article correctly flags, is their weaponization of psychology -- while I don't necessarily doubt the findings of sociology, psychology, etc, I wonder if they necessarily furnish useful tools for personal improvement. For example, memorizing a list of biases that people can potentially have is like numbering the stars in the sky; to me, it seems like this is a cargo-cultish transposition of the act of finding _fallacies in arguments_ into the domain of finding _faults in persons_.

And that's a relatively mild use of psychology. I simply can't imagine how annoying it would be to live in a household where everyone had memorized everything from connection theory to attachment theory to narrative therapy and routinely deployed hot takes on one another.

In actual philosophical discussion, back at the academy, psychologizing was considered 'below the belt', and would result in an intervention by the ref. Sometimes this was explicitly associated with something we called 'the Principle of Charity', which is that, out of an abundance of epistemic caution, you commit to always interpreting the motives and interests of your interlocutor in the kindest light possible, whether in 'steel manning' their arguments, or turning a strategically blind eye to bad behaviour in conversation.

The importance of the Principle of Charity is probably the most enduring lesson I took from my decade-long sojourn among the philosophers, and mutual psychological dissection is anathema to it.

throw4847285 7 days ago | parent | next [-]

I actually think that the fact that rationalists use the term "steel manning" betrays a lack of charity.

If the only thing you owe your interlocutor is to use your "prodigious intellect" to restate their own argument in the way that sounds the most convincing to you, maybe you are in fact a terrible listener.

Eliezer 7 days ago | parent | next [-]

I have tried to tell my legions of fanatic brainwashed adherents exactly this, and they have refused to listen to me because the wrong way is more fun for them.

https://x.com/ESYudkowsky/status/1075854951996256256

Dylan16807 7 days ago | parent | prev | next [-]

Listening to other viewpoints is hard. Restating is a good tool to improve listening and understanding. I don't agree with this criticism at all, since that "prodigious intellect" bit isn't inherent to the term.

throw4847285 7 days ago | parent [-]

I was being snarky, but I think steelmanning does have one major flaw.

By restating the argument in terms that are most convincing to you, you may already be warping the conclusions of your interlocutor to fit what you want them to be saying. Charity is, "I will assume this person is intelligent and overlook any mistakes in order to try and understand what they are actually communicating." Steelmanning is "I can make their case for them, better than they could."

Of course this is downstream of the core issue, and the reason why steelmanning was invented in the first place. Namely, charity breaks down on the internet. Steelmanning is the more individualistic version of charity. It is the responsibility of people as individuals, not a norm that can be enforced by an institution or community.

vintermann 7 days ago | parent | next [-]

One of the most annoying habits of Rationalists, and something that annoyed me with plenty of people online before Yudkowsky's brand was even a thing, is the assumption that they're much smarter than almost everyone else. If that is your true core belief, the one that will never be shaken, then of course you're not going to waste time trying to understand the nuances of the arguments of some pious medieval peasant.

Dylan16807 7 days ago | parent | prev [-]

For mistakes that aren't just nitpicks, for the most part you can't overlook them without something to fix them with. And ideally this fixing should be collaborative, figuring out if that actually is what they mean. It's definitely bad to think you simply know better or are better at arguing, but the opposite end of leaving seeming-mistakes alone doesn't lead to a good resolution either.

1attice 7 days ago | parent | prev [-]

Just so. I hate this term, and for essentially this reason, but it has undeniable currency right now; I was writing to be understood.

NoGravitas 7 days ago | parent | prev | next [-]

> While there are plenty of non-religious (and thus, small-r rationalist) alternatives to consequentialism, none of them seem to make it past the threshold in these communities.

I suspect this is because consequentialism is the only meta-ethical framework that has any leg to stand on other than "because I said so". That makes it very attractive. The problem is you also can't build anything useful on top of it, because if you try to quantify consequences, and do math on them, you end up with the Repugnant Conclusion or worse. And in practice - in Effective Altruism/Longtermism, for example - the use of arbitrarily big numbers lets you endorse the Very Repugnant Conclusion while patting yourself on the back for it.

rendx 7 days ago | parent | prev | next [-]

> to me, it seems like this is a cargo-cultish transposition of the act of finding _fallacies in arguments_ into the domain of finding _faults in persons_.

Well put, thanks!

morpheos137 7 days ago | parent | prev [-]

I am interested in your journey from philosophy to coding.

MajimasEyepatch 8 days ago | parent | prev | next [-]

I feel this way about some of the more extreme effective altruists. There is no room for uncertainty or recognition of the way that errors compound.

- "We should focus our charitable endeavors on the problems that are most impactful, like eradicating preventable diseases in poor countries." Cool, I'm on board.

- "I should do the job that makes the absolute most amount of money possible, like starting a crypto exchange, so that I can use my vast wealth in the most effective way." Maybe? If you like crypto, go for it, I guess, but I don't think that's the only way to live, and I'm not frankly willing to trust the infallibility and incorruptibility of these so-called geniuses.

- "There are many billions more people who will be born in the future than those people who are alive today. Therefore, we should focus on long-term problems over short-term ones because the long-term ones will affect far more people." Long-term problems are obviously important, but the further we get into the future, the less certain we can be about our projections. We're not even good at seeing five years into the future. We should have very little faith in some billionaire tech bro insisting that their projections about the 22nd century are correct (especially when those projections just so happen to show that the best thing you can do in the present is buy the products that said tech bro is selling).

xg15 8 days ago | parent | next [-]

The "longtermism" idea never made sense to me: So we should sacrifice the present to save the future. Alright. But then those future descendants would also have to sacrifice their present to save their future, etc. So by that logic, there could never be a time that was not full of misery. So then why do all of that stuff?

twic 8 days ago | parent | next [-]

At some point in the future, there won't be more people who will live in the future than live in the present, at which point you are allowed to improve conditions today. Of course, by that point the human race is nearly finished, but hey.

That said, if they really thought hard about this problem, they would have come to a different conclusion:

https://theconversation.com/solve-suffering-by-blowing-up-th...

xg15 8 days ago | parent | next [-]

Some time after we've colonized half the observable universe. Got it.

imtringued 7 days ago | parent [-]

Actually, you could make the case that the population won't grow over the next thousand years, maybe even ten thousand years, but that's the short term and therefore unimportant.

(I'm not a longtermist)

xg15 7 days ago | parent [-]

Not on earth, but my understanding was that space colonization was a big part of their plan.

8 days ago | parent | prev [-]
[deleted]
rawgabbit 8 days ago | parent | prev | next [-]

To me it is disguised way of saying the ends justify the means. Sure, we murder a few people today but think of the utopian paradise we are building for the future.

cogman10 7 days ago | parent [-]

From my observation, that "building the future" isn't something any of them are actually doing. Instead, the concept that "we might someday do something good with the wealth and power we accrue" seems to be the thought that allows the pillaging. It's a way to feel morally superior without actually doing anything morally superior.

Ma8ee 7 days ago | parent | prev | next [-]

A bit of longtermism wouldn’t be so bad. We could sacrifice the convenience of burning fossil fuels today for our descendants to have an inhabitable planet.

NoGravitas 7 days ago | parent [-]

But that's the great thing about Longtermism. As long as a catastrophe is not going to lead to human extinction or otherwise specifically prevent the Singularity, it's not an X-Risk that you need to be concerned about. So AI alignment is an X-Risk we need to work on, but global warming isn't, so we can keep burning as much fossil fuel as we want. In fact, we need to burn more of them in order to produce the Singularity. The misery of a few billion present/near-future people doesn't matter compared to the happiness of sextillions of future post-humans.

vharuck 8 days ago | parent | prev | next [-]

Zeno's poverty

to11mtm 7 days ago | parent | prev | next [-]

Well, there's a balance to be had. Do the most good you can while still being able to survive the rat race.

However, people are bad at that.

I'll give an interesting example.

Hybrid cars. Modern proper HEVs[0] usually benefit their owners, both by virtue of better fuel economy and, in most cases, by being overall more reliable than a normal car.

And, they are better on CO2 emissions and lower our oil consumption.

And yet most carmakers as well as consumers have been very slow to adopt. On the consumer side, we are finally at the point where we have hybrid trucks that get 36-40 MPG while being capable of towing 4000 pounds or hauling over 1000 pounds in the bed [1], hybrid minivans capable of 35 MPG for transporting groups of people, hybrid sedans getting 50+ MPG, and small SUVs getting 35-40+ MPG for people who need a more normal 'people' car. And while they are selling better, it's insane that it took as long as it has to get here.

The main 'misery' you experience at that point, is that you're driving the same car as a lot of other people and it's not as exciting [2] as something with more power than most people know what to do with.

And hell, as they say in investing, sometimes the market can be irrational longer than you can stay solvent. E.g., was it truly worth it to Hydro-Quebec to sit on LiFePO4 patents the way they did vs. just figuring out licensing terms that got them a little bit of money while properly accelerating the adoption of hybrids/EVs/etc.?

[0] - By this I mean Something like Toyota's HSD style setup used by Ford and Subaru, or Honda or Hyundai/Kia's setup where there's still a more normal transmission involved.

[1] - Ford advertises up to 1500 pounds, but I feel like the GVWR allows for a 25 pound driver at that point.

[2] - I feel like there's ways to make an exciting hybrid, but until there's a critical mass or Stellantis gets their act together, it won't happen...

riku_iki 6 days ago | parent | next [-]

> [2] - I feel like there's ways to make an exciting hybrid, but until there's a critical mass or Stellantis gets their act together, it won't happen...

many hybrids are already way more exciting than a regular ICE, because they provide more torque, and many consumers buy hybrids for this reason.

7 days ago | parent | prev | next [-]
[deleted]
BlueTemplar 7 days ago | parent | prev [-]

Not that these technologies don't have anything to bring, but any discussion that still presupposes that cars/trucks(/planes) (as we know them) still have a future is (mostly) a waste of time.

P.S.: The article mentions the "normal error-checking processes of society"... but what makes them so sure cults aren't part of them?

It's not like society is particularly good about it either, or immune from groupthink (see the issue above) - and who do you think is more likely to kick-start a strong enough alternative?

(Or are they just sad about all the failures? But it's questionable that the "process" can work (with all its vivacity) without the "failures"...)

vlowther 8 days ago | parent | prev | next [-]

"I came up with a step-by-step plan to achieve World Peace, and now I am on a government watchlist!"

NoGravitas 7 days ago | parent | prev [-]

It goes along with the "taking ideas seriously" part of [R]ationalism. They committed to the idea of maximizing expected quantifiable utility, and imagined scenarios with big enough numbers (of future population) that the probability of the big-number-future coming to pass didn't matter anymore. Normal people stop taking an idea seriously once it's clearly a fantasy, but [R]ationalists can't do that if the fantasy is both technically possible and involves big enough imagined numbers to overwhelm its probability, because of their commitment to "shut up and calculate".

human_person 7 days ago | parent | prev | next [-]

"I should do the job that makes the absolute most amount of money possible, like starting a crypto exchange, so that I can use my vast wealth in the most effective way."

Has always really bothered me because it assumes that there are no negative impacts of the work you did to get the money. If you do a million dollars worth of damage to the world and earn 100k (or a billion dollars worth of damage to earn a million dollars), even if you spend all of the money you earned on making the world a better place, you aren't even going to fix 10% of the damage you caused (and that's ignoring the fact that it's usually easier/cheaper to break things than to fix them).

to11mtm 7 days ago | parent | next [-]

> If you do a million dollars worth of damage to the world and earn 100k (or a billion dollars worth of damage to earn a million dollars), even if you spend all of the money you earned on making the world a better place, you aren't even going to fix 10% of the damage you caused (and that's ignoring the fact that it's usually easier/cheaper to break things than to fix them).

You kinda summed up a lot of the world post industrial revolution there, at least as far as stuff like toxic waste (Superfund, anyone?) and climate change go. I mean, for goodness sake, let's just think about TEL and how they knew ethanol could work but it just wasn't 'patentable'. [0] Or the "We don't even know the dollar amount because we don't have a workable solution" problem of PFAS.

[0] - I still find it shameful that a university is named after the man who enabled this to happen.

Nursie 7 days ago | parent | prev | next [-]

And not just that, but the very fact that someone considers it valid to try to accumulate billions of dollars so they can have an outsized influence on the direction of society, seems somewhat questionable.

Even with 'good' intentions, there is the implied statement that your ideas are better than everyone else's and so should be pushed like that. The whole thing is a self-satisfied ego-trip.

throawaywpg 7 days ago | parent [-]

Well, it's easy to do good. Or, it's easy to plan on doing good, once your multi-decade plan to become a billionaire comes to fruition.

vintermann 6 days ago | parent | prev [-]

There's a hidden (or not so hidden) assumption in the EA's "calculations" that capitalism is great and climate change isn't a big deal. (You pretty much have to believe the latter to believe the former).

7 days ago | parent | prev [-]
[deleted]
DoctorOetker 7 days ago | parent | prev | next [-]

Would you consider the formal verification community to be "rationalists"?

eru 6 days ago | parent | prev | next [-]

> Not that non-rationalists are any better at reasoning, but non-rationalists do at least benefit from some intellectual humility.

I have observed no such correlation of intellectual humility.

kergonath 8 days ago | parent | prev | next [-]

> I don’t think it’s just (or even particularly) bad axioms, I think it’s that people tend to build up “logical” conclusions where they think each step is a watertight necessity that follows inevitably from its antecedents, but actually each step is a little bit leaky, leading to runaway growth in false confidence.

I really like your way of putting it. It’s a fundamental fallacy to assume certainty when trying to predict the future. Because, as you say, uncertainty compounds over time, all prediction models are chaotic. It’s usually associated with some form of Dunning-Kruger, where people know just enough to have ideas but not enough to understand where they might fail (thus vastly underestimating uncertainty at each step), or just lacking imagination.

ramenbytes 7 days ago | parent [-]

Deep Space 9 had an episode dealing with something similar. Superintelligent beings determine that a situation is hopeless and act accordingly. The normal beings take issue with the actions of the Superintelligents. The normal beings turn out to be right.

emmelaich 7 days ago | parent | prev | next [-]

Precisely! I'd even say they get intoxicated with their own braininess. The expression that comes to mind is to get "way out over your skis".

I'd go even further and say most of the world's evils are caused by people with theories that are contrary to evidence. I'd place Marx among these but there's no shortage of examples.

abtinf 8 days ago | parent | prev [-]

> non-rationalists do at least benefit from some intellectual humility

The Islamists who took out the World Trade Center don’t strike me as particularly intellectually humble.

If you reject reason, you are only left with force.

prisenco 8 days ago | parent | next [-]

Are you so sure the 9/11 hijackers rejected reason?

Why Are So Many Terrorists Engineers?

https://archive.is/XA4zb

Self-described rationalists can and often do rationalize acts and beliefs that seem baldly irrational to others.

cogman10 7 days ago | parent [-]

Here's the thing, the goals of the terrorists weren't irrational.

People confuse "rational" with "moral". Those aren't the same thing. You can perfectly rationally do something that is immoral with a bad goal.

For example, if you value your life above all others, then it would be perfectly rational to slaughter an orphanage if a more powerful entity made that your only choice for survival. Morally bad, rationally correct.

vintermann 6 days ago | parent [-]

Yes, there's no such thing as rationality except rationality towards a goal.

But Big R Rationalists assume that if we were rational enough (in an exotic, goal-independent way nebulously called intelligence), we'd all agree on the goals.

So basically there is no morality. No right or wrong, only smart or stupid, and guess who they think are the smart ones.

And this isn't an original philosophy at all. Plato certainly believed it (and if you believe Plato, Socrates too). Norse pagans believed it. And everyone who believes it seem to sink into mystery religion, where you can get access to the secret wisdom if you talk to the right guy who's in the know.

morleytj 8 days ago | parent | prev | next [-]

I now feel the need to comment that this thread does illustrate an issue I have with the naming of the philosophical/internet community of rationalism.

One can very clearly be a rational individual or an individual who practices reason and not associate with the internet community of rationalism. The median member of the group defined as "not being part of the internet-organized movement of rationalism and not reading lesswrong posts" is not "religious extremist striking the world trade center and committing an atrocious act of terrorism", it's "random person on the street."

And to preempt a specific response some may make to this, yes, the thread here is talking about rationalism as discussed in the blog post above, as organized around Yudkowsky or Slate Star Codex, and not the rationalist movement of, like, Spinoza and company. Very different things philosophically.

montefischer 8 days ago | parent | prev [-]

Islamic fundamentalism and cult rationalism are both involved in a “total commitment”, “all or nothing” type of thinking. The former is totally committed to a particular literal reading of scripture, the latter, to logical deduction from a set of chosen premises. Both modes of thinking have produced violent outcomes in the past.

Skepticism, in which no premise or truth claim is regarded as above dispute (or, that it is always permissible and even praiseworthy to suspend one’s judgment on a matter), is the better comparison with rationalism-fundamentalism. It is interesting that skepticism today is often associated with agnostic or atheist religious beliefs, but I consider many religious thinkers in history to have been skeptics par excellence when judged by the standard of their own time. E.g. William Ockham (of Ockham’s razor) was a 14C Franciscan friar (and a fascinating figure) who denied papal infallibility. I count Martin Luther as belonging to the history of skepticism as well, for example, as well as much of the humanist movement that returned to the original Greek sources for the Bible, from the Latin Vulgate translation by Jerome.

The history of ideas is fun to read about. I am hardly an expert, but you may be interested in the history of Aristotelian rationalism, which gained prominence in the medieval west largely through the works of Averroes, a 12C Muslim philosopher who heavily favored Aristotle. In the 13C, Thomas Aquinas wrote a definitive Catholic systematic theology, rejecting Averroes but embracing Aristotle. To this day, Catholic theology is still essentially Aristotelian.

praptak 7 days ago | parent | next [-]

True skepticism is rare. It's easy to be skeptical only about beliefs you dislike or at least don't care about. It's hard to approach the 100th self-professed psychic with an honest intention to truly test their claims rather than to find the easiest way to ridicule them.

throwway120385 7 days ago | parent | prev [-]

The only absolute above questioning is that there are no absolutes.