hatthew a day ago

We have to distinguish "expected information gain" from "maximum information gain". The less likely answer (typically "yes") gains >1 bit, the more likely answer (typically "no") gains <1 bit, and the probability-weighted average comes out ≤1. It is impossible for a single yes/no question to have an expected gain of >1; the maximum possible is exactly 1, reached only at a 50/50 split.

tobyjsullivan a day ago | parent [-]

The total probabilities add up to 1. But I’m not following how that relates to the average bits.

Despite summing to 1, the exact values of P(true) and P(false) are dependent on the options which have previously been discounted. Then those variables get multiplied by the amount of information gained by either answer.

adastra22 a day ago | parent | next [-]

It is definitional, by which I mean in the strictest mathematical sense: the information content of a result is directly derived from how “unexpected” it is.

A result which conveys 2 bits of information should occur with 25% expected probability. Because that’s what we mean by “two bits of information.”

hatthew a day ago | parent | prev | next [-]

The article states "Suppose we have calculated the expected information gained by potential truth booths like below: Expected information: 1.60 bits ..." This is impossible because of the general fact in information theory that (p(true) * bits_if_true) + (p(false) * bits_if_false) <= 1. If they had said "Suppose we have calculated the maximum information gained...", then 1.6 bits would be valid. They said "expected information" though, so 1.6 bits is invalid.
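For anyone who wants to check the bound numerically, here is a minimal sketch. The function name `expected_bits` and the 0.01 grid step are my own choices for illustration, not anything taken from the article.

```python
import math

def expected_bits(p_true: float) -> float:
    """Expected information (in bits) from a yes/no answer with P(yes) = p_true."""
    total = 0.0
    for p in (p_true, 1.0 - p_true):
        if p > 0:  # an outcome that cannot occur contributes nothing
            total += p * -math.log2(p)
    return total

# Scan P(yes) from 0.00 to 1.00 (hypothetical grid) and find the best expectation.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=expected_bits)
print(best_p, expected_bits(best_p))
```

On this grid the expectation peaks at a 50/50 question and never exceeds 1 bit, so no choice of p(true) gets anywhere near 1.6.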

thaumasiotes a day ago | parent | prev [-]

So, you have n options, you ask a question, and now you're down to m options.

The number of bits of information you gained is -log₂ (m/n).

If you ask a question which always eliminates half of the options, you will always gain -log₂ (1/2) = 1 bit of information.

If you go with the dumber approach of taking a moonshot, you can potentially gain more than that, but in expectation you'll gain less.

If your question is a 25-75 split, you have a 25% chance of gaining -log₂ (1/4) = 2 bits, and a 75% chance of gaining -log₂ (3/4) = 0.415 bits. On average, this strategy will gain you (0.25)(2) + (0.75)(0.415) = 0.8113 bits, which is less than 1 bit.

The farther away you get from 50-50, the more bits you can potentially gain in a single question, but - also - the lower the number of bits you expect to gain becomes. You can never do better than an expectation of 1 bit for a trial with 2 outcomes.
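The counting argument above can be sketched in a few lines of Python. The names `bits_gained` and `expected_gain` are made up for this comment, and the sketch assumes all n remaining options are equally likely.

```python
import math

def bits_gained(n: int, m: int) -> float:
    """Bits gained when n remaining options are narrowed down to m."""
    return -math.log2(m / n)

def expected_gain(n: int, k: int) -> float:
    """Expected bits from a question that splits n options into k and n - k.

    Assumption: every one of the n options is equally likely, so the
    chance of landing in a group is just its share of the options.
    """
    total = 0.0
    for m in (k, n - k):
        if m > 0:  # an empty group can never be the answer
            total += (m / n) * bits_gained(n, m)
    return total

print(expected_gain(100, 50))  # even split
print(expected_gain(100, 25))  # 25-75 split
```

With 100 options, the even split gives exactly 1 bit and the 25-75 split gives about 0.8113 bits, matching the arithmetic above.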

(All of this is in the article; see footnote 3 and its associated paragraph.)

The article explicitly calls out the expectational maximum of one bit:

>> You'll also notice that you're never presented with a question that gives you more than 1 expected information, which is backed up by the above graph never going higher than 1.

So it's strange that it then goes on to list an example of a hypothetical (undescribed, since the scenario is impossible) yes/no question with an expected gain of 1.6 bits.