JeffJor 4 days ago

Q1: "A family has two children. You're told that at least one of them is a girl. What's the probability both are girls?"

Q2: "A family has two children. You're told that at least one of them is a boy. What's the probability both are boys?"

Note that these are symmetric problems, and must have the same answer.

Q3: "A family has two children. You're told that a gender, that applies to at least one, is written inside a sealed envelope. What's the probability both have that gender?"

In Q3, we have no information. So the answer is the proportion of two-child families that are single gendered. That is, 1/2.

But if we open the envelope and read what is written inside, the problem becomes either Q1 or Q2, which have the same answer. So we don't have to open it; whatever the answer to Q1 and Q2 is, opening the envelope in Q3 makes its answer the same. If that answer is 1/3, we have a paradox. The answer has to be 1/2 if we don't look.

This is what is known as "Bertrand's Box Paradox," or it would be if we added a fourth box to his problem, with one gold and one silver coin. I realize that in modern times the problem itself is called the paradox, but what Bertrand actually wrote (edited to this problem) was "How can it be that opening the envelope suffices to change the probability from 1/2 to 1/3?"

The resolution is that probability must be based on the full set of possibilities, not the possibilities that _could_ result from the full set of _states._ These are the possibilities for this problem:

   1) BB and you are told that there is at least one boy.
   2A) BG and you are told that there is at least one boy.
   2B) BG and you are told that there is at least one girl.
   3A) GB and you are told that there is at least one boy.
   3B) GB and you are told that there is at least one girl.
   4) GG and you are told that there is at least one girl.

Each numbered case has a prior probability of 1/4. Let's say the "A" subcases have a probability of Q/4, so the "B" subcases have a probability of (1-Q)/4.

The answer to Q2 (you are told "at least one boy") is the probability of case 1, which is 1/4, divided by the total probability of cases 1, 2A, and 3A, which is (1+2Q)/4. That's 1/(1+2Q).

The answer to Q1 (you are told "at least one girl") is the probability of case 4, which is also 1/4, divided by the total probability of cases 4, 2B, and 3B, which is (3-2Q)/4. That's 1/(3-2Q).

Bertrand's paradox, stated another way, is that these must be equal, but can only be equal if Q=1/2 and both answers are 1/2.
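
The dependence on Q can be checked with exact fractions. This is only a sketch: the function names `answer_told_boy` and `answer_told_girl` are made up for illustration, and Q is taken to be the chance that a mixed family is reported as "at least one boy".

```python
from fractions import Fraction

QUARTER = Fraction(1, 4)  # prior probability of each numbered case

def answer_told_boy(q):
    """P(both boys | told 'at least one boy'): case 1 over cases 1, 2A, 3A."""
    return QUARTER / (QUARTER + 2 * q * QUARTER)

def answer_told_girl(q):
    """P(both girls | told 'at least one girl'): case 4 over cases 4, 2B, 3B."""
    return QUARTER / (QUARTER + 2 * (1 - q) * QUARTER)

# Q = 0 and Q = 1 are the "only one gender is ever mentioned" extremes.
for q in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(q, answer_told_boy(q), answer_told_girl(q))
```

The two answers agree only at Q = 1/2, where both are 1/2. At Q = 1 the boy answer is 1/3, but the girl answer is then 1, not 1/3.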

Majromax 4 days ago | parent | next [-]

In all of these questions, you're making an assumption about the data-generating process. In Q1 and Q2, you're assuming that you had a 0% chance (a priori) of hearing that 'neither is a (girl/boy)', and in Q3 you're assuming that there's a 0% chance of hearing that the envelope doesn't match the family.

Take a look at this problem beginning with no assumptions. We have two kids, and an envelope that contains 'B' or 'G'. Our probability space is (B,G)^3, with each outcome having probability 1/8.

Now, we add information about the match as conditioning. Conditional on being told that the envelope matches the family, we can exclude the BBG and GGB cases. That brings us down to 6 cases: BBB, GGG, and (BG,GB)(B,G). Two of those six have matching genders, so with this additional information the probability of matching genders becomes 1/3. This probability is still 1/3 if we open the envelope to find B or G, since we exclude all three cases where the envelope doesn't match our observation of it.
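
A brute-force enumeration of this model, as a rough sketch. It assumes all 8 outcomes of (B,G)^3 are equally likely before conditioning, which is the assumption made in this comment; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Each outcome is (child 1, child 2, envelope), all 8 assumed equally likely.
outcomes = list(product("BG", repeat=3))

def conditional(event, given):
    """P(event | given) by counting equally likely outcomes."""
    kept = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in kept if event(o)), len(kept))

def matches(o):
    """The envelope names a gender that at least one child has."""
    return o[2] in o[:2]

def same_gender(o):
    return o[0] == o[1]

sealed = conditional(same_gender, matches)
opened_b = conditional(same_gender, lambda o: matches(o) and o[2] == "B")
print(sealed, opened_b)
```

Both values come out to 1/3 under the uniform assumption; a different prior over the eight outcomes would give a different result.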

In my view, this is related to the Monty Hall problem; we have to realize that we're given additional information with the statement/envelope.

maest 4 days ago | parent | next [-]

> in Q3 you're assuming that there's a 0% chance of hearing that the envelope doesn't match the family.

This is equivalent to the host never opening the door with the car in the Monty Hall scenario.

JeffJor a day ago | parent | prev [-]

I am not making an assumption about the data-generating process in any of these questions. The only “assumptions” I make are that the information is true (so yes, the envelope in Q3 matches the family), that the information (whether or not it is sealed) is about one gender, and that information WILL NOT be sufficient to determine the complete makeup of the family. Suggesting otherwise is usually a sign that you have reached your answer first, and are trying to justify it by assertion.

Your error is that you seem to be deciding the answer is 1/3 first, forcing you to assume whatever makes that so. You literally said that when you said the probability must become 1/3.

I do look at this problem from the beginning with no assumptions. If you want to be pedantic, the probability space comprises the sample space (the set of possible outcomes), an event space (a set of subsets of the sample space with certain properties), and a probability function Pr(*) that maps each event in the event space to a number in [0,1]. The complete sample space is {BBb, BBg, BGb, BGg, GBb, GBg, GGb, GGg}, which I assume is what you want (B,G)^3 to mean. We then need the events {BBb, BBg}, {BGb, BGg}, {GBb, GBg}, and {GGb, GGg} to all have probability 1/4.

Since we make no assumptions about how the lower-case letter is “generated” other than IT MUST MATCH ONE OF THE UPPER CASE ONES, we get that Pr({BBb}) = Pr({GGg}) = 1/4 and Pr({BBg}) = Pr({GGb}) = 0. What we need to determine is how we get Pr({BGb})+Pr({BGg}) = Pr({GBb})+Pr({GBg}) = 1/4.

In order to make the answer become 1/3 in Q1 and Q2, as you assert, we must assume that we do know how the lower-case letter is generated. In Q1 and Q2 we must assume it is an answer to “is there a girl/boy.” This makes one half of each pair 1/4, and the other 0. If we make no assumption, then the Principle of Indifference (literally, that we make no assumptions to distinguish functionally equivalent outcomes) says Pr({BGb}) = Pr({BGg}) = Pr({GBb}) = Pr({GBg}) = 1/8. This makes one answer:

A1 = Pr({GGg}) / [Pr({BGg}) + Pr({GBg}) + Pr({GGg})] = (1/4) / [1/8 + 1/8 + 1/4] = 1/2
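
The same arithmetic as a small check, assuming the Principle of Indifference assignment above. The dictionary `pr` and the variable names are illustrative only.

```python
from fractions import Fraction

# Upper case: the two children. Lower case: the gender that is reported.
# BBg and GGb get probability 0 because the report must match a child.
pr = {
    "BBb": Fraction(1, 4), "BBg": Fraction(0),
    "BGb": Fraction(1, 8), "BGg": Fraction(1, 8),
    "GBb": Fraction(1, 8), "GBg": Fraction(1, 8),
    "GGb": Fraction(0),    "GGg": Fraction(1, 4),
}
assert sum(pr.values()) == 1  # the eight outcomes form a full distribution

# Total probability of every outcome in which "girl" is reported.
told_girl = sum(p for outcome, p in pr.items() if outcome.endswith("g"))
a1 = pr["GGg"] / told_girl
print(a1)  # 1/2
```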

Yes, this is a variation of the Monty Hall Problem. Most "solutions" to it are really just explanations for how it can make sense. The mathematical solution follows the outline I used above. It is based on the probability, if the door you choose has the prize (compare to a mixed-gender family), that Monty will open door X or door Y (i.e., the other two) as determined PRIOR TO it happening. If you assume it is 100% for the door he did open, which is only determined AFTER HAVING SEEN IT, like you want to assume in Q1 and Q2 that only a girl/boy can be mentioned, then the answer is that switching does not matter. It is only if you use the Principle of Indifference – meaning each has a 50% chance – that the answer is that switching wins 2/3 of the time.
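
A sketch of that Monty Hall calculation, under the usual assumptions that the prize is equally likely to be behind each door and that Monty never opens the contestant's door or the prize door. The function and parameter names are made up for this example.

```python
from fractions import Fraction

def switch_wins(p_open_seen_door):
    """P(switching wins | Monty opened the door we saw him open).

    p_open_seen_door: chance Monty opens that particular door when the
    prize is behind the contestant's door and he is free to pick either.
    """
    third = Fraction(1, 3)
    prize_at_pick = third * p_open_seen_door  # Monty had a free choice
    prize_at_other = third * 1                # Monty was forced to open it
    # The prize cannot be behind the opened door, so that case adds 0.
    return prize_at_other / (prize_at_pick + prize_at_other)

print(switch_wins(Fraction(1, 2)), switch_wins(Fraction(1)))
```

With a 50% free choice, switching wins 2/3 of the time; with a 100% bias toward the door that was opened, it drops to 1/2.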

meatmanek 4 days ago | parent | prev [-]

In Q3, you've got 8 possibilities, expressed as (gender of 1st child, gender of 2nd child, which child's gender is written inside the sealed envelope?), each with presumably equal probability:

   1. B B 1
   2. B B 2
   3. B G 1
   4. B G 2
   5. G B 1
   6. G B 2
   7. G G 1
   8. G G 2

in which case 4 of 8 possibilities satisfy the condition (the first two and the last two).

Once you open the sealed envelope and it says "girl", it does not become Q1, it becomes a different question:

Q4: "A family has two children. I randomly sampled one of the children and it was a girl. What's the probability both are girls?"

In which case, we're looking at possibilities 4, 5, 7, and 8, and in only 2 of those 4 possibilities are both children girls.

In Q1, you're actually told "A family has two children. I looked at both children and can tell you that at least one of them is a girl. What's the probability that both are girls?". In which case, possibilities 3, 4, 5, 6, 7, 8 are all valid. Only in 2 of those 6 possibilities are both children girls.

So as in_cahoots said in https://news.ycombinator.com/item?id=45053187, it matters whether the person asking looked at both children or just a single one.
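
The three counts above can be reproduced by enumerating the 8 possibilities. This is a minimal sketch that assumes they are equally likely, as stated; the helper names are illustrative, and the child index is 0 or 1 here rather than 1 or 2.

```python
from fractions import Fraction
from itertools import product

# (child 1, child 2, index of the child whose gender is in the envelope)
cases = list(product("BG", "BG", (0, 1)))

def prob(event, given=lambda c: True):
    """P(event | given) by counting equally likely cases."""
    kept = [c for c in cases if given(c)]
    return Fraction(sum(1 for c in kept if event(c)), len(kept))

def both_girls(c):
    return c[0] == "G" and c[1] == "G"

def same_gender(c):
    return c[0] == c[1]

def sampled_girl(c):
    """The child named by the envelope is a girl."""
    return c[c[2]] == "G"

def any_girl(c):
    return "G" in c[:2]

q3 = prob(same_gender)               # envelope still sealed
q4 = prob(both_girls, sampled_girl)  # one child sampled, it was a girl
q1 = prob(both_girls, any_girl)      # both children checked, at least one girl
print(q3, q4, q1)
```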