sixo 5 days ago

It's not even worth mentioning this problem unless you talk about how the result depends on the data generating process. If you take it to be something like "you randomly sample from families with two children, discarding any without at least one girl", you get the 1/3 result, but there are various other ways to read a sampling process from the problem statement which lead to other results.
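For anyone who wants to see the "discard" reading in action, here is a rough Monte Carlo sketch. It assumes each child is independently a girl with probability 1/2; the function name, trial count, and seed are made up for illustration.

```python
import random

def estimate_two_girls_given_at_least_one(trials=200_000, seed=0):
    """Sample two-child families, discard those with no girl, and
    return the fraction of the kept families that have two girls."""
    rng = random.Random(seed)
    kept = 0
    two_girls = 0
    for _ in range(trials):
        family = (rng.choice("GB"), rng.choice("GB"))  # (older, younger)
        if "G" not in family:
            continue  # discard families without at least one girl
        kept += 1
        if family == ("G", "G"):
            two_girls += 1
    return two_girls / kept

print(estimate_two_girls_given_at_least_one())  # close to 1/3
```

Changing the discard rule (or what gets sampled in the first place) is exactly the knob that moves the answer.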

pontus 4 days ago | parent | next [-]

Just to pile on here, there's also ambiguity around how the observed girl is selected. Consider the following framing:

I go to a random house on a random street and knock on the door. A young girl opens the door. I ask how many siblings they have and they say one. What's the probability that they have a sister?

Now it's 50% even though cosmetically it seems like it'd be fair to say that the family has at least one daughter. The reason is that once I see a girl at the door, I'm slightly more confident that it's a GG household, since a GB or BG household would sometimes show a boy opening the door (assuming the two kids are equally likely to open the door).

P(GG | G at door) = P(G at door | GG) P(GG) / P(G at door)

P(G at door) = 1/2 (by symmetry)

So, P(GG | G at door) = 1 * 1/4 * 2 = 1/2
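The same Bayes calculation can be checked by exact enumeration. This is only a sketch under the assumptions stated above (four equally likely families, either child equally likely to open the door); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Joint distribution over (family, index of the child who opens the door).
joint = {}
for family in product("GB", repeat=2):
    for opener in (0, 1):
        joint[(family, opener)] = Fraction(1, 4) * Fraction(1, 2)

# P(G at door): sum over every outcome where the child at the door is a girl.
p_girl_at_door = sum(p for (fam, i), p in joint.items() if fam[i] == "G")

# P(GG and G at door): in a GG family whoever opens is a girl.
p_gg_and_girl = sum(p for (fam, i), p in joint.items() if fam == ("G", "G"))

posterior = p_gg_and_girl / p_girl_at_door
print(p_girl_at_door, posterior)  # 1/2 1/2
```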

MontyCarloHall 4 days ago | parent | next [-]

This is the crux of the "paradox," which is really just an interpretation problem. Most people assume that the question asks exactly your scenario, i.e. if a specific child is selected and it's a girl, what's the probability that the sibling is also a girl? In that case, the event space is just GB or GG, and p(GG)/(p(GB) + p(GG)) = 0.5. (BG is not in the event space because we are conditioning on a specific child being a girl.)

However, if the question is interpreted as "what's the probability of having two girls if we know there aren't two boys," then the event space is GB, BG, GG, and p(GG)/(p(GB) + p(BG) + p(GG)) = 1/3. Both GB and BG are in the event space because we are not conditioning on the sex of one specific child.
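The two readings can be compared by counting outcomes directly. A minimal sketch, assuming the four ordered families are equally likely; `p_two_girls` is a hypothetical helper, not anything standard.

```python
from fractions import Fraction
from itertools import product

# (first child, second child); all four outcomes equally likely.
families = list(product("GB", repeat=2))

def p_two_girls(condition):
    """P(GG | condition), by counting equally likely outcomes."""
    space = [f for f in families if condition(f)]
    return Fraction(sum(f == ("G", "G") for f in space), len(space))

specific_child = p_two_girls(lambda f: f[0] == "G")  # one specific child is a girl
at_least_one = p_two_girls(lambda f: "G" in f)       # we only know it isn't BB

print(specific_child, at_least_one)  # 1/2 1/3
```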

smohare 4 days ago | parent | prev [-]

You’re making the classic mistake of conflating computing how pathways to a condition can arise with computing conditionals given the current state. There’s absolutely no information-theoretic difference between you saying “A girl opened the door” and “I was told the family has a girl.”

Look at the more technical descriptions of the Monty Hall problem using conditional probabilities, as it is essentially equivalent. You’re trying to factor in the probability of whether Monty knows if a goat is behind a door when the observable information is that there is an open door with a goat. Once you make that observation, many things collapse.

the_gipsy 4 days ago | parent | prev | next [-]

Why can you not frame it as: "a random family has been sampled, the sample family has two children, one of them is a girl"?

I.e. without "discarding", just giving some additional, but not complete, information on the random sample. Is adding information about the picked sample the same as discarding all contrarian samples? Why is this relevant?

AnotherGoodName 4 days ago | parent | next [-]

Suppose there were two possible statements they could make:

"a random family has been sampled, the sample family has two children, one of them is a girl"

and

"a random family has been sampled, the sample family has two children, one of them is a boy"

If they chose which statement to make by randomly picking a child from a random family, then the probability actually becomes 50% boy/girl for the other child, since a boy/boy or girl/girl family has twice the chance of generating the statement for its gender as a mixed-gender family does.

I.e., if they say one is a girl, that statement had a 50% chance of being generated by a girl/girl family (since the statement is picked from the gender of a randomly selected child, and there are 2 girls, doubling the chance that a "one is a girl" statement comes from a girl/girl family), a 25% chance of being generated by a girl/boy family, and a 25% chance of being generated by a boy/girl family.

If you take 50% chance girl/girl, 25% chance boy/girl and 25% girl/boy, you'll see there's a 50/50 chance of the other child being either gender.

All this due to changing how we sampled.
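Here is a small sketch of that weighting, assuming the statement is generated from the gender of one randomly picked child in a randomly picked two-child family. The names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Each (family, picked child) pair has weight 1/4 * 1/2 = 1/8.
# Keep only the pairs that would generate "one of them is a girl".
weights = {}
for family in product("GB", repeat=2):
    for picked in (0, 1):
        if family[picked] == "G":
            key = "".join(family)
            weights[key] = weights.get(key, 0) + Fraction(1, 8)

# Normalise to get P(family | statement was "one of them is a girl").
total = sum(weights.values())
posterior = {fam: w / total for fam, w in weights.items()}

for fam, p in posterior.items():
    print(fam, p)
```

The GG family is counted twice because either of its children can trigger the statement, which is where the extra weight comes from.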

florbnit 4 days ago | parent | prev | next [-]

> a random family has been sampled, the sample family has two children, one of them is a girl

It’s not a random family if it must have at least one girl. If you want to talk about a random family, you can only make statements of the kind “one of the children is <gender>”, where the gender depends on the specific family, or “the family has between 0 and 2 girls”.

markoknoebl 4 days ago | parent | prev [-]

[dead]

ndr 4 days ago | parent | prev | next [-]

I took this to mean exactly that:

> Assume the family is selected at random because they have at least one girl.

And then again, if they sampled all families with 2 children the posterior would not change, would it?

Still assuming boys vs. girls are completely i.i.d. and equally probable.

renewiltord 4 days ago | parent | prev | next [-]

Indeed. One thing they haven't mentioned is that the mother wasn't Zharata The Man Hater, who would kill any boy child. Therefore, in the Zharata case the answer is 1, and we're missing the probability of Zharata's family being considered, which could be one of pure certainty since she always puts her family forward for any puzzle question - killing any philosopher who would pose one not relating to her own family.

two_handfuls 4 days ago | parent | prev | next [-]

That's how I read it. What other ways were you thinking about?

bloak 4 days ago | parent [-]

Well, one way of getting families with two children, at least one of which is a girl, would be to go to a girls' school and ask the children to raise their hand if they have exactly one sibling.

aidenn0 4 days ago | parent [-]

I would expect that would yield a 50% chance of the other being a girl, right?

krackers 3 days ago | parent | prev | next [-]

Right, you have to very clearly define how you do the sampling. Here's some possible cases:

Assume all houses have 2 children, and each child has equal probability of being either a boy or a girl. It will help to treat children as distinguishable, e.g. by eldest vs youngest.

* You randomly choose a house, and a random child answers the phone. You question the person who answers the phone about his/her gender and that of his/her sibling. You repeat this experiment. Given that the person who answered the phone was a girl, the conditional probability she has a sister is 1/2. Note that this is not Bertrand’s box, as the 1 girl house can occur with either the eldest or the youngest being the girl (unlike Bertrand’s box, where only 1 box has exactly 1 gold), so they cancel out and a girl you spoke with is equally likely to have been from either a 2 girl or 1 girl house.

* You categorize houses into 2 girl, 1 girl, and 0 girl houses. You randomly pick a category, then pick a house from that category, then phone them. As before a random child answers, and you question them. Given that the person who answered the phone was a girl, the conditional probability she has a sister is 2/3. This is Bertrand’s box: you’re more likely to have spoken with a girl from a 2 girl house than a 1 girl house. Explicitly grouping by # of girls first before sampling breaks the previous symmetry.

* You randomly choose a house, and ask for the eldest child. You question him/her. You repeat this experiment. Given that the person whom you spoke with was a girl, the probability she has a sister is 1/2. Nothing new here, as seen in case (1) you were already equally likely to speak with a girl from a 2 girl vs 1 girl house, so asking for the eldest person (which by symmetry is equally likely to be a boy or a girl) doesn’t change anything here.

* You categorize houses into 2 girl, 1 girl, and 0 girl houses. You randomly pick a category, then pick a house from that category, then phone them and ask for the eldest child. Given that the person who answered the phone was a girl, the conditional probability she has a sister is still 2/3. Asking for the eldest does not undo the asymmetry in Bertrand’s box: the eldest in a 2 girl house is always a girl, but the eldest in a 1 girl house is a girl only half the time, so with the categories picked equally often a girl on the phone is still twice as likely to be from a 2 girl house.

* You randomly choose a house, and a random child answers the phone. You question the person who answers the phone. You repeat this experiment. Given that EITHER the child you spoke with OR their sibling is a girl, the probability you spoke with someone from a 2 girl house is 1/3. It might seem counterintuitive at first that loosening the criteria _reduces_ the probability of speaking with someone from a 2 girl house. But this makes sense, since there’s still only 2 ways you can speak with someone from a 2 girl house (either the eldest or youngest sister), but now 4 ways you can speak with someone from a 1 girl house, since you’re allowed to speak with the boys of that house as well.

* You randomly choose a house, and ask for the eldest child, and question them. Given that EITHER the child you spoke with OR their sibling is a girl, the probability you spoke with someone from a 2 girl house is again 1/3. Explicitly speaking with the eldest doesn’t make a difference here because we’re already conditioning on either the eldest or youngest being a girl.
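For anyone who wants to check setups like these, here is a rough sketch that enumerates them exactly. It assumes "randomly pick a category" means each of the three categories is chosen with probability 1/3 and that houses within a category are equally likely; `house_weights` and `p_two_girl_house` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

HOUSES = list(product("GB", repeat=2))  # (eldest, youngest)

def house_weights(by_category):
    """Probability of phoning each house: uniform over houses, or uniform
    over girl-count categories and then uniform within the category."""
    if not by_category:
        return {h: Fraction(1, len(HOUSES)) for h in HOUSES}
    weights = {}
    for h in HOUSES:
        same = [x for x in HOUSES if x.count("G") == h.count("G")]
        weights[h] = Fraction(1, 3) / len(same)
    return weights

def p_two_girl_house(by_category, eldest_only, condition):
    """P(house has two girls | condition(speaker, sibling) holds)."""
    hit = total = Fraction(0)
    for house, w in house_weights(by_category).items():
        speakers = [0] if eldest_only else [0, 1]
        for i in speakers:
            p = w / len(speakers)  # chance this child is the one you speak with
            if condition(house[i], house[1 - i]):
                total += p
                if house == ("G", "G"):
                    hit += p
    return hit / total

def speaker_is_girl(speaker, sibling):
    return speaker == "G"

def either_is_girl(speaker, sibling):
    return "G" in (speaker, sibling)

cases = [
    ("1: random house, random child, speaker is a girl", False, False, speaker_is_girl),
    ("2: by category, random child, speaker is a girl", True, False, speaker_is_girl),
    ("3: random house, eldest child, speaker is a girl", False, True, speaker_is_girl),
    ("4: by category, eldest child, speaker is a girl", True, True, speaker_is_girl),
    ("5: random house, random child, either is a girl", False, False, either_is_girl),
    ("6: random house, eldest child, either is a girl", False, True, either_is_girl),
]
results = [p_two_girl_house(cat, eldest, cond) for _, cat, eldest, cond in cases]
for (label, *_), r in zip(cases, results):
    print(label, "->", r)
```

Keeping house selection, speaker selection, and the conditioning event as separate arguments makes it easy to try other sampling schemes without rewriting the enumeration.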

aaron695 4 days ago | parent | prev [-]

[dead]