Monday, November 20, 2017

The Tuesday Child Puzzle via Monte Carlo (repost)

I get more than a few hits for this old post, so here is a rerun.



There has been a lot of chatter about the Tuesday Child puzzle:
I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?
Everyone's instinct is to say: Tuesday has nothing to do with it, so the answer is simply 1/2.

That answer is wrong. Here is the trouble with everyone's intuition: they read the problem as: I have one son born on a Tuesday, what is the probability that my next child will be a son? There the answer is 1/2. But that is not the problem at hand: here both children are already born. It is a different problem.

The answer is actually 13/27 = 0.481, not 1/2.

Let's accept that for now. If you Google you'll find a lot of proofs, some with tables and some using complicated Bayesian analysis. I'll get the result later, via simulation, but for now we'll assume it is correct. But the way to think about it is this: there are lots of ways that two children can be born on any of seven days, say Boy on Tuesday and Girl on Saturday, all with equal probability, and exactly 1/4 of them have two boys. But as I place constraints, such as Boy on Tuesday, many of the possibilities are eliminated and the probabilities change. For example, I can place a very tight constraint: I have two children, one is a son born Tuesday and the other is a son born Saturday. What is the probability I have two sons? Why it is one of course, because all arrangements except Boy on Tuesday and Boy on Saturday have been eliminated by the constraints.

Of course Tuesday really has nothing to do with it. What is relevant is that a boy is constrained to be born on one specific day—any day would do--this could just as easily be the Friday Child Puzzle. It is the limitation that one of the children is a boy born on one specific day out of seven that is relevant.

To see this, assume you asked:
I have two children. One is a boy born on a Monday, Tuesday, Wednesday, Thursday, Friday, Saturday or Sunday. What is the probability I have two boys?
Clearly this is the same as simply asking: I have two children, one is a boy, what is the probability that the other is a boy? Here the answer is clear—the possible arrangements given that we have at least one boy are: BB, BG, GB. They occur with equal probability, so the probability of BB (two boys) is 1/3.

We could generalize the puzzle this way:

I have two children. One is a boy born no later than day N where N is 1..7. What is the probability I have two boys?

Let's call that probability P(N).

So the original form of the puzzle is: what is P(1)? The answer we are accepting (for now) is not 1/2 but 13/27.

The second form of the puzzle, where the constraint is a boy born on any of the seven days, could be stated this way: what is P(7)? That we just demonstrated is 1/3.

What is the meaning of P(0)? This would mean that the first boy wasn't born on any day. This pathological case becomes, simply, what is the probability that the other child is a boy, which is our beloved 1/2.

Let's make a prediction. We have:

P(0) = 1/2 = 14/28
P(1) = 13/27
P(2) = 12/26
P(3) = 11/25
P(4) = 10/24
P(5) = 9/23
P(6) = 8/22
P(7) = 7/21 = 1/3

The values P(2) through P(6) are predictions based on an obvious pattern: each step subtracts one from both the numerator and the denominator. Let us remember what this means. P(1) is the probability of two boys given that one son is born on one specific day, say Tuesday. P(2) is the probability given that one son is born on one of two days, say Tuesday or Wednesday. And so on.
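The pattern can also be derived directly. What follows is a sketch, assuming each child is independently a boy or a girl with probability 1/2, is born on each day with probability 1/7, and that the condition is "at least one child is a boy born on one of the first N days." The symbols q and C_N are introduced here only for this derivation.

```latex
\begin{align*}
q &= \frac{N}{14}
  && \text{(a given child is a boy born in the first $N$ days)} \\
\Pr(C_N) &= 1 - (1 - q)^2 = \frac{N\,(28 - N)}{196}
  && \text{(at least one such child)} \\
\Pr(\text{two boys} \cap C_N) &= \frac{1}{4}\left[1 - \left(1 - \frac{N}{7}\right)^{2}\right]
  = \frac{N\,(14 - N)}{196} \\
P(N) &= \frac{\Pr(\text{two boys} \cap C_N)}{\Pr(C_N)} = \frac{14 - N}{28 - N}
\end{align*}
```

Setting N = 1 gives 13/27 and N = 7 gives 7/21 = 1/3. The formula also gives 14/28 at N = 0, although the conditioning event has probability zero there, so that value is only a limit.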

I wrote a Monte Carlo simulation for this problem and the results are:

Days   Prob of 2 boys
---------------------
P(1)       0.4811
P(2)       0.4615
P(3)       0.4399
P(4)       0.4167
P(5)       0.3911
P(6)       0.3631
P(7)       0.3336

These agree with the predicted fractions to within about 0.0005 (for example, 13/27 ≈ 0.4815 and 8/22 ≈ 0.3636). The zero row is not in the output because it is a pathological case and the code won't handle it.

Note that P(1), as advertised, is 13/27.

The probability (that the second child is a boy) drops smoothly from P(0) = 1/2 (when in effect there is no first child) to 1/3 when the first child is a boy, born any day.

The JAVA program is given below. 1

public class TuesdayChild {

    // constant defining a boy baby
    private static final int BOY = 0;

    // method that randomly selects a sex, 0 (boy) or 1 (girl)
    private static int randomSex() {
        return (int) (2 * Math.random());
    }

    // method that randomly selects a day, 0--6
    private static int randomDay() {
        return (int) (7 * Math.random());
    }

    // main method
    public static void main(String[] args) {

        // how many trials per iteration
        int numTrial = 10000000;

        // hold the results for each case. We will vary the
        // number of days one boy is constrained to be born on
        // from 1 (corresponding to the problem as stated) to 7
        double[] results = new double[7];

        // loop over the constraint days
        for (int numDays = 1; numDays <= 7; numDays++) {
            int passCount = 0;   // number of trials passing constraint
            int twoBoyCount = 0; // subset that have two boys

            // now loop over the trials
            for (int i = 0; i < numTrial; i++) {
                Trial trial = new Trial(numDays);
                if (trial.keepTrial()) {
                    passCount++;
                    if (trial.twoBoys()) {
                        twoBoyCount++;
                    }
                }
            }
            results[numDays - 1] = ((double) twoBoyCount) / passCount;
        }

        // now print the results
        System.out.println("Days Prob 2 boys");
        System.out.println("-------------------");
        for (int numDays = 1; numDays <= 7; numDays++) {
            System.out.println(String.format("%d %8.4f", numDays, results[numDays - 1]));
        }
    }

    // class for a single trial (static, so no outer instance is needed)
    static class Trial {

        // sex of child 1 and child 2
        final int sex1 = randomSex();
        final int sex2 = randomSex();

        // day of birth child 1 and child 2
        final int day1 = randomDay();
        final int day2 = randomDay();

        // this will determine how many days we constrain the birth
        // of one boy. It can be 1--7. At one, one boy will be constrained
        // to be born on one day, such as Tuesday. This is analogous to the
        // puzzle as stated. If it is two then one boy is constrained to be
        // born on two days, say Tuesday or Wednesday. If it is seven, one boy
        // is constrained to be born on any day. This is the same as simply saying
        // you have one boy. The answer for that case should be 1/3.
        private final int max;

        Trial(int max) {
            this.max = max;
        }

        // see if we keep this trial because at least one of the two children
        // was a boy born within the constrained number of days
        boolean keepTrial() {
            return ((sex1 == BOY) && (day1 < max)) || ((sex2 == BOY) && (day2 < max));
        }

        // see if this trial has two boys
        boolean twoBoys() {
            return (sex1 == BOY) && (sex2 == BOY);
        }
    }
}


UPDATE: 

Here is a tabular solution. We'll use the first letter of each day, except with R for Thursday and Q for Sunday. We also use first child - second child ordering: for example, BTGM means the first child was a boy born on Tuesday and the second was a girl born on Monday. We can easily enumerate all the possibilities with a boy on Tuesday:

BTBT
BTBQ  BTBM  BTBW  BTBR  BTBF  BTBS
BQBT  BMBT  BWBT  BRBT  BFBT  BSBT
BTGQ  BTGM  BTGT  BTGW  BTGR  BTGF  BTGS
GQBT  GMBT  GTBT  GWBT  GRBT  GFBT  GSBT

Each of these occurs with equal probability. 3 There are 27 total. Those with two boys are the first three rows. There are 13 of those. So the answer, again, is 13/27.

This problem would be quite different if it were stated:

I have two children. The first (oldest) was a boy born on a Tuesday. What is the probability I have two boys?
Now the enumeration is

BTBT
BTBQ  BTBM  BTBW  BTBR  BTBF  BTBS
BTGQ  BTGM  BTGT  BTGW  BTGR  BTGF  BTGS


And we get the intuitive result of 7/14, or 1/2.
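Both enumerations can be checked by brute force. Here is a minimal sketch, assuming all 196 combinations of (sex, day) for the two children are equally likely. The class name TuesdayChildExact, the method count, and the firstChildOnly flag are made up for this illustration, and Tuesday is arbitrarily labeled day 0.

```java
class TuesdayChildExact {

    private static final int BOY = 0;
    private static final int TUESDAY = 0; // arbitrary label for the one special day

    /**
     * Walks over all 2*7*2*7 = 196 equally likely (sex1, day1, sex2, day2) cases.
     * Returns {casesKept, casesKeptWithTwoBoys}.
     * If firstChildOnly is true, the Tuesday boy must be the first child;
     * otherwise either child may be the Tuesday boy.
     */
    static int[] count(boolean firstChildOnly) {
        int kept = 0;
        int twoBoys = 0;
        for (int sex1 = 0; sex1 < 2; sex1++) {
            for (int day1 = 0; day1 < 7; day1++) {
                for (int sex2 = 0; sex2 < 2; sex2++) {
                    for (int day2 = 0; day2 < 7; day2++) {
                        boolean first = sex1 == BOY && day1 == TUESDAY;
                        boolean second = sex2 == BOY && day2 == TUESDAY;
                        boolean keep = firstChildOnly ? first : (first || second);
                        if (!keep) {
                            continue;
                        }
                        kept++;
                        if (sex1 == BOY && sex2 == BOY) {
                            twoBoys++;
                        }
                    }
                }
            }
        }
        return new int[] {kept, twoBoys};
    }

    public static void main(String[] args) {
        int[] either = count(false);
        int[] first = count(true);
        System.out.println("either child: " + either[1] + "/" + either[0]); // 13/27
        System.out.println("first child:  " + first[1] + "/" + first[0]);   // 7/14
    }
}
```

The only difference between the two cases is the keep test: "first || second" keeps 27 arrangements, while "first" alone keeps 14.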

1 God programs in JAVA, and uses the blessed 2 K&R brace style, just as surely as Jesus speaks in early 17th century English. However, His use of comments and exception handling is purely anthropomorphic.
2 This is one of those special times when "blessed" is a two syllable word.
3 In physics terms the macrostate is "we have two children and at least one is a son born on Tuesday" and there are 27 corresponding microstates. So the entropy of our macrostate is S = k ln(27), where k is the Boltzmann constant.

6 comments:

  1. In 1889, Joseph Bertrand published a cautionary tale about how *NOT* to solve a probability problem like this. There are three boxes, each with two coins: one has two of gold, one has two of silver, and the last box has one of each kind. Pick a box at random. What is the probability that it has two of the same kind of coin? (This is supposed to be easy: 2/3.)

    Now suppose I open the chosen box, pull out a coin, and show you that it is gold. Does this affect the answer? Many people - including you from your answer above - would say it does. The box can't be the one with two silver coins, but could be either of the other two. Since only one of these boxes has two of the same kind of coin, the answer appears to change to 1/2.

    Bertrand then changed the second problem slightly. Suppose I pull a coin out, but don't show it to you. Does the answer still change? If the coin I pulled out is gold, the problem is the same as the one where it seemed to change to 1/2. If it is silver, similar logic says it also changes to 1/2. Since these are the only two possibilities, and the answer is 1/2 regardless of which, the answer to this question is also 1/2. (Note: a more formal solution uses conditional probability and the Law of Total Probability to get this result.)

    But that is a paradox: You have no information that would allow you to update the probability from 2/3 to 1/2. (Note: the name "Bertrand's Box Paradox" does not mean the problem itself, it means this paradox). So what seems to be true about the probability change, can't be.

    Bertrand's point was that you cannot assume you would get the information you have in every case where it is true. The correct solution recognizes that the probability I would pull a gold coin from a box with two gold coins is 100%, but it is 50% if the box has one of each. Using these values, the answer to the second question is 2/3, just like the first question, and there is no paradox.

    In 1959, Martin Gardner published the predecessor to your problem in Scientific American: "Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?" He originally said the answer was 1/3, BUT HE RETRACTED THAT ANSWER for the same reason found in Bertrand's Box Paradox. The answer could be 1/3 or 1/2, depending on how you learned the information. Since, unlike Bertrand, he gave no information about this, he said the problem was ambiguous. In my opinion, he is wrong - the ambiguity means you can only assume it was randomly obtained, and the answer is 1/2. But that isn't really important.

    In 2010, at a puzzle convention held in honor of Martin Gardner, Gary Foshee presented the problem you give above. It is less ambiguous than Gardner's earlier one, since Foshee told us that he picked one description from the (most likely) two that describe his children. And he DIShonored Gardner by overlooking how Gardner changed his answer.

    The answer to your question is 1/2. There is a 1/196 chance that you have two Tuesday Boys, and a 100% chance you would tell us about a Tuesday Boy if you do. There is a 12/196 chance that you have a Tuesday Boy and a non-Tuesday Boy, and a 14/196 chance that you have a Tuesday Boy and a girl; but in either case, only a 50% chance you would tell us about the Tuesday Boy. Ignoring the "196" since it divides out of all terms, the answer is (1+12/2)/(1+12/2+14/2)=(1+6)/(1+6+7)=1/2.

  2. JeffJo,

    Thanks for the comment.

    I disagree, I think the problem as stated is unambiguous. For a simple analysis, see Jason Rosenhouse (a mathematician who wrote a book on the Monty Hall problem).

    http://scienceblogs.com/evolutionblog/2011/11/08/the-tuesday-birthday-problem/

    I have updated the post with a table-based discussion.

    Out of curiosity, where do you think the simulation, which confirms the 13/27 answer, went wrong?

  3. As proved by the Bertrand Box Paradox - and btw, I got my interpretation of it from Jason Rosenhouse's book starting on page 24 - having two children including a boy who was born on a Tuesday, and telling somebody that you have two children including a boy who was born on a Tuesday, ARE NOT THE SAME EVENT.

    I can't stress that enough. You are equating them in your solution, but your problem statement does not.

    As Rosenhouse explains, you are making the exact same mistake as the people who think the answer to the Monty Hall Problem is 1/2. They think that because Door #3 has a goat, that they should "eliminate" all possibilities where it has the car, and "keep" all where it has a goat. In EXACTLY THE SAME MANNER, you suggest that we should "eliminate" all possibilities where you don't have a Tuesday boy, and "keep" all where you do.

    Your table, and your simulation, are wrong because they simply count the cases, "keeping" those with a Tuesday Boy and "eliminating" those without one. Make a similar table, or simulation, for Monty Hall, and the answer you get will be 1/2.

    If I had asked you "does either of your two children happen to be a boy who was born on a Tuesday?", it would be correct to "keep" all cases where you have a Tuesday Boy. Just like, on the game show, if we had asked Monty Hall to open door #3 for us, 1/2 would be correct.

    But when you have the freedom to mention a Girl born on a Thursday even if her brother was born on a Tuesday, then we can't simply count cases. We can only consider both descriptions as being equally likely to be passed along. Just like the game show contestant must consider the possibility that door #2 could have been opened.

    Replies
    1. Are you saying that these three formulations of the problem are not equivalent?

      1) Mr. Smith has two children, and one of them is a boy, but also that the boy was born on Tuesday (Wikipedia)

      2) You meet a man on the street and he says, “I have two children and one is a son born on a Tuesday.” What is the probability that the other child is also a son? (Jason Rosenhouse)

      3) I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?

      All three claim the same answer of 13/27. I assume you think at least Jason is correct, and probably wikipedia, so how do you view my wording as significant? Is not my wording the same as Jason's, except I am the man on the street?

      This is interesting, I actually hope I am wrong because the problem will be even more fascinating.

  4. There are a couple of horrible wording choices in these versions. I'll get there, but I want to add a fourth one:

    4) You meet a man on the street and he says, “I have two children." You ask "Is at least one of them a son born on a Tuesday?” He says yes. What is the probability that both are boys?

    The answer to #2 and #3 is 1/2. I have explained why: You can't conclude that a person with a Tuesday Boy, and another child who is not a Tuesday Boy, would always tell you about the Tuesday Boy and never tell you about the other child. You can conclude this in #4, which is why its answer is 13/27.

    As Martin Gardner pointed out, #1 is ambiguous and could be 13/27 or 1/2. But the **ONLY** scenario that would make it 13/27 is version #4, which is completely unreasonable from the wording of #1.

    +++++

    Horrible wording choice #1: If "one is XXX" can be interpreted as "at least one, but maybe two, are XXX", then "I have two" can also be interpreted as "I have at least two, but maybe more." Every version of the problem should say "at least one." Note that Gardner's (And Tanya's, if I recall correctly, after she once asked me to maybe publish some analysis with her) does, but Rosenhouse's does not.

    Horrible wording choice #2: Asking about "the other child" implies that you have identified which child is the one you are talking about. The answer is the same as the versions where you know about the older child. The point is that how identification is accomplished is irrelevant, just the fact that one is. "I have two children, and the one who sits to Mother's right at the dinner table is a boy" makes it a 50% chance that the other child is a boy.

    Merely counting cases is never the correct way to solve a probability problem. It can get the right answer when every case has the same chance, but getting the right answer does not mean the solution is correct. And assuming it is correct can get the wrong answer.

    The Monty Hall Problem is a good example: there are three doors where the car could be, one is eliminated, so the answer is that each door has the same chance. Since they must add up to 100%, each is 50%, right? Wrong. There are effectively six cases: Two where the car is behind Door #2 and Monty opens Door #3, two where the car is behind Door #3 and Monty opens Door #2, one where the car is behind door #1 (contestant's door) and Monty opens Door #3, and one where the car is behind door #1 and Monty opens Door #2. Eliminate the ones where Monty opens #2 (since we saw him open #3), and the answer is 2/3.

    This solution is more properly expressed by using a 0% probability he would open #3 if the car is behind #3, a 100% chance if it is behind #2, and a 50% chance if it is behind #1. So the odds are 100%:50%, or 2:1, or a 2/3 probability.

    Any version of the Two Child Problem can only be solved by assessing the probability that we would know about a boy (with property XXX, like "born on Tuesday") in a family with a boy having that property and another child who is not a boy with that property. By merely counting cases, you are assuming that probability is 100%, and you cannot assume that.

    Look up Tanya Khovanova's blog at http://blog.tanyakhovanova.com/2011/05/a-son-named-luigi/

  5. Let me try to emphasize the issue here, with the following generic problem: You meet a man who tells you that he has two children. He then provides additional information that is partially dependent on the gender combination of his two children, but was freely offered. What is the probability that he has a boy and a girl?

    First, notice that I reversed the question to be about two different genders, not two of the same. A lot of people have difficulty accepting that "are both the same?" and "are both boys?" is the same question if what you were told is that there is a boy. But the reversed question is not so difficult, and its answer is always one minus the answer for the more popular version.

    Let K represent the knowledge you have at this point. And use B0, B1, and B2 to represent the cases where the man has zero, one, or two boys, respectively. Note that without the information, Pr(B0)=Pr(B2)=1/4, and Pr(B1)=1/2. The solution is a fairly trivial application of Bayes' Theorem:

    Pr(B1|K) = Pr(K|B1)*Pr(B1)/Pr(K) = Pr(K|B1)/[2*Pr(K)]

    And note that the law of Total Probability says:

    Pr(K) = Pr(K|B0)*Pr(B0)+Pr(K|B1)*Pr(B1)+Pr(K|B2)*Pr(B2)
    Pr(K) = [Pr(K|B0)+2*Pr(K|B1)+Pr(K|B2)]/4.

    So

    Pr(B1|K) = 2*Pr(K|B1)/[Pr(K|B0)+2*Pr(K|B1)+Pr(K|B2)]

    So the answer is entirely dependent on the event K, and how it can occur in the various family types. The confusing part, is that K represents the event where you acquire this information, not the event where it is true. The difference is that you can acquire different information even when it is true.

    The importance of this difference is evident in the Monty Hall Problem. If the knowledge K that you have there is the event where a goat is behind the door Monty Hall opened, then the answer is 1/2, because there is a goat behind that door in two out of every three games. In one out of those two, the car is behind your original door, and in the other one it is behind the door you'd switch to.

    But the actual information you have is that there is a goat behind that door AND Monty Hall chose it. Say he chooses a door by flipping a coin; he uses it if he needs to, but still flips it if he doesn't so that you don't know he doesn't need it. So now there are six cases, not three. In two of them, your door has a goat and he opens the door he did. In one of them, your door has the car and he opens the door he did. In the other three, he opens a different door. This way, we see that the chances are 1/3 if you stay, and 2/3 if you switch.

    My point is that "K is true" is not necessarily the same event as "you know K." If there are equivalent alternative options - like Monty Hall opening Door #2 instead of Door #3 - you have to assume both could have happened with the same probability, so "you know K" is more restrictive than "K is true."

    In my problem, if K is "I have at least one boy," then the conditional probabilities that K is true are Pr(K|B0)=0, Pr(K|B1)=Pr(K|B2)=1, and the answer is 2/3 (remember, I reversed the question). And if K is "I have at least one girl," the answer is also 2/3. But then, if K is "I'm going to tell you the gender of at least one tomorrow," and the answer is 2/3 regardless of what that gender turns out to be, the answer changes to 2/3 today. This is Bertrand's Box Paradox.

    But there is an alternative to "I have at least one boy" in case B1, namely "I have at least one girl." If we use Pr(K|B1)=1/2, then the answer is 1/2, and the paradox goes away.

    +++++

    You can object to Pr(K|B1)=1/2 in your version #1, because the version is ambiguous. We have no idea how we learned K. But that means no answer is possible, which is what Martin Gardner concluded for that very question.

    In versions #2 and #3, the parent clearly chose information from the set available. Just like Monty Hall clearly chose between whatever goat doors were available. You have to assume Pr(K|B1)=1/2.

    It is only in version #4 that Pr(K|B1)=1.

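The disagreement in the comments comes down to how the statement about the Tuesday boy is obtained, and the two readings can be simulated side by side. This is a minimal sketch, not a verdict on which reading is correct. In the first protocol every family with at least one Tuesday boy is kept (version #4, and what the program in the post does). In the second, the parent describes one child picked by a fair coin flip, and a trial is kept only if that description happens to be "boy born on Tuesday." The coin flip is an assumed model of JeffJo's reading of versions #2 and #3. The class name TuesdayChildProtocols, the method estimate, and its parameters are made up for this illustration.

```java
import java.util.Random;

class TuesdayChildProtocols {

    private static final int BOY = 0;
    private static final int TUESDAY = 0; // arbitrary label for the special day

    /**
     * Estimates P(two boys | we hear "a boy born on Tuesday").
     * parentChooses == false: keep every family with at least one Tuesday boy.
     * parentChooses == true:  the parent describes one child chosen by a fair
     *                         coin flip; keep the trial only if that child is
     *                         a Tuesday boy (an assumed model).
     */
    static double estimate(boolean parentChooses, int numTrials, long seed) {
        Random rng = new Random(seed);
        long kept = 0;
        long twoBoys = 0;
        for (int i = 0; i < numTrials; i++) {
            int sex1 = rng.nextInt(2);
            int day1 = rng.nextInt(7);
            int sex2 = rng.nextInt(2);
            int day2 = rng.nextInt(7);
            boolean first = sex1 == BOY && day1 == TUESDAY;
            boolean second = sex2 == BOY && day2 == TUESDAY;
            boolean keep;
            if (parentChooses) {
                // the coin decides which child gets described
                keep = rng.nextBoolean() ? first : second;
            } else {
                keep = first || second;
            }
            if (keep) {
                kept++;
                if (sex1 == BOY && sex2 == BOY) {
                    twoBoys++;
                }
            }
        }
        return (double) twoBoys / kept;
    }

    public static void main(String[] args) {
        int n = 5_000_000;
        System.out.printf("asked, count all cases: %.4f%n", estimate(false, n, 1L));
        System.out.printf("parent picks a child:   %.4f%n", estimate(true, n, 2L));
    }
}
```

With a few million trials the first estimate should land near 13/27 and the second near 1/2. The simulation in the post and the weighted count in the first comment are therefore both right about the protocol each one assumes.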