## Binomial Probability Distribution and the Battle of Gettysburg

This post shows how I use my Battle of Gettysburg Trivia Quiz to help teach the binomial probability distribution. My brother Roger and I are Civil War buffs, and we recently paid another of our many visits to the Gettysburg Military Park Battlefield in Gettysburg, Pennsylvania. On each visit to a Civil War battlefield, I’m rewarded with another gold nugget of information about a particular battle that relatively few people know. After several visits to the Gettysburg battlefield, I created a Gettysburg Trivia Quiz to demonstrate the binomial probability distribution in my statistics classes at the junior college where I have taught a variety of math courses over the years. I consider the binomial probability distribution to be the most important discrete probability distribution in statistics. (The normal probability distribution is the most important probability distribution in all of statistics.)

You can download the Gettysburg Trivia Quiz by clicking the links below. I believe you’ll find it fun, interesting and helpful to take the quiz before continuing.

Gettysburg Trivia Quiz (student version)

Gettysburg Trivia Quiz (teacher version)

You can also download the quiz on our free instructional content page under the statistics tab.

The features of the quiz and how I administer the quiz are as follows:

• The quiz has 20 multiple choice questions.
• Each question has 5 choices.
• Each question is designed so that no normal person would have any idea of what the correct answer is. Hence a normal test taker can only guess at the correct answer, and therefore has a 0.2 probability of answering any question correctly.
• All questions are independent of each other in that knowledge of any one question can’t help answer any other quiz question.
• On the day of the lesson, I announce that the class will be taking a 20-question multiple choice trivia quiz on the Battle of Gettysburg.
• I distribute a copy of the quiz to each student, and tell them there is no reason to take a peek at a neighbor’s answer because their quiz score has no influence on their course grade.
• I tell them that the intent of the quiz is to eventually help them better understand the binomial probability distribution. Most likely there will be no high scores.
• I allow 5-10 minutes to take the quiz. (You will observe intense expressions on some faces.)
• I verbally provide the correct answers so that each student can grade his/her own quiz. I ask students to be honest when grading their quiz. There should be no surprise if almost all students have a very low score.
• As each student reports his/her quiz score, I record the score on the chalkboard.
• I then begin a discussion of the binomial probability distribution, introduced previously, and how the binomial distribution can be used to predict the probability distribution of scores from the Gettysburg Trivia Quiz. The class will see that the expected average or mean score is 4, and an unusually high score is 8 or more. A score of 8 is more than 2 standard deviations above the expected mean score of 4, where the expected standard deviation of the scores equals 1.789.
• Using our calculators, we find that the probability of a score of 0 is 0.0115, the probability of a score of 4 is 0.218, and the probability of a score of 8 or more is 0.0321.
• We then compare the experimental binomial results from the quiz with the expected binomial results. Especially with a larger class, students are amazed at how the expected and experimental results agree.

My free handout, Summary of Common Probability Distributions, describes the key properties and some applications of the probability distributions found in lower level college statistics courses. You can also download the handout on our free instructional content page under the statistics tab. Teachers can copy parts or all of the handout to share with students. Here is a general description of the key components of a binomial probability distribution:

• A Bernoulli trial is any experiment that has exactly two possible outcomes. Examples: 1) Tossing a single coin has outcomes of head or tail. 2) Answering a multiple choice test question has outcomes of correct answer or incorrect answer. 3) Consider the chances of a person living to the retirement age of 65. That person either dies before age 65 or lives to age 65 or beyond.
• The two possible outcomes of a Bernoulli trial are called ‘success’ and ‘failure’. The terms success and failure are just labels that represent the two possible outcomes of a Bernoulli trial.
• Each outcome of a Bernoulli trial has no influence on subsequent Bernoulli trials. In other words, Bernoulli trials are independent events.
• The random variable x of a binomial population equals the number of successes found in n Bernoulli trials repeated under identical conditions. Therefore x could equal any of the integers that range from 0 to n.
• The probability of success on each trial is always the same and is denoted by p. The probability of failure on each trial is denoted by q. The laws of probability tell us that p + q = 1 or q = 1 – p.
• The expected average or mean of random variable x equals μ = np.
• The expected standard deviation of random variable x equals σ = √(n*p*q).
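• The formula for the probability of exactly x successes in n trials is P(x) = nCx * p^x * q^(n-x), where nCx = n! / (x! * (n-x)!) is the number of ways to choose which x of the n trials are successes.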
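The three formulas above translate almost directly into code. What follows is only a minimal Python sketch, not part of the handout: the function names are made up for illustration, and the example of 10 tosses of a fair coin (n = 10, p = 0.5) is an assumption chosen to match the coin-toss Bernoulli trial mentioned above.

```python
from math import comb, sqrt

def binomial_pmf(n, p, x):
    """Probability of exactly x successes in n Bernoulli trials: nCx * p^x * q^(n-x)."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def binomial_mean(n, p):
    """Expected mean: mu = n * p."""
    return n * p

def binomial_sd(n, p):
    """Expected standard deviation: sigma = sqrt(n * p * q)."""
    return sqrt(n * p * (1 - p))

# Illustrative example (an assumption, not from the quiz): 10 tosses of a fair coin.
n, p = 10, 0.5
print(f"mean  = {binomial_mean(n, p):.3f}")
print(f"sd    = {binomial_sd(n, p):.3f}")
print(f"P(x = 5 heads) = {binomial_pmf(n, p, 5):.4f}")

# Sanity check: the probabilities for x = 0..n should add up to 1.
print(f"sum of P(x) = {sum(binomial_pmf(n, p, x) for x in range(n + 1)):.4f}")
```

Calling the same functions with n = 20 and p = 0.2 should reproduce the quiz values quoted earlier in this post.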

Now let’s see how the binomial probability distribution can be used to predict the key statistical properties of the sample of quiz scores. Note that the number of test takers has no influence on the expected values for μ and σ below. As the number of test takers increases, we can expect increased agreement between expected and experimental results.

• Each quiz question is a Bernoulli trial; the answer is correct or incorrect.
• The trivia quiz is composed of 20 independent Bernoulli trials with p = 0.2 and q = 0.8 because we are assuming that normal students don’t know any of the details of the battle of Gettysburg.
• The number of successive independent Bernoulli trials equals n = 20.
• Let the random variable x equal the number of correct answers reported by test takers.
• The expected mean or expected class average score μ = np = 20*0.2 = 4 correct answers.
• The expected standard deviation of quiz scores σ = √(npq) = √(20*0.2*0.8) = 1.789.
• An unusually high score is one that exceeds μ + 2σ = 4 + 2*1.789 = 7.58. Because scores are whole numbers, that means a score of 8 or more.
• Calculate the sample mean x-bar and sample standard deviation s of the reported scores, and then compare these values with μ and σ.
• P(x = 0) = 0.0115, P(x = 4) = 0.218, and P(x ≥ 8) = 0.0321. Compare these probabilities to the experimental probabilities derived from the reported quiz scores. Example: What is the experimental probability that a student got a score of 4?
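The comparison described in the list above can be sketched in a few lines of Python. This is a rough sketch under stated assumptions: the list of scores is HYPOTHETICAL, made up purely to show the mechanics, and should be replaced with the scores your own class reports. The variable and function names are also just illustrative.

```python
from math import comb, sqrt
from statistics import mean, stdev

N_QUESTIONS, P_CORRECT = 20, 0.2

def expected_prob(x, n=N_QUESTIONS, p=P_CORRECT):
    """Expected binomial probability of exactly x correct answers."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Expected values from the binomial distribution.
mu = N_QUESTIONS * P_CORRECT
sigma = sqrt(N_QUESTIONS * P_CORRECT * (1 - P_CORRECT))
p_8_or_more = sum(expected_prob(x) for x in range(8, N_QUESTIONS + 1))

# HYPOTHETICAL scores for illustration only; replace with the reported class scores.
scores = [3, 5, 4, 2, 6, 4, 1, 4, 7, 3, 5, 4]

x_bar = mean(scores)   # sample mean
s = stdev(scores)      # sample standard deviation (n - 1 in the denominator)
experimental_p4 = scores.count(4) / len(scores)

print(f"expected mean {mu:.3f}  vs  sample mean {x_bar:.3f}")
print(f"expected sd   {sigma:.3f}  vs  sample sd   {s:.3f}")
print(f"expected P(x = 4) {expected_prob(4):.4f}  vs  experimental {experimental_p4:.4f}")
print(f"expected P(x = 0) {expected_prob(0):.4f}, expected P(x >= 8) {p_8_or_more:.4f}")
```

With only a dozen scores the experimental values can sit well away from the expected ones, which is the point made above about larger classes agreeing more closely.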

Based on the Gettysburg Trivia Quiz, here are the results of two probability simulations of a binomial distribution where the number of Bernoulli trials n = 20 and the probability of success on each trial p = 0.2. The width of each histogram bar is 1 unit, so both the height and the area of each blue bar equal P(x), the expected probability of a student having exactly x successes in 20 Bernoulli trials. Likewise, the height and the area of each red bar equal the experimental probability of a student having exactly x successes in 20 Bernoulli trials. The first graph shows the results of 50 students taking the trivia quiz, and the second graph shows the results of 50,000 students taking the quiz. Every simulation run on any number of quiz takers will give us different experimental results, but the same expected results.
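If you would like to try a similar experiment yourself, here is a minimal Python sketch of the idea. It is not the Probability Simulations software used for the graphs; it relies only on Python's standard `random` module, the function names are invented for illustration, and the seed is an arbitrary choice so that a run can be repeated.

```python
import random
from collections import Counter
from math import comb

def simulate_scores(num_takers, n=20, p=0.2, rng=None):
    """Simulate quiz scores: each taker guesses on n questions with success probability p."""
    if rng is None:
        rng = random.Random()
    return [sum(rng.random() < p for _ in range(n)) for _ in range(num_takers)]

def experimental_distribution(scores, n=20):
    """Relative frequency of each score 0..n (the heights of the red bars)."""
    counts = Counter(scores)
    return [counts[x] / len(scores) for x in range(n + 1)]

def expected_distribution(n=20, p=0.2):
    """Binomial probability of each score 0..n (the heights of the blue bars)."""
    return [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

rng = random.Random(1863)  # arbitrary seed; change or remove it to get a different run
expected = expected_distribution()

for num_takers in (50, 50_000):
    observed = experimental_distribution(simulate_scores(num_takers, rng=rng))
    max_gap = max(abs(o - e) for o, e in zip(observed, expected))
    print(f"{num_takers:>6} test takers: largest |experimental - expected| = {max_gap:.4f}")
```

The largest gap between the experimental and expected probabilities should typically be much smaller for 50,000 test takers than for 50, which mirrors the difference between the two graphs.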

Using the same simulation parameters for the two graphs above, the graph below shows the cumulative results after a run of 30 simulations in manual mode where the user is required to press the <R> key to run the next simulation. The output of each simulation in manual mode shows us how well a test taker did on the quiz. The graph below includes a string of S’s and F’s that indicate successes and failures for test taker number 30 in 20 Bernoulli trials. The probability simulation software used to create the histograms allows me to find expected and experimental probabilities of events by just moving the mouse cursor over a histogram bar, and then shift-left clicking the mouse. The software also makes it easy to find probabilities such as P(x ≤ 5) or P(x ≥ 2).
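Both ideas in the paragraph above, the string of S’s and F’s for a single test taker and cumulative probabilities such as P(x ≤ 5) or P(x ≥ 2), can be approximated in plain Python. This is only a sketch and not how the Probability Simulations software works internally; the function names and the seed are assumptions made for illustration.

```python
import random
from math import comb

def quiz_attempt(n=20, p=0.2, rng=random):
    """Return a string of S's and F's for one test taker guessing on n questions."""
    return "".join("S" if rng.random() < p else "F" for _ in range(n))

def prob_between(lo, hi, n=20, p=0.2):
    """Expected probability that the number of successes x satisfies lo <= x <= hi."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(lo, hi + 1))

# One simulated test taker (the seed is arbitrary and only makes the run repeatable).
attempt = quiz_attempt(rng=random.Random(30))
print(attempt, "score =", attempt.count("S"))

# Cumulative probabilities for the 20-question quiz with p = 0.2.
print(f"P(x <= 5) = {prob_between(0, 5):.4f}")
print(f"P(x >= 2) = {prob_between(2, 20):.4f}")
```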

My Probability Simulations software was used to create the graphics in this post. I will be releasing a full and free version of this software in the near future at the Math Teacher’s Resource website. The software release will be announced in a future post. Stay tuned!

I will close this post by telling you a true story about Larry (not his real name) who was in a general education high school math class that I taught approximately 20 years ago. Larry had very little mathematical ability. Larry could learn how to solve a specific type of math problem when the problem was presented in a specific format. If the presentation of a math problem was modified in any way, Larry was lost.

Like me, Larry was a Civil War buff. Many of his classmates thought Larry was a history genius because he could spout a wide variety of American history facts. Larry could tell you that Private Hugh White was the lone British sentry on guard duty near the British Customs House on Monday, March 5th, 1770, the night of the Boston Massacre. I loved to share Civil War stories with Larry. He liked to ask trivia questions to stump me. It turns out that Larry, his father and I also had the same barber, Joe. About 6 years after Larry graduated, I asked Joe to have Larry take my Gettysburg Trivia Quiz the next time he got a haircut. I gave Joe the answer key so that he could grade the quiz while Larry was still in the shop. I told Joe that if Larry scored 19/20 or 20/20 on the quiz, I would pay for his haircut. Larry scored 19/20! The haircut and tip were on me. Joe told me how excited Larry became, and how proud Larry’s father was. I guess our brains are wired differently.