Subjective, Empirical, and Computational Probability for Coin Toss Experiment

This experiment is designed to demonstrate the difference between subjective, empirical, and computational probabilities. Each run of the simulation flips a coin ten times and records the number of heads obtained. The simulation was run sixty times and the results were entered into an Excel spreadsheet for analysis. First, subjective probabilities were generated from educated guesswork. Then, empirical probabilities were generated based on the results of the simulations. Finally, computational probabilities were calculated using the binomial distribution formula. Both the subjective and the empirical probabilities were compared with the computational probabilities in order to emphasize the differences.
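
The report does not say which simulation tool was used, so the following Python sketch is only an illustration of the same procedure: sixty runs of ten fair coin flips each, recording the number of heads per run. The function name simulate_heads and its parameters are assumptions made for this example, and a seeded run will not reproduce the data in Appendix B.

```python
import random

def simulate_heads(n_simulations=60, n_tosses=10, p_heads=0.5, seed=None):
    """Return the number of heads seen in each run of n_tosses coin flips."""
    rng = random.Random(seed)  # private generator so a given seed is repeatable
    return [
        sum(rng.random() < p_heads for _ in range(n_tosses))
        for _ in range(n_simulations)
    ]

heads_per_simulation = simulate_heads(seed=1)
print(heads_per_simulation)
print("total heads:", sum(heads_per_simulation), "out of", 60 * 10, "tosses")
```

Each entry of the returned list plays the role of one row of the spreadsheet: a whole number of heads between 0 and 10.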

The first step is to assign subjective probabilities to the eleven possible results (zero through ten heads). Since five heads should occur more frequently than any other result, its probability should be significantly greater than the uniform average of 1/11, or about 9.1%. The value assigned to five heads is 22%. Since the binomial distribution for a fair coin is symmetric, the probability of getting three heads will be the same as the probability of getting seven heads. Knowing that the number of heads must be between zero and ten, and that the sum of all the percentages must equal exactly 100%, these two results were each assigned a subjective probability of 12%. Again by symmetry, the probability of getting no heads will be the same as the probability of getting all heads. The subjective probability assigned to each of these results was 1%.

From this point the remainder of the subjective probability chart was filled out knowing that the distribution had to be symmetric, each value had to be between 0 and 1, and the sum of all the probabilities had to be exactly 1. After a little bit of experimentation the following table was generated:

X = Heads    0     1     2     3     4     5     6     7     8     9     10
Prob (X)     0.01  0.02  0.05  0.12  0.19  0.22  0.19  0.12  0.05  0.02  0.01

Examination by eye confirms that the distribution is symmetric and each value is within the allowed range. For the final requirement it is readily confirmed that

2 (0.01 + 0.02 + 0.05 + 0.12 + 0.19) + 0.22 = 1.00

and an acceptable subjective probability distribution has been generated.
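
The three requirements (every value between 0 and 1, symmetry, and a total of exactly 1) can also be checked mechanically. The short Python sketch below does this for the subjective table; the function name and the tolerance are choices made for this example, not part of the original spreadsheet.

```python
import math

# Subjective probabilities for 0 through 10 heads, copied from the table above.
subjective = [0.01, 0.02, 0.05, 0.12, 0.19, 0.22, 0.19, 0.12, 0.05, 0.02, 0.01]

def is_valid_symmetric_distribution(probs, tol=1e-9):
    """Check range, mirror symmetry, and that the probabilities sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    symmetric = all(
        math.isclose(a, b, abs_tol=tol) for a, b in zip(probs, reversed(probs))
    )
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and symmetric and sums_to_one

print(is_valid_symmetric_distribution(subjective))  # prints True
```

The small tolerance is there because decimal values such as 0.01 are not stored exactly in floating point, so the sum may differ from 1 by a tiny rounding error.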

Assuming that a fair coin is being used, the probability of getting heads on any toss is 0.50, as is the probability of getting tails. This means that over an extended period of time half of the tosses are expected to be heads and half are expected to be tails. As a result, for a total of 600 tosses the expected number of heads is 300. Also, since the probabilities of success (heads) and failure (tails) are equal, the probability of getting three heads is exactly equal to the probability of getting three tails. Finally, the probability of getting three heads is exactly equal to the probability of getting seven tails, because these are the same event: in a binomial experiment the only possibilities are heads or tails, so if exactly three heads are generated then the other seven tosses must have resulted in tails.
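
These statements follow from the binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k) with n = 10 and p = 0.5. As a rough check, the Python sketch below evaluates the formula directly; the function name binomial_pmf is chosen for this example and is not the spreadsheet function used in the report.

```python
from math import comb

def binomial_pmf(k, n=10, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Three heads and seven heads (that is, three tails) are equally likely.
print(binomial_pmf(3), binomial_pmf(7))

# Expected number of heads in 600 fair tosses.
expected_heads = 600 * 0.5
print(expected_heads)

# The eleven probabilities account for every possible outcome.
print(sum(binomial_pmf(k) for k in range(11)))
```

With p = 0.5 every term reduces to C(10, k) / 1024, which is why the distribution is symmetric about five heads.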

The first graph presented for discussion is a line graph showing the evolution of cumulative heads percentage over all 60 simulations. It would be expected that this cumulative percentage would approach the success probability over a long enough timeline. Since the probability of success in this case (assuming a fair coin) is 0.50 the line graph should approach this value with decreasing fluctuations as the number of simulations increases. This line graph is presented at the top of the following page:

The sharp jump at the beginning simply represents an abundance of heads in the first few simulations. Around the 25th simulation the cumulative percentage returned to its expected value of 0.50 and then dipped below it, showing an overall abundance of tails. As expected, the line continued to fluctuate around 0.50, with the size of the fluctuations decreasing as the total number of simulations increased.
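
The values plotted in the line graph are running totals of heads divided by running totals of tosses. A minimal Python sketch of that calculation, using only the first ten simulations from Appendix B as input (the function name is an assumption for this example), might look like this:

```python
from itertools import accumulate

# Heads obtained in the first ten simulations, taken from Appendix B.
heads_per_simulation = [5, 7, 5, 6, 4, 7, 3, 4, 6, 8]
TOSSES_PER_SIMULATION = 10

def cumulative_fraction_heads(heads, tosses_per_simulation=TOSSES_PER_SIMULATION):
    """Running fraction of heads after each simulation."""
    running_totals = accumulate(heads)
    return [
        total / (tosses_per_simulation * i)
        for i, total in enumerate(running_totals, start=1)
    ]

for i, fraction in enumerate(cumulative_fraction_heads(heads_per_simulation), start=1):
    print(f"{i:2d}  {fraction:.4f}")
```

Values above 0.50 indicate more heads than tails so far, and values below 0.50 indicate more tails than heads.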

The second graph is a comparison of the subjective probabilities to the computational probabilities calculated. The subjective probabilities can be thought of as a gut instinct while the computational probabilities are exact based on the binomial distribution formula. This graph is presented at the top of the following page:

The subjective probabilities are shown in blue and the computational probabilities are shown in red. For this particular subjective probability distribution the central values (four, five, and six heads) were underestimated while the outer values were overestimated. Notice that both distributions are symmetric, as would be expected. The main error in the subjective distribution appears to be an underestimate of how quickly the probability decreases as the number of successes moves away in either direction from the expected value of five. As a consequence, the standard deviation of the subjective distribution is larger than that of the computational distribution.
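
The comparison of spreads can be made concrete by computing the mean and standard deviation of each distribution. The Python sketch below is one way to do so; it assumes a fair coin for the computational values, and the helper name mean_and_sd is chosen for this example.

```python
from math import comb, sqrt

# Subjective probabilities for 0 through 10 heads, from the table in the text.
subjective = [0.01, 0.02, 0.05, 0.12, 0.19, 0.22, 0.19, 0.12, 0.05, 0.02, 0.01]
# Computational probabilities for a fair coin: C(10, k) / 2^10.
computational = [comb(10, k) / 2**10 for k in range(11)]

def mean_and_sd(probs):
    """Mean and standard deviation of a distribution over 0, 1, ..., len(probs) - 1."""
    mean = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, sqrt(variance)

for name, probs in (("subjective", subjective), ("computational", computational)):
    mean, sd = mean_and_sd(probs)
    print(f"{name:13s} mean = {mean:.3f}  sd = {sd:.3f}")
```

Both means come out at five heads, while the subjective standard deviation is about 1.84 against about 1.58 for the binomial distribution.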

The final graph is a comparison of the empirical probabilities to the computational probabilities. The empirical probabilities are determined from the sixty simulations: empirical probability distributions are expected to approach the computational distribution as the number of trials increases. Again this graph is presented at the top of the following page:

Interestingly, the empirical distribution also tends to fall below the computational distribution for the central values and above it for most of the outer values. Notice that the empirical distribution is no longer symmetric, and there is no reason to expect that it would be after only sixty simulations. Sixty simulations do appear to be enough for the empirical distribution to be recognizably similar to the computational distribution. In other words, the empirical distribution is a reasonable approximation but is not exact.
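
The empirical probabilities are simply the observed counts divided by the number of simulations. The Python sketch below recomputes them from the "Actual Successes" column of Table 2 and lists the difference from the fair-coin binomial values; the variable names are chosen for this example.

```python
from math import comb

# Number of simulations that produced 0, 1, ..., 10 heads (Table 2).
observed_counts = [1, 0, 5, 7, 10, 13, 11, 8, 4, 1, 0]
n_simulations = sum(observed_counts)

empirical = [count / n_simulations for count in observed_counts]
computational = [comb(10, k) / 2**10 for k in range(11)]

for k, (e, c) in enumerate(zip(empirical, computational)):
    print(f"{k:2d}  empirical = {e:.3f}  computational = {c:.3f}  difference = {e - c:+.3f}")
```

A negative difference means the simulations produced that result less often than the binomial formula predicts, and a positive difference means more often.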

In conclusion, this experiment works well for showing the differences between the three types of probability distributions. It is simple to execute, and the fact that the probabilities of success and failure are equal makes the subjective distribution easier to analyze. All three distributions are relatively similar, and all of the results were as expected.

APPENDIX A

The first thing that I learned was that there is actually some reasoning that goes into determining a subjective distribution. Before this experiment I felt it was a mostly useless exercise, especially if the computational distribution was readily available. Now I realize that with a bit of logic it is possible, and actually not too difficult, to generate a reasonable subjective distribution. The advantage of this is that in situations where the computational distribution is more difficult to obtain, a ballpark estimate can be reached with relative ease.

The next thing that I gained from this project is a better interpretation of the line graph for cumulative percentage of heads. I knew that it would approach the expected value, but I never really thought about what the fluctuations meant. Now I understand that when the cumulative percentage is greater than the expected value there is an abundance of successes, while when it is less than the expected value there is an abundance of failures. The last thing that I learned was that a relatively small number of simulations can generate a reasonable empirical distribution. With eleven different possible outcomes I would have thought that considerably more than sixty simulations would have been needed.

On a slightly different note, I also learned some new Excel functions. The most interesting one to me was the function that generates the exact probabilities for a binomial distribution. All the other functions I used were familiar to me, but it was good practice to use them again. I feel that this project helped me learn and reaffirm quite a bit of knowledge related to probability distributions and the use of Excel to analyze them.

APPENDIX B

Simulation Number  Number of Heads  Cumulative Number of Heads  Total Coins Tossed  Cumulative Percent Heads
1 5 5 10 0.5000
2 7 12 20 0.6000
3 5 17 30 0.5667
4 6 23 40 0.5750
5 4 27 50 0.5400
6 7 34 60 0.5667
7 3 37 70 0.5286
8 4 41 80 0.5125
9 6 47 90 0.5222
10 8 55 100 0.5500
11 6 61 110 0.5545
12 5 66 120 0.5500
13 6 72 130 0.5538
14 5 77 140 0.5500
15 3 80 150 0.5333
16 3 83 160 0.5188
17 7 90 170 0.5294
18 5 95 180 0.5278
19 6 101 190 0.5316
20 2 103 200 0.5150
21 6 109 210 0.5190
22 5 114 220 0.5182
23 3 117 230 0.5087
24 2 119 240 0.4958
25 6 125 250 0.5000
26 2 127 260 0.4885
27 8 135 270 0.5000
28 4 139 280 0.4964
29 3 142 290 0.4897
30 7 149 300 0.4967
31 5 154 310 0.4968
32 3 157 320 0.4906
33 4 161 330 0.4879
34 6 167 340 0.4912
35 9 176 350 0.5029
36 5 181 360 0.5028
37 7 188 370 0.5081
38 6 194 380 0.5105
39 4 198 390 0.5077
40 2 200 400 0.5000
41 5 205 410 0.5000
42 8 213 420 0.5071
43 4 217 430 0.5047
44 5 222 440 0.5045
45 5 227 450 0.5044
46 6 233 460 0.5065
47 8 241 470 0.5128
48 7 248 480 0.5167
49 4 252 490 0.5143
50 4 256 500 0.5120
51 0 256 510 0.5020
52 4 260 520 0.5000
53 7 267 530 0.5038
54 6 273 540 0.5056
55 5 278 550 0.5055
56 2 280 560 0.5000
57 3 283 570 0.4965
58 5 288 580 0.4966
59 4 292 590 0.4949
60 7 299 600 0.4983

Table 1. Simulation number, number of heads in each simulation, cumulative number of heads, total number of coins tossed, and cumulative percentage of heads.

Number      Subjective    Actual      Empirical     Computational
Successes   Probability   Successes   Probability   Probability
0           0.010         1           0.017         0.001
1           0.020         0           0.000         0.010
2           0.050         5           0.083         0.044
3           0.120         7           0.117         0.117
4           0.190         10          0.167         0.205
5           0.220         13          0.217         0.246
6           0.190         11          0.183         0.205
7           0.120         8           0.133         0.117
8           0.050         4           0.067         0.044
9           0.020         1           0.017         0.010
10          0.010         0           0.000         0.001
Sums        1.000         60          1.000         1.000

Table 2. Number of successes, subjective probability, actual successes, empirical probability, and computational probability.