Analyzing Gaming Results using the Binomial Mean and Standard Deviation
In the previous post, we talked about the mean and standard deviation when analyzing the results of games. We discussed the Central Limit Theorem and how the majority of game results fall within the -1σ to 1σ region of a bell curve.
Here is the bell curve again, to jog your memory.
[Figure: bell curve with standard deviation bands] (Source: http://schools-wikipedia.org/)
Getting the mean and standard deviation is simple enough if you have a set of data, such as the historical records of a current player.
What if you only have probabilities, such as when you are analyzing casino games and assessing a completely new player?
Mean
Let’s use a coin-toss example. Each side of the coin has a 50% chance of appearing, or a probability of 0.5. If we tossed that coin 50 times, how many heads and how many tails should we expect? The calculations are as follows:
Mean = number of trials x probability of winning
Mean (Heads) = 50 x 0.5 = 25
Mean (Tails) = 50 x 0.5 = 25
So, we expect 25 heads and 25 tails from 50 tosses. This calculation applies only to binomial distributions, meaning that each experiment we perform (each coin toss) has only 2 possible outcomes. Most games of chance are similar: on each bet, you either win or lose.
Let’s try a roulette example. The probability of winning a straight-up bet on 10 is 1/38 (for a double-zero game). How many times should we expect 10 to come up in 100 spins?
Mean (Number 10) = 100 x 1/38 = 2.631579 occurrences
Thus, over 100 spins, we expect the number 10 to occur 2.631579 times on average.
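The mean calculation above can be sketched in a few lines of Python. This is only an illustration; the function name `binomial_mean` is made up for this example and does not come from any library.

```python
def binomial_mean(trials: int, p_win: float) -> float:
    """Expected number of wins: number of trials x probability of winning."""
    return trials * p_win


# Coin toss: 50 tosses, 0.5 probability of heads
coin_heads = binomial_mean(50, 0.5)

# Roulette: 100 spins, 1/38 probability of a straight-up 10 (double-zero game)
roulette_ten = binomial_mean(100, 1 / 38)

print(coin_heads)              # 25.0
print(round(roulette_ten, 6))  # 2.631579
```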
Standard Deviation
Again, note that the calculations we are using are for binomial distributions.
The standard deviation for 50 coin tosses is the square root of (the total number of trials x the probability of winning x the probability of losing) = √(50 x 0.5 x 0.5) = √12.5 = 3.535534
Thus, we expect 50 tosses of the coin to produce 25 heads and 25 tails on average, with a standard deviation of 3.535534 around that figure.
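As a rough sketch, the same standard deviation formula looks like this in Python. The helper name `binomial_sd` is hypothetical, and applying the formula to the roulette example is an extension for illustration, not a figure from this post.

```python
import math


def binomial_sd(trials: int, p_win: float) -> float:
    """Square root of (trials x probability of winning x probability of losing)."""
    p_lose = 1 - p_win
    return math.sqrt(trials * p_win * p_lose)


# Coin toss: 50 tosses, 0.5 probability of heads
coin_sd = binomial_sd(50, 0.5)
print(round(coin_sd, 6))  # 3.535534

# The same formula applied to the straight-up 10 over 100 spins
roulette_sd = binomial_sd(100, 1 / 38)
print(roulette_sd)
```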
If we wanted to find out how likely a given number of heads or tails is, we could lay the results out like this:
Standard Deviations | Probability | Result |
-3σ | 2.3% | (-3 x 3.535534) + 25 = 14.3934 |
-2σ | 13.6% | (-2 x 3.535534) + 25 = 17.92893 |
-1σ | 34.1% | (-1 x 3.535534) + 25 = 21.46447 |
Mean | 50% | 25 |
1σ | 34.1% | (1 x 3.535534) + 25 = 28.53553 |
2σ | 13.6% | (2 x 3.535534) + 25 = 32.07107 |
3σ | 2.3% | (3 x 3.535534) + 25 = 35.6066 |
This means that if we conducted multiple experiments where we tossed a coin 50 times, a majority of experiments (about 68%, or 34.1% + 34.1%) would give us between 21.46447 and 28.53553 heads (or tails).
However, an experiment that gave us fewer than 17.92893 or more than 32.07107 heads or tails would be unlikely. Unlikely, but not impossible.
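To check these ranges, here is a minimal sketch that rebuilds the σ cut-offs and then sums the exact binomial probabilities inside them. The function names are made up for this example, and `math.comb` assumes Python 3.8 or later. Because a toss count is always a whole number, the exact figures will differ slightly from the bell-curve percentages.

```python
import math


def sigma_bands(mean: float, sd: float, max_sigma: int = 3) -> dict:
    """Map each k from -max_sigma to +max_sigma to the value mean + k x sd."""
    return {k: mean + k * sd for k in range(-max_sigma, max_sigma + 1)}


def binomial_prob_between(trials: int, p_win: float, low: float, high: float) -> float:
    """Exact probability that the number of wins lies between low and high."""
    first = max(0, math.ceil(low))
    last = min(trials, math.floor(high))
    return sum(
        math.comb(trials, k) * p_win**k * (1 - p_win) ** (trials - k)
        for k in range(first, last + 1)
    )


trials, p_win = 50, 0.5
mean = trials * p_win
sd = math.sqrt(trials * p_win * (1 - p_win))

bands = sigma_bands(mean, sd)
for k, value in bands.items():
    print(f"{k:+d} sigma: {value:.5f}")

within_1_sigma = binomial_prob_between(trials, p_win, bands[-1], bands[1])
within_2_sigma = binomial_prob_between(trials, p_win, bands[-2], bands[2])
print(f"Within 1 sigma: {within_1_sigma:.1%}")
print(f"Within 2 sigma: {within_2_sigma:.1%}")
```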
This brings us to…
Outliers and Inter-quartile Ranges
An outlier is a result that lies well outside of the expected. In some studies, outliers are removed from the data set entirely. For us in the business, however, an outlier is a cause for concern. We have already talked about how the majority of game results fall within the -1σ to 1σ region, so an outlier is worth investigating.
Let’s examine this sample of player results:
Player | Win/Loss |
Player 1 | -84 |
Player 2 | -435 |
Player 3 | 289 |
Player 4 | -373 |
Player 5 | 218 |
Player 6 | -315 |
Player 7 | 500 |
Player 8 | 299 |
If we arrange all the results in ascending order, we get this list:
Player | Win/Loss |
Player 2 | -435 |
Player 4 | -373 |
Player 6 | -315 |
Player 1 | -84 |
Player 5 | 218 |
Player 3 | 289 |
Player 8 | 299 |
Player 7 | 500 |
We will divide this list into 4 equal segments, naming each segment a quartile from Q1 to Q4:
Quartile | Player | Win/Loss |
Q1 | Player 2 | -435 |
Q1 | Player 4 | -373 |
Q2 | Player 6 | -315 |
Q2 | Player 1 | -84 |
Q3 | Player 5 | 218 |
Q3 | Player 3 | 289 |
Q4 | Player 8 | 299 |
Q4 | Player 7 | 500 |
The distance between the end of Q1 (-373) and the end of Q3 (289) is known as the inter-quartile range.
This works out to be 289 - (-373) = 662
Anything less than -373 - (1.5 x 662) = -1366 is an outlier
Anything more than 289 + (1.5 x 662) = 1282 is an outlier
Do these values then fall outside of the -1σ to 1σ region?
Let’s see:
Mean = (-435 + -373 + -315 + -84 + 218 + 289 + 299 + 500) / 8 = 99 / 8 = 12.375
Standard Deviation:
Player | Win/Loss | Win/Loss – Mean | (Win/Loss – Mean)² |
Player 2 | -435 | -447.375 | 200144.4 |
Player 4 | -373 | -385.375 | 148513.9 |
Player 6 | -315 | -327.375 | 107174.4 |
Player 1 | -84 | -96.375 | 9288.141 |
Player 5 | 218 | 205.625 | 42281.64 |
Player 3 | 289 | 276.625 | 76521.39 |
Player 8 | 299 | 286.625 | 82153.89 |
Player 7 | 500 | 487.625 | 237778.1 |
Sum of squared differences = 903855.9
Variance = 903855.9 / (8 - 1) = 129122.3
Standard Deviation = √129122.3 = 359.3359
Standard Deviations | Probability | Result |
-3σ | 2.3% | (-3 x 359.3359) + 12.375 = -1065.633 |
-2σ | 13.6% | (-2 x 359.3359) + 12.375 = -706.2967 |
-1σ | 34.1% | (-1 x 359.3359) + 12.375 = -346.9609 |
Mean | 50% | 12.375 |
1σ | 34.1% | (1 x 359.3359) + 12.375 = 371.7109 |
2σ | 13.6% | (2 x 359.3359) + 12.375 = 731.0467 |
3σ | 2.3% | (3 x 359.3359) + 12.375 = 1090.383 |
As it turns out, -1366 and 1282 fall beyond the -3σ and 3σ marks, respectively, well outside the -1σ to 1σ region.
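The outlier check can be sketched in Python as follows. This is a minimal illustration that assumes the simplified quartile method used here (the last value in each quarter of the sorted list); statistical libraries usually interpolate between values and would give slightly different cut-offs. The variable names are made up for this example. `statistics.stdev` divides by n - 1, which matches the variance calculation in the table.

```python
import statistics

results = {
    "Player 1": -84, "Player 2": -435, "Player 3": 289, "Player 4": -373,
    "Player 5": 218, "Player 6": -315, "Player 7": 500, "Player 8": 299,
}
values = sorted(results.values())

# Simplified quartiles: split the sorted list into 4 equal segments and
# take the last value of the first segment and of the third segment.
segment = len(values) // 4
q1_end = values[segment - 1]
q3_end = values[3 * segment - 1]
iqr = q3_end - q1_end

# Outlier cut-offs: 1.5 x the inter-quartile range beyond Q1 and Q3
lower_fence = q1_end - 1.5 * iqr
upper_fence = q3_end + 1.5 * iqr

mean = statistics.mean(values)
sd = statistics.stdev(values)  # sample standard deviation (n - 1)

# How many standard deviations from the mean each cut-off sits
lower_in_sigmas = (lower_fence - mean) / sd
upper_in_sigmas = (upper_fence - mean) / sd

# Players whose result lies beyond either cut-off
outliers = {
    player: value
    for player, value in results.items()
    if value < lower_fence or value > upper_fence
}

print(f"IQR: {iqr}, cut-offs: {lower_fence} to {upper_fence}")
print(f"Mean: {mean}, standard deviation: {sd:.4f}")
print(f"Cut-offs in sigmas: {lower_in_sigmas:.2f} and {upper_in_sigmas:.2f}")
print(f"Outliers: {outliers}")
```

In this sample no player falls beyond either cut-off, so the `outliers` dictionary comes back empty.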
Now, do note that this may not ALWAYS be the case, but it is a good gauge of what we want to consider ‘suspicious’.
Call surveillance!