Is Your New Player Suspicious? Scope him out using the Binomial Mean and Standard Deviation

Analyzing Gaming Results using the Binomial Mean and Standard Deviation

In the previous post, we talked about the mean and standard deviation when analyzing the results of games. We discussed the Central Limit Theorem and how the majority of results of games would fall within the -1σ to 1σ region of a bell-curve.

Here it is again, to jog your memory.

[Image: bell curve]

(Source: http://schools-wikipedia.org/)

Getting the mean and standard deviation of data is simple enough if you have a set of data – like when you are analyzing historical records of a current player.

But what if you only have probabilities, as when you are analyzing casino games or assessing a completely new player?

Mean

Let’s use a coin-toss example. Each side of the coin has a 50% chance of appearing, or a probability of 0.5. If we tossed that coin 50 times, how many heads and how many tails should we expect? The calculations are as follows:

Mean = number of trials x probability of winning

Mean (Heads) = 50 x 0.5 = 25

Mean (Tails) = 50 x 0.5 = 25

So, we are expecting 25 heads and 25 tails from 50 tosses. This sort of calculation applies only to binomial distributions, meaning that each experiment we perform (each coin toss) has only 2 possible outcomes. Most games of chance are similar: the only 2 possible outcomes are that you either win or lose.

Let’s try a roulette example. The probability of winning a straight-up bet on 10 is 1/38 (for a double-zero game). How many times should we expect 10 to come up in 100 spins?

Mean (Number 10) = 100 x 1/38 = 2.631579 occurrences

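Thus, over 100 spins we expect the number 10 to come up about 2.63 times on average. No single session can produce a fractional count, of course; this is the long-run average across many sets of 100 spins.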
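As a quick check, the mean calculation above can be sketched in a few lines of Python. The function name `binomial_mean` is our own label for illustration, not something from a library.

```python
def binomial_mean(trials, p_win):
    """Expected number of wins: number of trials x probability of winning."""
    return trials * p_win


# Coin toss: 50 tosses, 0.5 probability of heads
coin_heads = binomial_mean(50, 0.5)
print(coin_heads)  # 25.0

# Double-zero roulette: 100 spins, straight-up 10 wins with probability 1/38
roulette_10 = binomial_mean(100, 1 / 38)
print(round(roulette_10, 6))  # 2.631579
```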

Standard Deviation

Again, note that the calculations we are using are for binomial distributions.

The standard deviation is the square root of (the total number of trials x the probability of winning x the probability of losing). For 50 tosses of a coin:

Standard Deviation = √(50 x 0.5 x 0.5) = √12.5 = 3.535534

Thus, we are expecting 50 tosses of the coin to produce 25 heads and 25 tails, give or take a standard deviation of 3.535534 tosses.
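The same calculation can be sketched in Python. The helper name `binomial_sd` is hypothetical, and the roulette figure is our own extension of the earlier example rather than a number from this post.

```python
from math import sqrt


def binomial_sd(trials, p_win):
    """Standard deviation of the number of wins: sqrt(trials x p_win x p_lose)."""
    p_lose = 1 - p_win
    return sqrt(trials * p_win * p_lose)


# Coin toss: 50 tosses at 0.5
coin_sd = binomial_sd(50, 0.5)
print(round(coin_sd, 6))  # 3.535534

# Roulette, straight-up 10 over 100 spins (extension of the earlier example)
roulette_sd = binomial_sd(100, 1 / 38)
print(round(roulette_sd, 4))
```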

If we wanted to find out the likelihood of the number of heads or tails occurring, we could do this:

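Standard Deviations | Result | Probability
-3σ | 25 - 3 x 3.535534 = 14.3934 | 2.3%
-2σ | 25 - 2 x 3.535534 = 17.92893 | 13.6%
-1σ | 25 - 1 x 3.535534 = 21.46447 | 34.1%
Mean | 25 | 50%
1σ | 25 + 1 x 3.535534 = 28.53553 | 34.1%
2σ | 25 + 2 x 3.535534 = 32.07107 | 13.6%
3σ | 25 + 3 x 3.535534 = 35.6066 | 2.3%

The percentages come from the bell curve: roughly 34.1% of results fall between the mean and 1σ on each side, 13.6% between 1σ and 2σ, and about 2.3% beyond 2σ, with 50% of results on each side of the mean.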

This means that if we conducted multiple experiments where we tossed a coin 50 times, a majority of them (about 68%, or 34.1% + 34.1%) would give us between 21.46447 and 28.53553 heads (and likewise for tails).

However, an experiment that gave us fewer than 17.92893 or more than 32.07107 heads or tails would be far less likely: only about 2.3% of results are expected beyond 2σ on each side. Unlikely, but not impossible.
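To see how well the bell-curve percentages hold for the coin example, here is a minimal Python sketch that builds the σ marks and then sums the exact binomial probabilities between them. The function names `sigma_marks` and `exact_prob_between` are our own, and the exact figures are computed from the binomial formula rather than taken from this post.

```python
from math import ceil, comb, floor, sqrt


def sigma_marks(trials, p_win, max_k=3):
    """Return {k: mean + k x standard deviation} for k = -max_k .. max_k."""
    mean = trials * p_win
    sd = sqrt(trials * p_win * (1 - p_win))
    return {k: mean + k * sd for k in range(-max_k, max_k + 1)}


def exact_prob_between(trials, p_win, low, high):
    """Exact binomial probability that the number of wins lies in [low, high]."""
    first = max(0, ceil(low))      # wins are whole numbers, so round inwards
    last = min(trials, floor(high))
    return sum(
        comb(trials, k) * p_win**k * (1 - p_win) ** (trials - k)
        for k in range(first, last + 1)
    )


marks = sigma_marks(50, 0.5)
for k, value in marks.items():
    print(f"{k:+d} sd: {value:.5f}")

within_1 = exact_prob_between(50, 0.5, marks[-1], marks[1])  # 22 to 28 heads
within_2 = exact_prob_between(50, 0.5, marks[-2], marks[2])  # 18 to 32 heads
print(f"within 1 sd: {within_1:.3f}")
print(f"within 2 sd: {within_2:.3f}")
```

For 50 tosses, the exact share within one standard deviation comes out close to the 68% the bell curve suggests, which is why the normal approximation is a reasonable gauge here. With very few trials, or a win probability far from 0.5, the match would be rougher.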

This brings us to…

Outliers and Inter-quartile Ranges

An outlier is a result that, well, lies outside of the expected. In some studies, outliers are removed from the data set altogether. For us in the business, however, they are a cause for concern. We have already talked about how the majority of results in games fall within the -1σ to 1σ region, so an outlier is worth investigating.

Let’s examine this sample of player results:

Player Win/Loss
Player 1 -84
Player 2 -435
Player 3 289
Player 4 -373
Player 5 218
Player 6 -315
Player 7 500
Player 8 299

If we arranged all the results in ascending order, we’d get this re-arranged list:

Player Win/Loss
Player 2 -435
Player 4 -373
Player 6 -315
Player 1 -84
Player 5 218
Player 3 289
Player 8 299
Player 7 500

We will divide this list into 4 like so, naming each segment as a quartile from Q1 to Q4:

Q1

Player Win/Loss
Player 2 -435
Player 4 -373

Q2

Player 6 -315
Player 1 -84

Q3

Player 5 218
Player 3 289

Q4

Player 8 299
Player 7 500

The distance between the end of Q1 (-373) and the end of Q3 (289) is known as the inter-quartile range.

This works out to be 289 - (-373) = 662

Anything less than -373 - (1.5 x 662) = -1366 is an outlier

Anything more than 289 + (1.5 x 662) = 1282 is an outlier

Do these values then fall outside of the -1σ to 1σ regions?

Let’s see:

Mean = (-435 + -373 + -315 + -84 + 218 + 289 + 299 + 500) / 8 = 12.375

Standard Deviation:

Player | Win/Loss | Win/Loss - Mean | (Win/Loss - Mean)²
Player 2 | -435 | -447.375 | 200144.4
Player 4 | -373 | -385.375 | 148513.9
Player 6 | -315 | -327.375 | 107174.4
Player 1 | -84 | -96.375 | 9288.141
Player 5 | 218 | 205.625 | 42281.64
Player 3 | 289 | 276.625 | 76521.39
Player 8 | 299 | 286.625 | 82153.89
Player 7 | 500 | 487.625 | 237778.1

Variance = sum of the squared differences / (8 - 1) = 903855.9 / 7 = 129122.3

Standard Deviation = √129122.3 = 359.3359

Standard Deviations | Result | Probability
-3σ | 12.375 - 3 x 359.3359 = -1065.633 | 2.3%
-2σ | 12.375 - 2 x 359.3359 = -706.2967 | 13.6%
-1σ | 12.375 - 1 x 359.3359 = -346.9609 | 34.1%
Mean | 12.375 | 50%
1σ | 12.375 + 1 x 359.3359 = 371.7109 | 34.1%
2σ | 12.375 + 2 x 359.3359 = 731.0467 | 13.6%
3σ | 12.375 + 3 x 359.3359 = 1090.383 | 2.3%

As it turns out, -1366 and 1282 both fall beyond the -3σ and 3σ marks (-1065.633 and 1090.383), so a result that this rule flags as an outlier is also far outside the -1σ to 1σ region.

Now, do note that this may not ALWAYS be the case, but it is a good gauge of what we want to consider ‘suspicious’.
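For anyone who wants to run the numbers on their own player list, here is a minimal Python sketch of the outlier check using the sample above. It takes the quartile ends the same way this post does (the last value in the first and third quarters of the sorted list) and applies the 1.5 x inter-quartile range rule. Other quartile conventions exist and give slightly different limits. The variable names are our own.

```python
from statistics import mean, stdev

results = {
    "Player 1": -84, "Player 2": -435, "Player 3": 289, "Player 4": -373,
    "Player 5": 218, "Player 6": -315, "Player 7": 500, "Player 8": 299,
}

ordered = sorted(results.values())

# Quartile ends as defined in this post (assumes the list splits evenly into 4)
quarter = len(ordered) // 4
q1_end = ordered[quarter - 1]       # -373
q3_end = ordered[3 * quarter - 1]   # 289
iqr = q3_end - q1_end               # 662

lower_limit = q1_end - 1.5 * iqr    # -1366.0
upper_limit = q3_end + 1.5 * iqr    # 1282.0

avg = mean(results.values())        # 12.375
sd = stdev(results.values())        # sample standard deviation (divides by n - 1)

# Flag any player whose result lies beyond either limit
outliers = {
    name: value
    for name, value in results.items()
    if value < lower_limit or value > upper_limit
}

print(f"limits: {lower_limit} to {upper_limit}")
print(f"mean: {avg}, standard deviation: {sd:.4f}")
print(f"outliers: {outliers}")
```

With this particular sample, none of the eight players crosses either limit, so the `outliers` dictionary comes back empty. A new result would need to be below -1366 or above 1282 to be flagged.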

Call surveillance!
