7.2 The Central Limit Theorem for Sums

Tim Contreras, Odessa College

Suppose X is a random variable with a distribution that may be known or unknown (it can be any distribution) and suppose:

μ = the mean of X
σ = the standard deviation of X

If you draw random samples of size n, then as n increases, the random variable Σx consisting of sums tends to be normally distributed such that [latex]\displaystyle\sum{x}{\sim}{N}[{{({n})}{({\mu})},{(\sqrt{{n}})}{({\sigma})}}][/latex].

The central limit theorem for sums says that if you keep drawing larger and larger samples and taking their sums, the sums form their own distribution (the sampling distribution of the sums), which approaches a normal distribution as the sample size increases. That normal distribution has a mean equal to the original mean multiplied by the sample size and a standard deviation equal to the original standard deviation multiplied by the square root of the sample size.
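A small simulation can make this concrete. The Python sketch below is illustrative and not part of the original text: it assumes an exponential population with mean 1 and standard deviation 1 (chosen because it is clearly not normal), and the function name `simulate_sums`, the sample size of 50, and the 5,000 repetitions are arbitrary choices.

```python
import random
import statistics

def simulate_sums(n, repetitions, seed=0):
    """Draw `repetitions` samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the list of sample sums."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n))
            for _ in range(repetitions)]

n = 50
sums = simulate_sums(n, repetitions=5000)

# Theory predicts a mean near (n)(mu) = 50 and a standard deviation
# near (sqrt(n))(sigma), which is about 7.07.
print("mean of sums:", round(statistics.mean(sums), 2))
print("sd of sums:  ", round(statistics.stdev(sums), 2))
```

If the theorem applies, the two printed values should land close to 50 and 7.07, and a histogram of `sums` should look roughly bell-shaped even though the population is skewed.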

The random variable Σx has the following z-score associated with it:

Σx is one sum.
[latex]{z}=\frac{{\sum{x} - {({n})}{({\mu})}}}{{{(\sqrt{{n}})}{({\sigma})}}}[/latex] or [latex]\sum{x} = (n)({{\mu}})+({z})({{\sigma}})(\sqrt{n})[/latex]
(Do not memorize both formulas. They are the same equation, rearranged.)

[latex]{({n})}{({\mu})}[/latex] = [latex]\displaystyle{\mu}_{\sum{x}}[/latex] , the mean of [latex]\sum{x}[/latex]
([latex]\sqrt{n}[/latex])(σ) = [latex]\displaystyle{\sigma}_{\sum{x}}[/latex] , the standard deviation of [latex]\sum{x}[/latex]
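The two forms of the z-score relation can be sketched as a pair of Python functions. The names `z_for_sum` and `sum_for_z` and the numbers n = 25, μ = 10, σ = 2 are made up for illustration; they do not come from the text.

```python
import math

def z_for_sum(total, n, mu, sigma):
    """z-score of one sum: (sum - n*mu) / (sqrt(n)*sigma)."""
    return (total - n * mu) / (math.sqrt(n) * sigma)

def sum_for_z(z, n, mu, sigma):
    """Inverse relation: sum = n*mu + z*sqrt(n)*sigma."""
    return n * mu + z * math.sqrt(n) * sigma

# Illustrative values only: n = 25, mu = 10, sigma = 2,
# so the sums have mean 250 and standard deviation 10.
z = z_for_sum(265, n=25, mu=10, sigma=2)
total = sum_for_z(z, n=25, mu=10, sigma=2)
print(z)      # 1.5
print(total)  # 265.0
```

Feeding the result of one function into the other returns the starting value, which is why only one of the two formulas needs to be remembered.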

Guide for TI-Calculator

To find probabilities for sums on the calculator, follow these steps.

2nd
DISTR
2: normalcdf
normalcdf ( lower value of the area, upper value of the area, n * mean, [latex]\displaystyle\sqrt{{n}}[/latex] * standard deviation )
where:
mean is the mean of the original distribution,
standard deviation is the standard deviation of the original distribution,
sample size = n
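For readers without a TI calculator, roughly the same calculation can be sketched in Python using the standard library's `statistics.NormalDist` (Python 3.8 or later). The helper name `sum_probability` and the numbers n = 25, μ = 10, σ = 2 are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def sum_probability(lower, upper, n, mu, sigma):
    """Approximate P(lower < sum < upper) using N(n*mu, sqrt(n)*sigma)."""
    sums = NormalDist(n * mu, math.sqrt(n) * sigma)
    return sums.cdf(upper) - sums.cdf(lower)

# Illustrative values only: n = 25, mu = 10, sigma = 2.
# 1e99 stands in for "no upper bound", as 1E99 does on the calculator.
p = sum_probability(265, 1e99, n=25, mu=10, sigma=2)
print(round(p, 4))  # 0.0668
```

The argument order mirrors the calculator call: lower bound, upper bound, then the information needed to build the mean and standard deviation of the sums.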

Example 1

An unknown distribution has a mean of 90 and a standard deviation of 15. A sample of size 80 is drawn randomly from the population.

Find the probability that the sum of the 80 values (or the total of the 80 values) is more than 7,500.
Find the sum that is 1.5 standard deviations above the mean of the sums.

Solution

Let X = one value from the original unknown population.
The probability question asks you to find a probability for the sum (or total of) 80 values.

[latex]\displaystyle\sum{x}[/latex] = the sum or total of 80 values, [latex]\displaystyle{\mu}=90,{\sigma}=15[/latex], and n = 80,

mean of the sums, [latex]\displaystyle{\mu}_{\sum{x}}[/latex] = (n)(μ) = (80)(90) = 7,200
standard deviation of the sums, [latex]\displaystyle{\sigma}_{\sum{x}}[/latex] = [latex]\displaystyle{(\sqrt{{n}})}{({\sigma})}={(\sqrt{{80}})}{({15})}[/latex]

Therefore, [latex]\sum{x}[/latex]~ N((80)(90), ([latex]\displaystyle\sqrt{{80}}[/latex])(15)).

Probability that the sum of the 80 values (or the total of the 80 values) is more than 7,500
= P(Σx > 7,500)
= the area to the right of 7,500 under the normal curve for the sums

TI-Calculator: normalcdf (7500, 1E99, (80)(90), [latex]\displaystyle{(\sqrt{{80}})}[/latex](15)) = 0.0127
Therefore, P(Σx > 7,500) = 0.0127
Find Σx where z = 1.5.
[latex]\displaystyle{\sum{x}}={(n)}{({\mu})}+{(z)}{(\sqrt{n})}{({\sigma})}=(80)(90)+(1.5)(\sqrt{80})(15)=7401.2[/latex]
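Both parts of Example 1 can be checked with a short Python sketch. It assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, mu, sigma = 80, 90, 15

# Normal model for the sums: mean (n)(mu), standard deviation (sqrt(n))(sigma).
sums = NormalDist(n * mu, math.sqrt(n) * sigma)

# Probability that the sum is more than 7,500.
p_more_than_7500 = 1 - sums.cdf(7500)

# Sum that lies 1.5 standard deviations above the mean of the sums.
sum_at_z = sums.mean + 1.5 * sums.stdev

print(round(p_more_than_7500, 4))  # the text reports 0.0127
print(round(sum_at_z, 1))          # the text reports 7401.2
```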

Try It

An unknown distribution has a mean of 45 and a standard deviation of 8. A sample of size 50 is drawn randomly from the population. Find the probability that the sum of the 50 values is more than 2,400.

Show Answer

[latex]\displaystyle\sum{x}[/latex] = the sum / total of 50 values, [latex]\displaystyle{\mu}=45,{\sigma}=8[/latex], and n = 50,

mean of the sums = (n)(μ) = (50)(45) = 2,250
standard deviation of the sums =[latex]\displaystyle{(\sqrt{{n}})}{({\sigma})}={(\sqrt{{50}})}{({8})}[/latex]

TI-Calculator: normalcdf (2400, 1E99, (50)(45), ([latex]\sqrt{50}[/latex])(8) )
Probability (the sum of the 50 values is more than 2,400) = 0.0040
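As a cross-check, the same probability can be reached by standardizing first and then using the standard normal distribution. This Python sketch uses the population mean of 45 given in the problem, so the mean of the sums is (50)(45) = 2,250; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, mu, sigma = 50, 45, 8

mean_of_sums = n * mu               # (50)(45) = 2250
sd_of_sums = math.sqrt(n) * sigma   # (sqrt(50))(8), about 56.57

# z-score of the sum 2,400, then the upper-tail area of the standard normal.
z = (2400 - mean_of_sums) / sd_of_sums
p = 1 - NormalDist().cdf(z)

print(round(z, 2))
print(round(p, 4))
```

The z-score comes out near 2.65, and the tail area agrees with the 0.0040 reported above to four decimal places.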

Guide for TI-Calculator

To find percentiles for sums on the calculator, follow these steps.

2nd
DISTR
3: invNorm
k = invNorm (area to the left of k, (n)([latex]\displaystyle{\mu}[/latex]), ([latex]\sqrt{n}[/latex])([latex]\displaystyle{\sigma}[/latex]))
where:
k is the kth percentile,
[latex]\displaystyle{\mu}[/latex] is the mean of the original distribution,
[latex]\displaystyle{\sigma}[/latex] is the standard deviation of the original distribution,
sample size = n
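A rough Python counterpart of the invNorm step, again using `statistics.NormalDist` from the standard library (Python 3.8 or later). The helper name `sum_percentile` and the values n = 25, μ = 10, σ = 2 are assumptions for illustration only.

```python
import math
from statistics import NormalDist

def sum_percentile(area_left, n, mu, sigma):
    """Return k such that P(sum < k) = area_left under N(n*mu, sqrt(n)*sigma)."""
    sums = NormalDist(n * mu, math.sqrt(n) * sigma)
    return sums.inv_cdf(area_left)

# Illustrative values only: n = 25, mu = 10, sigma = 2,
# so the sums have mean 250 and standard deviation 10.
median_sum = sum_percentile(0.50, n=25, mu=10, sigma=2)
k90 = sum_percentile(0.90, n=25, mu=10, sigma=2)

print(round(median_sum, 2))  # the 50th percentile equals the mean of the sums
print(round(k90, 2))
```

As with invNorm, the first argument is the area to the left of k, written as a decimal between 0 and 1.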

Example 2

In a recent study reported Oct. 29, 2012 on the Flurry Blog, the mean age of tablet users is 34 years. Suppose the standard deviation is 15 years. The sample size is 50.

What are the mean and standard deviation for the sum of the ages of tablet users?
What is the distribution?
Find the probability that the sum of the ages is between 1,500 and 1,800 years.
Find the 80th percentile for the sum of the 50 ages.

Solution

In this example, [latex]\displaystyle{\mu}[/latex] = 34, [latex]{\sigma}=15[/latex], and n = 50,

mean of the sums, [latex]\displaystyle{\mu}_{\sum{x}}[/latex] = (n)(μ) = (50)(34)
standard deviation of the sums, [latex]\displaystyle{\sigma}_{\sum{x}}[/latex] =[latex]\displaystyle{(\sqrt{{n}})}{({\sigma})}={(\sqrt{{50}})}{({15})}[/latex]

[latex]\displaystyle{\mu}_{\sum{x}}={n}{\mu}={50(34)}={1,700}[/latex] and [latex]\displaystyle{\sigma}_{\sum{x}}=(\sqrt{n})({\sigma})=({\sqrt{50}})({15})={106.07}[/latex]
The distribution for the sum of ages of tablet users is normal by the central limit theorem.
TI-Calculator: normalcdf(1500, 1800, (50)(34), ([latex]\displaystyle{\sqrt{50}}[/latex])(15))
P(1500 < [latex]\displaystyle{\sum{x}}[/latex] < 1800) = 0.7974
Let k = the 80th percentile.
TI-Calculator: invNorm(0.80, (50)(34), ([latex]\displaystyle{\sqrt{50}}[/latex]) (15))
k = 1789.3
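The quantities in Example 2 can be reproduced with a short Python sketch, assuming Python 3.8 or later for `statistics.NormalDist`. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n, mu, sigma = 50, 34, 15

# Normal model for the sum of 50 ages.
sums = NormalDist(n * mu, math.sqrt(n) * sigma)

print(sums.mean)             # (50)(34) = 1700
print(round(sums.stdev, 2))  # (sqrt(50))(15), about 106.07

# Probability that the sum is between 1,500 and 1,800 years.
p_between = sums.cdf(1800) - sums.cdf(1500)
print(round(p_between, 4))   # the text reports 0.7974

# 80th percentile of the sums.
k80 = sums.inv_cdf(0.80)
print(round(k80, 1))         # the text reports 1789.3
```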

Try It

In a recent study reported Oct. 29, 2012 on the Flurry Blog, the mean age of tablet users is 35 years. Suppose the standard deviation is 10 years. The sample size is 39.

What are the mean and standard deviation for the sum of the ages of tablet users?

Show Answer

[latex]\displaystyle{\mu}_{\sum{x}} = ({n})({\mu}) = (39)(35) = {1365}[/latex] and [latex]\displaystyle{\sigma}_{\sum{x}} = ({\sqrt{n}})({\sigma}) = ({\sqrt{39}})({10}) = {62.4}[/latex].
What is the distribution?

Show Answer

The distribution is normal for sums by the central limit theorem.
Find the probability that the sum of the ages is between 1,400 and 1,500 years.

Show Answer

TI-Calculator: normalcdf(1400, 1500, (39)(35), ([latex]\sqrt{39}[/latex])(10))
P(1400 < [latex]\displaystyle\sum{x}[/latex] < 1500) = 0.2723
Find the 90th percentile for the sum of the 39 ages.

Show Answer

Let k = the 90th percentile.
TI-Calculator: invNorm(0.90,(39)(35), ([latex]\sqrt{39}[/latex])(10))
k = 1445.0
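A compact Python check of this Try It, under the same assumptions as before (Python 3.8 or later, illustrative variable names). The standard deviation passed to the model is (√39)(10), the standard deviation of the sums, not the population value of 10.

```python
import math
from statistics import NormalDist

n, mu, sigma = 39, 35, 10

sums = NormalDist(n * mu, math.sqrt(n) * sigma)

results = {
    "mean of sums": sums.mean,                          # (39)(35) = 1365
    "sd of sums": round(sums.stdev, 1),                 # about 62.4
    "P(1400 < sum < 1500)": round(sums.cdf(1500) - sums.cdf(1400), 4),
    "90th percentile": round(sums.inv_cdf(0.90), 1),
}

for label, value in results.items():
    print(f"{label}: {value}")
```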

Example 3

The mean number of minutes for app engagement by a tablet user is 8.2 minutes. Suppose the standard deviation is one minute. Take a sample of size 70.

What are the mean and standard deviation for the sums?
Find the 95th percentile for the sum of the sample. Interpret this value in a complete sentence.
Find the probability that the sum of the sample is at least ten hours.

Solution

In this example, [latex]\displaystyle{\mu}[/latex] = 8.2 minutes, [latex]{\sigma}[/latex] = 1 minute, and n = 70,

mean of the sums, [latex]\displaystyle{\mu}_{\sum{x}}[/latex] = (n)(μ) = (70)(8.2)
standard deviation of the sums, [latex]\displaystyle{\sigma}_{\sum{x}}[/latex] =[latex]\displaystyle{(\sqrt{{n}})}{({\sigma})}={(\sqrt{{70}})}{({1})}[/latex]

[latex]\displaystyle{\mu}_{\sum{x}}= ({n})({\mu}) = {70(8.2)} = 574\text{ minutes }[/latex] and [latex]\displaystyle{\sigma}_{\sum{X}}[/latex] = [latex]({\sqrt{n}})({\sigma})[/latex] = [latex]({\sqrt{70}}){(1)}[/latex] = 8.37 minutes.
Let k = the 95th percentile.
TI-Calculator: invNorm (0.95,(70)(8.2),([latex]\displaystyle\sqrt{70}[/latex])(1))
k = 587.76 minutes.
95% of the sums of app engagement times (for samples of 70 users) are at most 587.76 minutes.
Convert 10 hours to 600 minutes.
TI-Calculator: normalcdf(600, 1E99,(70)(8.2), ( [latex]\displaystyle\sqrt{70}[/latex] )(1))
P (the sum of the sample is at least ten hours) = P( [latex]\displaystyle\sum{x}[/latex] ≥ 600 minutes) = 0.0009
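Example 3 mixes units, so a sketch that converts hours to minutes explicitly may help. This Python snippet assumes Python 3.8 or later; the constant `MINUTES_PER_HOUR` and the variable names are illustrative.

```python
import math
from statistics import NormalDist

MINUTES_PER_HOUR = 60
n, mu, sigma = 70, 8.2, 1   # mu and sigma are in minutes

sums = NormalDist(n * mu, math.sqrt(n) * sigma)

# 95th percentile of the sum of 70 engagement times, in minutes.
k95 = sums.inv_cdf(0.95)
print(round(k95, 2))            # the text reports 587.76

# Probability that the sum is at least ten hours (600 minutes).
threshold = 10 * MINUTES_PER_HOUR
p_at_least_10h = 1 - sums.cdf(threshold)
print(round(p_at_least_10h, 4)) # the text reports 0.0009
```

Because the normal distribution is continuous, "at least 600" and "more than 600" give the same probability.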

Example 4

The mean number of minutes for app engagement by a tablet user is 8.2 minutes. Suppose the standard deviation is one minute. Take a sample of size 70.

What is the probability that the sum of the sample is between seven hours and ten hours? What does this mean in context of the problem?
Find the 16th and 84th percentiles for the sum of the sample. Interpret these values in context.

Solution

In this example, [latex]\displaystyle{\mu}[/latex] = 8.2 minutes, [latex]{\sigma}[/latex] = 1 minute, and n = 70,

mean of the sums, [latex]\displaystyle{\mu}_{\sum{x}}[/latex] = (n)(μ) = (70)(8.2)
standard deviation of the sums, [latex]\displaystyle{\sigma}_{\sum{x}}[/latex] =[latex]\displaystyle{(\sqrt{{n}})}{({\sigma})}={(\sqrt{{70}})}{({1})}[/latex]

7 hours = 420 minutes and 10 hours = 600 minutes
TI-Calculator: normalcdf(420, 600, (70)(8.2), ([latex]\displaystyle\sqrt{70}[/latex])(1))
P (the sum of the sample is between seven hours and ten hours)
= P(420[latex]\displaystyle\leq{\sum{x}}\leq{600}[/latex])
= 0.9991.
This means that there is a 99.91% chance that the sum of the usage minutes for a sample of 70 users will be between 420 minutes and 600 minutes.
To find the 16th percentile, TI-Calculator: invNorm (0.16,(70)(8.2),([latex]\displaystyle\sqrt{70}[/latex])(1))=565.68 minutes.
To find the 84th percentile, TI-Calculator: invNorm (0.84,(70)(8.2),([latex]\displaystyle\sqrt{70}[/latex])(1))=582.32 minutes.
Since 84% of the sums of app engagement times (for samples of 70 users) are at most 582.32 minutes and 16% of the sums are at most 565.68 minutes, we may state that 68% of the sums are between 565.68 minutes and 582.32 minutes.
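The link between the 16th and 84th percentiles and the "68% within one standard deviation" rule can be seen numerically. The Python sketch below assumes Python 3.8 or later; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, mu, sigma = 70, 8.2, 1   # minutes
sums = NormalDist(n * mu, math.sqrt(n) * sigma)

# Probability that the sum is between 7 hours (420 min) and 10 hours (600 min).
p_between = sums.cdf(600) - sums.cdf(420)
print(round(p_between, 4))          # the text reports 0.9991

# 16th and 84th percentiles of the sums.
k16 = sums.inv_cdf(0.16)
k84 = sums.inv_cdf(0.84)
print(round(k16, 2), round(k84, 2)) # the text reports 565.68 and 582.32

# Share of sums between the two percentiles: 0.84 - 0.16 = 0.68.
share = sums.cdf(k84) - sums.cdf(k16)
print(round(share, 2))

# For comparison: one standard deviation either side of the mean of the sums.
print(round(sums.mean - sums.stdev, 2), round(sums.mean + sums.stdev, 2))
```

The last line prints values close to, but not exactly equal to, the two percentiles, because one standard deviation either side of the mean covers slightly more than 68% of a normal distribution.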

References

Farago, Peter. “The Truth About Cats and Dogs: Smartphone vs Tablet Usage Differences.” The Flurry Blog, 2013. Posted October 29, 2012. Available online at http://blog.flurry.com (accessed May 17, 2013).

Concept Review

The central limit theorem tells us that for a population with any distribution, the distribution of the sample sums approaches a normal distribution as the sample size increases. In other words, if the sample size is large enough, the distribution of the sums can be approximated by a normal distribution even if the original population is not normally distributed. Additionally, if the original population has a mean of μ_x and a standard deviation of σ_x, the mean of the sums is nμ_x and the standard deviation of the sums is [latex]\displaystyle(\sqrt{n})({\sigma}_{x})[/latex], where n is the sample size.
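
The mean and standard deviation of the sums can be justified in a few lines. This sketch assumes the n sampled values are independent draws from the same population; the symbols [latex]X_1, \ldots, X_n[/latex] for the individual draws and E and Var for expected value and variance are notation introduced here, not in the section above.

[latex]\displaystyle{\mu}_{\sum{x}} = E(X_1 + \cdots + X_n) = E(X_1) + \cdots + E(X_n) = ({n})({\mu}_{x})[/latex]

[latex]\displaystyle{\sigma}^{2}_{\sum{x}} = Var(X_1 + \cdots + X_n) = Var(X_1) + \cdots + Var(X_n) = ({n})({\sigma}_{x}^{2})[/latex]

[latex]\displaystyle{\sigma}_{\sum{x}} = \sqrt{({n})({\sigma}_{x}^{2})} = (\sqrt{n})({\sigma}_{x})[/latex]

The second line is where independence matters: variances add for independent draws, which is why the standard deviation grows with the square root of n rather than with n itself.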

Formula Review

The Central Limit Theorem for Sums: [latex]\displaystyle\sum{x}{\sim}{N}[{{({n})}{({\mu})},{(\sqrt{{n}})}{({\sigma})}}][/latex]

The Central Limit Theorem for Sums z-score and standard deviation for sums:

z-score for a sum: z = [latex]\frac{{\sum{x}}-({n})({\mu})}{({\sqrt{n}})({\sigma})}[/latex]
Mean for Sums, [latex]{\mu}_{\sum{x}}[/latex] = [latex]({n})({\mu}_{x})[/latex]
Standard deviation for Sums, [latex]{\sigma}_{\sum{x}}[/latex] = [latex]({\sqrt{n}})({\sigma}_{x})[/latex]

License

Creative Commons Attribution 4.0 International License
