THE FOLLOWING NOTES USE EXTENSIVE MS Word Math Equations that will NOT transfer over to HTML.
Chapter 7 Normal Distribution
Continuous sample spaces and continuous random variables occur when dealing with quantities that are measured on a continuous scale.
Normal distribution ~ normal distribution curves → AKA bell-shaped distributions, BUT NOT all bell-shaped distributions are normal distributions (A → B, BUT NOT B → A).
7.1 Continuous Distributions
Previously, frequencies and percentages/probabilities were represented by the heights of rectangles and, IF the class intervals are all equal, also by the areas of the rectangles.
For continuous curves, probabilities are represented by areas under the curve.
Continuous curves = graphs of functions that are referred to as probability densities (or continuous distributions).
- The area under the curve between any two values a and b gives the probability that a random variable having this continuous distribution will take on a value on the interval from a to b.
- Area under the curve between a and b = $P(a \le X \le b) = \int_a^b f(x)\,dx$, where f is the probability density.
- Values of a probability density can NOT be negative.
- The total area under the curve is always equal to 1.
Based on the math of area under a curve (for the image above) →
The probability of a specific value is 0 b/c, for a specific x value with height y, the area under the curve (as calculated using rectangles) is length*width → length = y and width = (x - x) = 0 → (y*0) = 0 → This looks like a paradox, since specific values do occur. In practice a measurement is always rounded, so a "specific value" really stands for a small interval around it, and that interval has a nonzero probability.
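The properties above can be checked numerically. This is a minimal sketch in R that uses the standard normal density `dnorm` as a stand-in for "a continuous distribution"; the limits a = -1 and b = 2 and the point x = 1 are arbitrary illustrative choices, not values from the notes.

```r
# Standard normal density, used here only as an example of a probability density
f <- dnorm

# Total area under the curve should be (numerically) 1
total_area <- integrate(f, lower = -Inf, upper = Inf)$value

# Area between a and b equals P(a <= X <= b); compare with the pnorm difference
a <- -1
b <- 2
area_ab <- integrate(f, lower = a, upper = b)$value
prob_ab <- pnorm(b) - pnorm(a)

# Probability of an ever-narrower interval around x = 1 shrinks toward 0
h <- c(0.1, 0.01, 0.001)
shrinking <- pnorm(1 + h) - pnorm(1 - h)

print(total_area)
print(c(area_ab, prob_ab))
print(shrinking)
```

The last vector gets smaller as the interval narrows, which is the numerical version of "width = 0 gives area = 0".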
Continuous distributions can be approximated by histograms of probability distributions.
Histograms with narrower and narrower classes → the mean and standard deviation of the probability distribution will approach those of the continuous distribution.
Continuous distribution mean (µ) = measure of center or middle
Continuous distribution standard deviation (σ) = measure of dispersion or spread
7.2 Normal Distribution
Cornerstone of modern statistics – role in development of
statistical theory and its alignment with observed data
* Area under the curve becomes negligible more than 4 or 5 standard deviations away from the mean.
There is one and only one normal distribution with a given mean µ and a given standard deviation σ → increasing/decreasing the mean moves the curve right (+) or left (-); increasing/decreasing the standard deviation flattens (+) or sharpens (-) the curve.
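A quick way to see the effect of µ and σ is to look at the height of the curve at its peak. The sketch below is illustrative only: the helper `peak_height` and the parameter values are made up for this example.

```r
# Density at the peak (x = mean) for a given mean and standard deviation
peak_height <- function(mu, sigma) dnorm(mu, mean = mu, sd = sigma)

# Changing the mean moves the curve but leaves the peak height unchanged
same_sd <- c(peak_height(-2, 1), peak_height(0, 1), peak_height(3, 1))

# Changing the standard deviation changes the peak height, 1 / (sigma * sqrt(2 * pi))
sigmas <- c(0.5, 1, 2)
diff_sd <- sapply(sigmas, function(s) peak_height(0, s))

print(same_sd)  # three equal values
print(diff_sd)  # decreasing: a larger sd gives a flatter curve
```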
Standard normal distribution = the normal distribution with µ = 0 and σ = 1; normal-curve areas are tabulated for it.
Can obtain areas under any normal curve by changing scale and converting into standard units with the formula: z = (x - µ)/σ
It's the good ol' z score repurposed here → tells how many standard deviations a value lies from the mean.
Assuming a standard normal distribution, z scores can be used to find areas under the curve with ease using a table of values. The z scores of the two limits of the interval are each converted to a cumulative probability, and the smaller is subtracted from the larger to obtain the probability of the region.
The normal distribution also follows the rule 1σ = 68%, 2σ = 95%, 3σ = 99.7% (the area within that many standard deviations of the mean).
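The 68/95/99.7 rule, and the claim that the area beyond 4 or 5 standard deviations is negligible, can be reproduced with `pnorm`. A minimal sketch on the standard normal curve:

```r
# Area within k standard deviations of the mean, for k = 1, 2, 3
k <- 1:3
within_k <- pnorm(k) - pnorm(-k)
print(round(within_k, 4))  # 0.6827 0.9545 0.9973

# Area in both tails beyond 4 and 5 standard deviations
beyond <- 2 * pnorm(-c(4, 5))
print(beyond)
```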
Using R:
rnorm = generates random #s from a normal distribution → rnorm(n, mean, sd)
dnorm = probability density function → dnorm(x, mean, sd, log=F/T)
pnorm = cumulative distribution function → pnorm(q, mean, sd, lower.tail=T/F, log.p=F/T)
qnorm = quantile function, the inverse of pnorm → qnorm(p, mean, sd, lower.tail=T/F, log.p=F/T)
pnorm(x, mean, sd) gives the probability of -∞ < X ≤ x (everything under the curve to the left of the x value); with lower.tail=F it gives the area to the right instead.
ALT: pnorm(z score). Default: mean=0, sd=1
Example for the curve depicted in subsection 7.1: pnorm(x of red line, mean=0, sd=1) = total probability left of the red line (everything under the curve left of the red line)
Working backward from a probability to a z score? Use qnorm(probability, mean=0, sd=1) = z score, with the probability entered as a proportion (e.g. 0.02 for 2%).
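A short sketch of how these functions relate to each other. The z score 1.96 and the non-standard parameters (mean 100, sd 15, x = 120) are arbitrary illustrative values, not taken from the notes.

```r
z <- 1.96

# Area to the left and to the right of z on the standard normal curve
p_left  <- pnorm(z)
p_right <- pnorm(z, lower.tail = FALSE)

# qnorm undoes pnorm: probability back to z score
z_back <- qnorm(p_left)

# For a non-standard normal, passing mean/sd is the same as standardizing first
x <- 120
mu <- 100
sigma <- 15
p_direct <- pnorm(x, mean = mu, sd = sigma)
p_via_z  <- pnorm((x - mu) / sigma)

# dnorm is the height of the curve, not a probability
height_at_mean <- dnorm(0)

print(c(p_left, p_right, z_back))
print(c(p_direct, p_via_z))
print(height_at_mean)
```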
7.3 Some Applications
Example 7.6
The amount of cosmic radiation to which a person is exposed while flying by jet across the United States is a random variable having a normal distribution with µ = 4.35 mrem and σ = 0.59 mrem. Find the probabilities that a person on such a flight will be exposed to (a) more than 5.00 mrem of cosmic radiation, (b) anywhere from 3.00 to 4.00 mrem of cosmic radiation.
(a) µ = 4.35 mrem, σ = 0.59 mrem, x = 5.00 → z = (x - µ)/σ = (5.00 - 4.35)/0.59 = 1.101695
Using R: pnorm(1.101695) = 0.8647029 → 0.8647029 - 0.5 = 0.3647029 (area under the curve between the mean and z)
Since we are looking for values more than 5.00 mrem → 0.5 - 0.3647029 = 0.1352971
Alternative (faster) solution: 1 - pnorm(1.101695) = 0.1352972 ≈ 0.1352971 (the last digit differs only because of rounding)
(b) µ = 4.35 mrem, σ = 0.59 mrem, x = 3.00 and 4.00 → z = (x - µ)/σ
z(3.00) = (3.00 - 4.35)/0.59 = -2.288136, z(4.00) = (4.00 - 4.35)/0.59 = -0.5932203
Using R: pnorm(-2.288136) = 0.01106481, pnorm(-0.5932203) = 0.2765169
pnorm(-0.5932203) - pnorm(-2.288136) = 0.2765169 - 0.01106481 = 0.2654521
* Remember that the pnorm function in R uses -∞ as its lower limit (it accumulates area from the left).
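Example 7.6 can also be done without computing z by hand, by passing the mean and standard deviation straight to `pnorm`. A minimal sketch; the variable names are just for this illustration.

```r
mu <- 4.35     # mean exposure in mrem
sigma <- 0.59  # standard deviation in mrem

# (a) P(X > 5.00): use the upper tail directly
p_a <- pnorm(5.00, mean = mu, sd = sigma, lower.tail = FALSE)

# (b) P(3.00 < X < 4.00): difference of two left-tail areas
p_b <- pnorm(4.00, mean = mu, sd = sigma) - pnorm(3.00, mean = mu, sd = sigma)

print(round(c(p_a, p_b), 4))
```

Both results should agree with the z score route above to about four decimal places.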
Example 7.7
The actual amount of instant coffee that a machine puts into “4 oz” jars varies from jar to jar, and it may be looked upon as a random variable having a normal distribution with σ = 0.04 oz. If only 2% of the jars are to contain less than 4 oz of coffee, what must be the mean fill of these jars?
σ = 0.04 oz, pnorm(z) = 0.02, z = (x - µ)/σ
Using R: To work backwards from the 2% to the z score → qnorm(0.02) = -2.053749
-2.053749 = (4.00 - µ)/0.04 → 4.00 - µ = -2.053749*0.04 = -0.08215 → µ = 4.00 + 0.08215 = 4.08215 oz
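The same calculation in R, with a check that the resulting mean really leaves 2% of jars under 4 oz. A minimal sketch; the variable names are illustrative.

```r
sigma <- 0.04   # standard deviation of the fill, in oz
x <- 4.00       # label weight, in oz
p_under <- 0.02 # allowed proportion of jars below the label weight

# z score with 2% of the area to its left
z <- qnorm(p_under)

# Solve z = (x - mu) / sigma for mu
mu <- x - z * sigma

# Check: with this mean, P(X < 4.00) should come back as 0.02
check <- pnorm(x, mean = mu, sd = sigma)

print(round(mu, 5))
print(check)
```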
The normal distribution is a continuous distribution that applies to continuous random variables, but it can also be used to approximate distributions of random variables that take on only a finite number of values.
Normal distribution ~ finite #s distribution when a continuity correction is applied.
*** Continuity correction is NOT used as much at present due to the power of computers.
Continuity correction = the adjustment made when working with the normal distribution as an approximation to the binomial distribution. Each whole number k is treated as the interval from k - 0.5 to k + 0.5, so statements are adjusted for rounding:
< (less than, under, below, fewer than) – refers to 0.5 less than the target number (<15 → <14.5)
≤ (at most, maximum, bottom) – refers to 0.5 more than the target (≤15 → ≤15.5)
> (more than, above, over, greater than) – refers to 0.5 more than the target (>15 → >15.5)
≥ (at least, minimum, top) – refers to 0.5 less than the target (≥15 → ≥14.5)
= (equal to, exactly, half) – refers to the range target ± 0.5 (=15 → 14.5 < x < 15.5)
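The five rules can be compared against exact binomial probabilities. This sketch uses hypothetical values (n = 30 trials, p = 0.5, target k = 15) that are not from the notes, and it borrows µ = np and σ = √(np(1-p)) from Section 7.4.

```r
# Hypothetical binomial setting chosen only for illustration
n <- 30
p <- 0.5
k <- 15
mu <- n * p
sigma <- sqrt(n * p * (1 - p))

# Exact binomial probabilities for each kind of statement
exact <- c(
  "X < k"  = pbinom(k - 1, n, p),
  "X <= k" = pbinom(k, n, p),
  "X > k"  = pbinom(k, n, p, lower.tail = FALSE),
  "X >= k" = pbinom(k - 1, n, p, lower.tail = FALSE),
  "X = k"  = dbinom(k, n, p)
)

# Normal approximations using the continuity-corrected limits
normal_approx <- c(
  "X < k"  = pnorm(k - 0.5, mu, sigma),
  "X <= k" = pnorm(k + 0.5, mu, sigma),
  "X > k"  = pnorm(k + 0.5, mu, sigma, lower.tail = FALSE),
  "X >= k" = pnorm(k - 0.5, mu, sigma, lower.tail = FALSE),
  "X = k"  = pnorm(k + 0.5, mu, sigma) - pnorm(k - 0.5, mu, sigma)
)

print(round(cbind(exact, normal_approx), 4))
```

Each row pairs one statement with its corrected limit, so the two columns should be close for this choice of n and p.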
Example 7.8 In a study of aggressive behavior, male white mice, returned to the group in which they live after 4 weeks of isolation, averaged 18.6 fights in the first 5 minutes with a standard deviation of 3.3 fights. If it can be assumed that the distribution of this random variable (# of fights under the stated conditions) can be approximated closely with a normal distribution, what is the probability that such a mouse will get into at least 15 fights in the first 5 minutes?
µ = 18.6 fights, σ = 3.3 fights, x = 15 → at least 15 → x ≥ 14.5 (continuity correction), z = (x - µ)/σ
z = (14.5 - 18.6)/3.3 = -1.242424
Using R: pnorm(-1.242424) = 0.1070401 → at least 15 fights, so we want the right half of the curve (0.5) plus the area between z = -1.242424 and the mean → 1 - 0.1070401 = 0.8929599 ≈ 0.89
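Example 7.8 in one call, with the continuity correction written out explicitly. A minimal sketch; the variable names are illustrative.

```r
mu <- 18.6    # mean number of fights
sigma <- 3.3  # standard deviation of the number of fights

# "At least 15" on a count becomes "x >= 14.5" on the continuous scale
cutoff <- 15 - 0.5

# Upper-tail area beyond the corrected cutoff
p_at_least_15 <- pnorm(cutoff, mean = mu, sd = sigma, lower.tail = FALSE)

print(round(p_at_least_15, 2))
```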
* When we observe a value of a random variable having a normal distribution, we say that we are sampling a normal population.
7.4 Normal Approximation to the Binomial Distribution
Normal distribution ~ binomial distribution WHEN:
- n (# of trials) is large (rule of thumb: np > 5 and n(1-p) > 5)
- p (probability of success) is ~ 0.5 (½)
The normal distribution w/ µ = np and σ = √(np(1-p)) can often be used to approximate the binomial distribution (b/c the two have a similar shape), even when n is fairly small and p differs from 0.5, as long as the rule of thumb above holds.
The normal curve approximation to the binomial distribution is useful in problems where we would otherwise have to use the formula for the binomial distribution repeatedly to obtain the values of many different terms.
Example 7.11 What is the probability that at least 26 of 50 mosquitoes will be killed by a new insect spray when the probability is 0.60 that any one of them will be killed by the spray?
Using the binomial distribution calculation: $P(X \ge 26) = \sum_{x=26}^{50} \binom{50}{x} (0.6)^x (0.4)^{50-x}$
Using R: sum(dbinom(26:50,50,0.6)) = 0.9021926 ≈ 0.902
Normal curve approximation is OK b/c np > 5 and n(1-p) > 5: (50*0.6) = 30 > 5, (50*0.4) = 20 > 5
At least 26 → continuity correction → 25.5
µ = np = (50)(0.6) = 30, σ = √(np(1-p)) = √(30*0.4) = √12 = 3.464102
z = (x - µ)/σ = (25.5 - 30)/3.464102 = -1.299038
Using R: pnorm(-1.299038) ≈ 0.09697 (This is the area under the curve to the left of the 25.5 value)
To get the probability for values greater than 25.5 → 1 - 0.09697 ≈ 0.9030, close to the exact binomial value of 0.902
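To see how close the approximation is for Example 7.11, the exact binomial sum and the continuity-corrected normal area can be computed side by side. A minimal sketch that lets R compute σ = √(np(1-p)) rather than typing it in; the variable names are illustrative.

```r
n <- 50   # number of mosquitoes
p <- 0.6  # probability that any one mosquito is killed

# Exact binomial probability of at least 26 kills
exact <- sum(dbinom(26:50, size = n, prob = p))

# Same exact value via the upper tail of the binomial CDF
exact_tail <- pbinom(25, size = n, prob = p, lower.tail = FALSE)

# Normal approximation with continuity correction (at least 26 -> 25.5)
mu <- n * p
sigma <- sqrt(n * p * (1 - p))
normal_approx <- pnorm(25.5, mean = mu, sd = sigma, lower.tail = FALSE)

print(round(c(exact = exact, normal_approx = normal_approx), 3))
print(abs(exact - normal_approx))  # size of the approximation error
```

Both values round to about 0.90, which is why the normal approximation is considered adequate here.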
We now have 4 ways of determining binomial probabilities:
1. Computer programs or printouts
2. Formula for the binomial distribution
3. Poisson approximation of the binomial distribution
4. Normal approximation of the binomial distribution