
Understanding Normal Distribution and Central Limit Theorem
Delve into the concepts of normal distribution, including the Central Limit Theorem (CLT), standard normal distribution, nonstandard normal distribution, origins of the normal distribution, normal approximation to binomial, and examples illustrating these principles with real-world scenarios.
Presentation Transcript
MAT 2572 Probability w/Statistics, Halleck Day 15/18 slides: 4.3 The Normal Distribution, including the Central Limit Theorem (CLT)
Standard normal distribution Day 9: introduced the PDF $f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ for the STANDARD normal distribution (standard has mean 0 and SD 1). Clearly $f_Z(z) \ge 0$. To show that the area under the curve $f_Z(z)$ equals 1, take the product of 2 copies of the integral and evaluate it as a double integral, transforming to polar coordinates. The details can be found in the video @27 min.
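The polar-coordinate argument can be sketched as follows. The name $I$ for the unnormalized integral is introduced here only for illustration and is not taken from the slides.

```latex
\begin{aligned}
I   &= \int_{-\infty}^{\infty} e^{-z^2/2}\,dz, \\
I^2 &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy
     = \int_{0}^{2\pi}\int_{0}^{\infty} e^{-r^2/2}\, r\,dr\,d\theta \\
    &= 2\pi\left[-e^{-r^2/2}\right]_{0}^{\infty} = 2\pi .
\end{aligned}
```

Hence $I = \sqrt{2\pi}$, which is why dividing by $\sqrt{2\pi}$ makes the total area equal to 1.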
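The nonstandard normal $Y = \sigma Z + \mu$ Recall: if $W = aX + b$ with $a > 0$, then $f_W(w) = \frac{1}{a} f_X\left(\frac{w-b}{a}\right)$ [solve for X and substitute into the pdf of X. However, when doing so, we horizontally stretch by a, which also increases the area under the curve by a factor of a. To restore the area to 1, we have to simultaneously shrink vertically by 1/a.] Taking $a = \sigma$ and $b = \mu$ gives $f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2}$.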
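As a quick sanity check of the stretch-and-shrink rule, the sketch below builds the nonstandard pdf from the standard one and compares it with Python's `statistics.NormalDist`. The function names and the values of `mu` and `sigma` are arbitrary choices made for this illustration, not part of the slides.

```python
import math
from statistics import NormalDist


def standard_pdf(z: float) -> float:
    """PDF of the standard normal distribution (mean 0, SD 1)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def shifted_scaled_pdf(y: float, mu: float, sigma: float) -> float:
    """PDF of Y = sigma*Z + mu via the change-of-variable rule (sigma > 0)."""
    return standard_pdf((y - mu) / sigma) / sigma


# Arbitrary illustrative parameters (an assumption, not from the slides).
mu, sigma = 62.0, 7.6
for y in (50.0, 62.0, 75.0):
    ours = shifted_scaled_pdf(y, mu, sigma)
    reference = NormalDist(mu, sigma).pdf(y)
    print(f"y={y}: formula={ours:.6f}, NormalDist={reference:.6f}")
```

The two columns should agree up to floating-point rounding, since both evaluate the same density.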
Origins of the normal distribution Like Poisson, it was originally discovered as an approximation to the binomial, this time with large n and p not too large or small (say .1<p<.9, although the range of allowed values of p grows as n grows).
Example of normal approx. to binomial There are 1000 students at City Tech registered Independent and 33% are expected to vote in the upcoming election. What is the chance that fewer than 30% of them will actually vote? $p = 0.33$, $\mu = np = 1000(0.33) = 330$ and $\sigma = \sqrt{npq} = \sqrt{1000(0.33)(0.67)} = 14.9$. Let X represent the number of Independent students who vote. $P(X < 300) = P(-\infty < X < 300) = P(-\infty < X - 330 < 300 - 330) = P(-\infty < X - 330 < -30) = P\left(-\infty < \frac{X - 330}{14.9} < \frac{-30}{14.9}\right) \approx P(-\infty < Z < -2.01) = 2.2\%$. Hence, there is a 2.2% chance that fewer than 30% will vote.
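The same calculation can be reproduced with Python's standard library. This is a minimal sketch that uses `statistics.NormalDist` in place of a z-table; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

n, p = 1000, 0.33
mu = n * p                          # expected number of voters, np
sigma = math.sqrt(n * p * (1 - p))  # standard deviation, sqrt(npq)

# Normal approximation to P(X < 300), without a continuity correction.
z = (300 - mu) / sigma
prob = NormalDist().cdf(z)
print(f"z = {z:.2f}, P(X < 300) is approximately {prob:.3f}")
```

Because the code keeps more digits of the standard deviation than the hand calculation, the z-value can differ from the slide's in the second decimal place while the probability still rounds to about 2.2%.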
Continuity Correction There is a refinement that improves the accuracy of the approximation and is especially useful if n is not large: each integer value k of the binomial is treated as the interval from k - 0.5 to k + 0.5 under the normal curve. This is equivalent to the midpoint rule. Note the error in the picture: the curve should go through the midpoint of the top of each rectangle.
Example 4.3.1: Overbooking on Airlines An aircraft has 168 seats. On average only 90% of all ticket holders actually show up. The airline sells 178 tickets. What is the chance that not everyone who arrives at the gate can be accommodated? Since our n=178 is not very large, we will use the continuity correction. $p = 0.9$, $\mu = np = 178(0.9) = 160.2$ and $\sigma = \sqrt{npq} = \sqrt{178(0.9)(0.1)} = 4.00$. Let X represent the number of ticket holders who show up. $P(X \ge 169) = P(X \ge 168.5) = P(168.5 \le X < \infty) = P(168.5 - 160.2 \le X - 160.2 < \infty) = P(8.3 \le X - 160.2 < \infty) = P\left(\frac{8.3}{4} \le \frac{X - 160.2}{4} < \infty\right) \approx P(2.07 \le Z < \infty) = 1.9\%$. We get 1.3% when using the binomial distribution directly: =1-BINOM.DIST(168,178,0.9,TRUE). The discrepancy can be ascribed to the somewhat low n and high p.
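The comparison between the corrected normal approximation and the direct binomial calculation can be sketched in Python. The helper `binom_pmf` is a hypothetical name written for this illustration; it plays the role that BINOM.DIST plays in the spreadsheet formula.

```python
import math
from statistics import NormalDist


def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p, seats = 178, 0.9, 168
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Normal approximation with continuity correction: P(X >= 169) ~ P(X >= 168.5).
approx = 1 - NormalDist(mu, sigma).cdf(seats + 0.5)

# Exact binomial tail: sum of P(X = k) for k = 169, ..., 178.
exact = sum(binom_pmf(k, n, p) for k in range(seats + 1, n + 1))

print(f"normal approximation: {approx:.3f}")
print(f"exact binomial:       {exact:.3f}")
```

The two printed values should round to roughly 1.9% and 1.3% respectively.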
Going backwards: from probability to z-value to x-value A sell-out crowd of 45,000 is expected at Citi Field. The ballpark's concession manager is trying to decide how much food to have on hand. She knows that, on average, 38% of all those in attendance will buy a hot dog. How large an order should she place if she wants to have no more than a 20% chance of demand exceeding supply? $\mu = np = 45{,}000(0.38) = 17{,}100$ and $\sigma = \sqrt{npq} = \sqrt{45{,}000(0.38)(0.62)} = 103$. The z-value with 20% of the area to its right is 0.842. Now we destandardize: $x = \sigma z + \mu = 103(0.842) + 17{,}100 = 17{,}187$.
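Going from a probability back to an x-value amounts to evaluating the inverse of the normal CDF. The sketch below uses `NormalDist.inv_cdf` from Python's standard library instead of a reverse table lookup; rounding the order up to a whole hot dog is an assumption made here.

```python
import math
from statistics import NormalDist

n, p = 45_000, 0.38
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# A 20% chance that demand exceeds supply puts the supply at the 80th percentile.
z = NormalDist().inv_cdf(0.80)
order = mu + z * sigma  # destandardize: x = sigma * z + mu
print(f"z = {z:.4f}, order about {math.ceil(order)} hot dogs")
```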
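Central Limit Theorem Let $W_1, W_2, \ldots$ be an infinite sequence of independent random variables, each with the same pdf $f_W(w)$ with mean $\mu$ and variance $\sigma^2$, both finite. For any numbers a and b, $\lim_{n\to\infty} P\left(a \le \frac{W_1 + \cdots + W_n - n\mu}{\sqrt{n}\,\sigma} \le b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-z^2/2}\,dz$, or equivalently, $\lim_{n\to\infty} P\left(a \le \frac{\bar{W} - \mu}{\sigma/\sqrt{n}} \le b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-z^2/2}\,dz$. We will call $S = W_1 + W_2 + \cdots + W_n$. $\bar{W} = S/n$ is known as the Sample Mean and $\sigma/\sqrt{n}$ is known as the Standard Error.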
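A small simulation can make the theorem concrete. The sketch below draws many sample means from a skewed distribution, standardizes each with the standard error, and compares the fraction landing between -1 and 1 with the normal prediction. The choice of an exponential distribution, the sample size of 30, the number of trials, and the seed are all assumptions made for this illustration.

```python
import math
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

n = 30            # sample size behind each sample mean
trials = 20_000   # number of simulated sample means
mu = sigma = 1.0  # mean and SD of an exponential distribution with rate 1

standard_error = sigma / math.sqrt(n)
inside = 0
for _ in range(trials):
    sample_mean = sum(random.expovariate(1.0) for _ in range(n)) / n
    z = (sample_mean - mu) / standard_error
    if -1.0 <= z <= 1.0:
        inside += 1

simulated = inside / trials
predicted = NormalDist().cdf(1.0) - NormalDist().cdf(-1.0)
print(f"simulated P(-1 <= Z <= 1): {simulated:.3f}")
print(f"CLT prediction:            {predicted:.3f}")
```

Even though a single exponential draw is far from normal, the standardized sample mean should already land in the interval about as often as a standard normal variable does.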
Exercise 4.3.16 Suppose that 100 fair dice are tossed. Estimate the probability that the sum of the faces showing exceeds 370. Include a continuity correction in your analysis. For one die, $\mu = 3.5$ and $\sigma = \sqrt{(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2)/6 - 3.5^2} = 1.71$. This is a sum problem: let $S$ denote the sum of the 100 faces. Hence, we calculate $\sigma_S = \sigma\sqrt{n} = 1.71\sqrt{100} = 17.1$. Also $\mu_S = n\mu = 100(3.5) = 350$. $P(S > 370) = P(S \ge 371) = P\left(\frac{S - 350}{17.1} \ge \frac{370.5 - 350}{17.1}\right) \approx P(Z \ge 1.20) = 11.5\%$.
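For dice, the exact distribution of the sum is small enough to compute directly, which gives a way to judge the estimate. The sketch below builds the exact distribution by repeated convolution and compares it with the normal approximation using the continuity correction; the convolution approach and all variable names are additions made here, not part of the exercise.

```python
import math
from statistics import NormalDist

n_dice, threshold = 100, 370

# Exact distribution of the sum by repeated convolution (dynamic programming).
# dist[s] holds P(sum == s) for the dice tossed so far.
dist = {0: 1.0}
for _ in range(n_dice):
    new_dist = {}
    for total, prob in dist.items():
        for face in range(1, 7):
            new_dist[total + face] = new_dist.get(total + face, 0.0) + prob / 6
    dist = new_dist

exact = sum(prob for total, prob in dist.items() if total > threshold)

# Normal approximation with continuity correction: P(S > 370) ~ P(S >= 370.5).
mu_one = 3.5
sigma_one = math.sqrt(sum(f * f for f in range(1, 7)) / 6 - mu_one**2)
mu_sum = n_dice * mu_one
sigma_sum = sigma_one * math.sqrt(n_dice)
approx = 1 - NormalDist(mu_sum, sigma_sum).cdf(threshold + 0.5)

print(f"exact:  {exact:.4f}")
print(f"approx: {approx:.4f}")
```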
Example 4.3.5 Breath analyzer measurements are normal with $\mu$ = the person's true alcohol concentration and $\sigma = 0.004\%$. A driver is stopped at a roadblock. He has a true alcohol concentration of 0.075%, just under the legal limit of 0.08%. If he takes a breath analyzer test, what are the chances that he will incorrectly get a DWI? $z = (0.08 - 0.075)/0.004 = 1.25$ and $P(Z > 1.25) = 10.56\%$. So the driver has roughly an 11% chance of getting the DWI.
Exercise 4.3.2: norming scores The aptitude test for a position shows a gender bias: Scores for men are normally distributed with $\mu = 62.0$ and $\sigma = 7.6$, while scores for women are normally distributed with $\mu = 76.3$ and $\sigma = 10.8$. Laura and Michael are the two candidates vying for the position: Laura has scored 92 on the test and Michael 75. If the company decides to norm the scores, whom should they hire? $z_M = (75 - 62)/7.6 = 1.71$ and $z_L = (92 - 76.3)/10.8 = 1.45$. Michael's normed score is higher than Laura's, so Michael should be hired.
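Norming a score is a one-line standardization. The sketch below wraps it in a small helper, `z_score`, which is a hypothetical name introduced for this illustration.

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Number of standard deviations that score lies above the group mean."""
    return (score - mean) / sd


z_michael = z_score(75, mean=62.0, sd=7.6)
z_laura = z_score(92, mean=76.3, sd=10.8)
print(f"Michael: {z_michael:.2f}, Laura: {z_laura:.2f}")
```

Comparing z-scores rather than raw scores places each candidate relative to their own reference group, which is what norming means in this exercise.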