Linear Regression and Correlation Explanations


Explore linear regression and correlation: the relationship between explanatory and response variables in a linear model, least squares estimation, parameter calculations, and a worked example on the pharmacodynamics of LSD.

  • Regression
  • Correlation
  • Pharmacodynamics
  • LSD
  • Estimation


Presentation Transcript


  1. Chapter 11 Linear Regression and Correlation

  2. Linear Regression and Correlation
  Explanatory and response variables are numeric. The relationship between the mean of the response variable and the level of the explanatory variable is assumed to be approximately linear (straight line).
  Model: $Y = \beta_0 + \beta_1 x + \varepsilon$, where $\varepsilon \sim N(0, \sigma^2)$.
  $\beta_1 > 0$: positive association; $\beta_1 < 0$: negative association; $\beta_1 = 0$: no association.

  3. Least Squares Estimation of $\beta_0$, $\beta_1$
  $\beta_0$ — mean response when $x = 0$ (y-intercept). $\beta_1$ — change in mean response when $x$ increases by 1 unit (slope). $\beta_0$, $\beta_1$ are unknown parameters (like $\mu$). $\beta_0 + \beta_1 x$ — mean response when the explanatory variable takes on the value $x$.
  Goal: choose values (estimates) $\hat{\beta}_0$, $\hat{\beta}_1$ that minimize the sum of squared errors (SSE) of observed values about the straight line:
  $SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left( y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i) \right)^2$
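
To make the minimization concrete, here is a minimal Python/numpy sketch (not part of the original slides) of the closed-form estimates that minimize SSE; x and y stand for any paired numeric data.

    # A minimal sketch (not from the slides) of the closed-form least squares
    # estimates; x and y are any paired numeric arrays.
    import numpy as np

    def least_squares_fit(x, y):
        """Return (b0, b1) minimizing SSE = sum((y - (b0 + b1*x))**2)."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        return b0, b1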

  4. Example - Pharmacodynamics of LSD
  Response (y) - math score (mean among 5 volunteers). Predictor (x) - LSD tissue concentration (mean of 5 volunteers). Raw data and scatterplot of score vs LSD concentration:

  Score (y):     78.93  58.20  67.47  37.47  45.65  32.92  29.97
  LSD Conc (x):   1.17   2.97   3.26   4.69   5.83   6.00   6.41

  [Scatterplot: SCORE (y-axis) vs LSD_CONC (x-axis)]
  Source: Wagner, et al (1968)

  5. Least Squares Computations
  Summary calculations:
  $S_{xx} = \sum (x - \bar{x})^2$, $\quad S_{xy} = \sum (x - \bar{x})(y - \bar{y})$, $\quad S_{yy} = \sum (y - \bar{y})^2$
  Parameter estimates:
  $\hat{\beta}_1 = \dfrac{S_{xy}}{S_{xx}}$, $\quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
  $SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2 = S_{yy} - \dfrac{S_{xy}^2}{S_{xx}}$, $\quad s_e^2 = \dfrac{SSE}{n-2}$

  6. Example - Pharmacodynamics of LSD

  Score (y)    LSD Conc (x)   x - x̄     y - ȳ      (x-x̄)²          (x-x̄)(y-ȳ)        (y-ȳ)²
  78.93        1.17           -3.163    28.843     10.004569        -91.230409         831.918649
  58.20        2.97           -1.363     8.113      1.857769        -11.058019          65.820769
  67.47        3.26           -1.073    17.383      1.151329        -18.651959         302.168689
  37.47        4.69            0.357   -12.617      0.127449         -4.504269         159.188689
  45.65        5.83            1.497    -4.437      2.241009         -6.642189          19.686969
  32.92        6.00            1.667   -17.167      2.778889        -28.617389         294.705889
  29.97        6.41            2.077   -20.117      4.313929        -41.783009         404.693689
  Sum: 350.61  30.33          -0.001     0.001     22.474943 (Sxx)  -202.487243 (Sxy)  2078.183343 (Syy)
  (Column totals given in bottom row of table)

  $\bar{y} = \dfrac{350.61}{7} = 50.087$, $\quad \bar{x} = \dfrac{30.33}{7} = 4.333$
  $\hat{\beta}_1 = \dfrac{-202.4872}{22.4749} = -9.01$, $\quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 50.09 - (-9.01)(4.33) = 89.10$
  Fitted equation: $\hat{y} = 89.10 - 9.01x$, $\quad s_e^2 = 50.72$
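
The slide-6 summary calculations can be reproduced directly; the short Python/numpy sketch below (not part of the original slides) uses the raw LSD data from slide 4.

    # Reproducing the slide-6 summary calculations for the LSD example.
    import numpy as np

    y = np.array([78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97])  # math scores
    x = np.array([1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41])         # LSD concentrations

    Sxx = np.sum((x - x.mean()) ** 2)                # ~22.475
    Sxy = np.sum((x - x.mean()) * (y - y.mean()))    # ~-202.487
    Syy = np.sum((y - y.mean()) ** 2)                # ~2078.183

    b1 = Sxy / Sxx                                   # ~-9.01
    b0 = y.mean() - b1 * x.mean()                    # ~89.10
    SSE = Syy - Sxy ** 2 / Sxx                       # ~253.89
    s2_e = SSE / (len(y) - 2)                        # ~50.8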

  7. SPSS Output and Plot of Equation

  Coefficients(a)
                Unstandardized Coefficients     Standardized Coefficients
  Model 1       B         Std. Error            Beta                        t        Sig.
  (Constant)    89.124    7.048                                             12.646   .000
  LSD_CONC      -9.009    1.503                 -.937                       -5.994   .002
  a. Dependent Variable: SCORE

  [Plot: Math score vs LSD concentration (SPSS), with fitted line score = 89.12 - 9.01*lsd_conc, R-Square = 0.88]

  8. Inference Concerning the Slope ($\beta_1$)
  Parameter: slope in the population model ($\beta_1$). Estimator: least squares estimate $\hat{\beta}_1$. Estimated standard error: $SE_{\hat{\beta}_1} = \dfrac{s_e}{\sqrt{S_{xx}}}$.
  Methods of making inference regarding the population: hypothesis tests (2-sided or 1-sided) and confidence intervals.

  9. Hypothesis Test for $\beta_1$
  2-Sided Test: $H_0: \beta_1 = 0$ vs $H_A: \beta_1 \neq 0$
  T.S.: $t_{obs} = \dfrac{\hat{\beta}_1}{s_e/\sqrt{S_{xx}}}$, $\quad$ R.R.: $|t_{obs}| \geq t_{\alpha/2,\, n-2}$, $\quad$ P-value: $2P(t \geq |t_{obs}|)$
  1-Sided Tests: $H_0: \beta_1 = 0$ vs $H_A^{+}: \beta_1 > 0$ or $H_A^{-}: \beta_1 < 0$
  T.S.: $t_{obs} = \dfrac{\hat{\beta}_1}{s_e/\sqrt{S_{xx}}}$
  $H_A^{+}$: R.R.: $t_{obs} \geq t_{\alpha,\, n-2}$, P-value: $P(t \geq t_{obs})$; $\quad H_A^{-}$: R.R.: $t_{obs} \leq -t_{\alpha,\, n-2}$, P-value: $P(t \leq t_{obs})$
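
As a check on the arithmetic, the 2-sided slope test can be run in a few lines of Python (not part of the original slides); the numbers below are the slide-11 summary quantities, and scipy supplies the t distribution.

    # Sketch of the 2-sided t-test for H0: beta1 = 0 (LSD-example quantities).
    import numpy as np
    from scipy import stats

    n, b1, s_e, Sxx = 7, -9.01, 7.12, 22.475
    se_b1 = s_e / np.sqrt(Sxx)                          # ~1.50
    t_obs = b1 / se_b1                                  # ~-6.0
    p_two_sided = 2 * stats.t.sf(abs(t_obs), df=n - 2)  # ~0.002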

  10. (1-α)100% Confidence Interval for $\beta_1$
  $\hat{\beta}_1 \pm t_{\alpha/2,\, n-2}\, SE_{\hat{\beta}_1} = \hat{\beta}_1 \pm t_{\alpha/2,\, n-2} \dfrac{s_e}{\sqrt{S_{xx}}}$
  Conclude a positive association if the entire interval is above 0. Conclude a negative association if the entire interval is below 0. Cannot conclude an association if the interval contains 0. The conclusion based on the interval is the same as that of the 2-sided hypothesis test.

  11. Example - Pharmacodynamics of LSD
  $n = 7$, $\quad \hat{\beta}_1 = -9.01$, $\quad s_e = \sqrt{50.72} = 7.12$, $\quad S_{xx} = 22.475$
  $SE_{\hat{\beta}_1} = \dfrac{7.12}{\sqrt{22.475}} = 1.50$
  Testing $H_0: \beta_1 = 0$ vs $H_A: \beta_1 \neq 0$: T.S.: $t_{obs} = \dfrac{-9.01}{1.50} = -6.01$, $\quad$ R.R.: $|t_{obs}| \geq t_{.025,\,5} = 2.571$
  95% Confidence Interval for $\beta_1$: $-9.01 \pm 2.571(1.50) = -9.01 \pm 3.86 = (-12.87,\, -5.15)$
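
The corresponding 95% confidence interval can be computed the same way; this Python sketch (not part of the original slides) reuses the slide-11 quantities.

    # Sketch of the 95% confidence interval for the slope (slide-11 numbers).
    import numpy as np
    from scipy import stats

    n, alpha = 7, 0.05
    b1, s_e, Sxx = -9.01, 7.12, 22.475
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)   # 2.571
    half_width = t_crit * s_e / np.sqrt(Sxx)        # ~3.86
    ci = (b1 - half_width, b1 + half_width)         # ~(-12.87, -5.15)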

  12. Confidence Interval for Mean When $x = x^*$
  Mean response at a specific level $x^*$: $E(y \mid x = x^*) = \mu_{y|x^*} = \beta_0 + \beta_1 x^*$
  Estimated mean response and its standard error (replacing the unknown $\beta_0$ and $\beta_1$ with estimates):
  $\hat{\mu}_{y|x^*} = \hat{\beta}_0 + \hat{\beta}_1 x^*$, $\quad SE_{\hat{\mu}} = s_e \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}$
  Confidence interval for the mean response:
  $\hat{\mu}_{y|x^*} \pm t_{\alpha/2,\, n-2}\, s_e \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}$
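
A small Python sketch of this interval (not part of the original slides): x_star = 4.0 is a hypothetical concentration, and the other quantities are the LSD-example estimates from the earlier slides.

    # Sketch: confidence interval for the mean response at x = x_star.
    import numpy as np
    from scipy import stats

    n, alpha = 7, 0.05
    b0, b1, s_e = 89.10, -9.01, 7.12
    xbar, Sxx = 4.333, 22.475
    x_star = 4.0                                                 # hypothetical level

    mu_hat = b0 + b1 * x_star                                    # estimated mean response
    se_mean = s_e * np.sqrt(1 / n + (x_star - xbar) ** 2 / Sxx)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    ci_mean = (mu_hat - t_crit * se_mean, mu_hat + t_crit * se_mean)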

  13. Prediction Interval of Future Response at $x = x^*$
  Response at a specific level $x^*$: $y = \beta_0 + \beta_1 x^* + \varepsilon$
  Estimated response and its standard error (replacing the unknown $\beta_0$ and $\beta_1$ with estimates):
  $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x^*$, $\quad SE_{\hat{y}} = s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}$
  Prediction interval for a future response:
  $\hat{y} \pm t_{\alpha/2,\, n-2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}$
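
For comparison with the mean-response interval, here is the matching prediction-interval sketch (not part of the original slides), again at the hypothetical x_star = 4.0; the only change is the extra "1 +" under the square root.

    # Sketch: prediction interval for a single future response at x = x_star.
    import numpy as np
    from scipy import stats

    n, alpha = 7, 0.05
    b0, b1, s_e, xbar, Sxx, x_star = 89.10, -9.01, 7.12, 4.333, 22.475, 4.0

    y_hat = b0 + b1 * x_star
    se_pred = s_e * np.sqrt(1 + 1 / n + (x_star - xbar) ** 2 / Sxx)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)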

  14. Correlation Coefficient
  Measures the strength of the linear association between two variables. Takes on the same sign as the slope estimate from the linear regression. Not affected by linear transformations of y or x. Does not distinguish between the dependent and independent variable (e.g. height and weight).
  Population parameter: $\rho_{yx}$
  Pearson's correlation coefficient: $r_{yx} = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$, $\quad -1 \leq r_{yx} \leq 1$

  15. Correlation Coefficient
  Values close to 1 in absolute value imply a strong linear association, positive or negative depending on the sign. Values close to 0 imply little or no association. If the data contain outliers (are non-normal), Spearman's coefficient of correlation can be computed based on the ranks of the x and y values. The test of $H_0: \rho_{yx} = 0$ is equivalent to the test of $H_0: \beta_1 = 0$.
  Coefficient of determination ($r_{yx}^2$) - proportion of variation in y explained by the regression on x:
  $r_{yx}^2 = (r_{yx})^2 = \dfrac{SS(\text{Total}) - SS(\text{Residual})}{SS(\text{Total})} = \dfrac{S_{yy} - SSE}{S_{yy}}$, $\quad 0 \leq r_{yx}^2 \leq 1$

  16. Example - Pharmacodynamics of LSD
  $S_{xx} = 22.475$, $\quad S_{xy} = -202.487$, $\quad S_{yy} = 2078.183$, $\quad SSE = 253.89$
  $r_{yx} = \dfrac{-202.487}{\sqrt{(22.475)(2078.183)}} = -0.94$
  $r_{yx}^2 = \dfrac{2078.183 - 253.89}{2078.183} = 0.88 = (-0.94)^2$
  [Plots (SPSS): fitted line score = 89.12 - 9.01*lsd_conc with R-Square = 0.88, and horizontal mean line at Mean = 50.09]
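
These two quantities follow directly from the summary sums of squares; a short Python check (not part of the original slides):

    # Reproducing the slide-16 correlation calculations.
    import numpy as np

    Sxx, Sxy, Syy, SSE = 22.475, -202.487, 2078.183, 253.89
    r = Sxy / np.sqrt(Sxx * Syy)     # Pearson's r, ~-0.94
    r2 = (Syy - SSE) / Syy           # coefficient of determination, ~0.88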

  17. Example - SPSS Output
  [SPSS output: Pearson's and Spearman's correlation coefficients between SCORE and LSD_CONC; both correlations are significant at the 0.01 level (2-tailed).]

  18. Hypothesis Test for $\rho_{yx}$
  2-Sided Test: $H_0: \rho_{yx} = 0$ vs $H_A: \rho_{yx} \neq 0$
  T.S.: $t_{obs} = \dfrac{r_{yx}\sqrt{n-2}}{\sqrt{1 - r_{yx}^2}}$, $\quad$ R.R.: $|t_{obs}| \geq t_{\alpha/2,\, n-2}$, $\quad$ P-value: $2P(t \geq |t_{obs}|)$
  1-Sided Tests: $H_0: \rho_{yx} = 0$ vs $H_A^{+}: \rho_{yx} > 0$ or $H_A^{-}: \rho_{yx} < 0$
  T.S.: $t_{obs} = \dfrac{r_{yx}\sqrt{n-2}}{\sqrt{1 - r_{yx}^2}}$
  $H_A^{+}$: R.R.: $t_{obs} \geq t_{\alpha,\, n-2}$, P-value: $P(t \geq t_{obs})$; $\quad H_A^{-}$: R.R.: $t_{obs} \leq -t_{\alpha,\, n-2}$, P-value: $P(t \leq t_{obs})$
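
A Python sketch of this test with the LSD-example correlation (not part of the original slides); in exact arithmetic it agrees with the slope test of slide 11, the small difference here comes from rounding r to -0.94.

    # Sketch of the t-test for H0: rho = 0 (LSD-example correlation).
    import numpy as np
    from scipy import stats

    n, r = 7, -0.94
    t_obs = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)     # ~-6.2
    p_two_sided = 2 * stats.t.sf(abs(t_obs), df=n - 2)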

  19. Analysis of Variance in Regression
  Goal: partition the total variation in y into variation explained by x and random variation.
  $(y_i - \bar{y}) = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$
  $\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
  These three sums of squares and degrees of freedom are: Total (TSS), $DF_T = n-1$; Error (SSE), $DF_E = n-2$; Model (SSR), $DF_R = 1$.

  20. Analysis of Variance for Regression

  Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square        F
  Model                 SSR              1                    MSR = SSR/1        F = MSR/MSE
  Error                 SSE              n-2                  MSE = SSE/(n-2)
  Total                 TSS              n-1

  Analysis of Variance F-test: $H_0: \beta_1 = 0$ vs $H_A: \beta_1 \neq 0$
  T.S.: $F_{obs} = \dfrac{MSR}{MSE}$, $\quad$ R.R.: $F_{obs} \geq F_{\alpha,\, 1,\, n-2}$, $\quad$ P-value: $P(F \geq F_{obs})$

  21. Example - Pharmacodynamics of LSD
  Total sum of squares: $TSS = \sum (y_i - \bar{y})^2 = 2078.183$, $\quad DF_T = 7 - 1 = 6$
  Error sum of squares: $SSE = \sum (y_i - \hat{y}_i)^2 = 253.890$, $\quad DF_E = 7 - 2 = 5$
  Model sum of squares: $SSR = \sum (\hat{y}_i - \bar{y})^2 = 2078.183 - 253.890 = 1824.293$, $\quad DF_R = 1$

  22. Example - Pharmacodynamics of LSD

  Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
  Model                 1824.293         1                    1824.293      35.93
  Error                 253.890          5                    50.778
  Total                 2078.183         6

  Analysis of Variance F-test: $H_0: \beta_1 = 0$ vs $H_A: \beta_1 \neq 0$
  T.S.: $F_{obs} = \dfrac{MSR}{MSE} = 35.93$, $\quad$ R.R.: $F_{obs} \geq F_{.05,\, 1,\, 5} = 6.61$, $\quad$ P-value: $P(F \geq 35.93)$
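
The same F-test can be verified with scipy's F distribution; this Python sketch (not part of the original slides) uses the slide-21 sums of squares.

    # Reproducing the slide-22 ANOVA F-test.
    from scipy import stats

    TSS, SSE, n = 2078.183, 253.890, 7
    SSR = TSS - SSE                    # 1824.293
    MSR, MSE = SSR / 1, SSE / (n - 2)  # 1824.293, 50.778
    F_obs = MSR / MSE                  # ~35.9
    p_value = stats.f.sf(F_obs, dfn=1, dfd=n - 2)  # compare with F(.05, 1, 5) = 6.61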

  23. Example - SPSS Output
  [SPSS ANOVA table for the regression of SCORE on LSD_CONC: sums of squares, df, mean squares, F and Sig. Predictors: (Constant), LSD_CONC; Dependent Variable: SCORE.]

  24. Linearity of Regression (SLR) - Test for Lack-of-Fit
  ($c$ = number of distinct levels of $X$; $n_j$ = number of observations at level $j$)
  $H_0: E(Y_i) = \beta_0 + \beta_1 X_i$ $\quad$ vs $\quad$ $H_A: E(Y_i) \neq \beta_0 + \beta_1 X_i$
  Compute the fitted value $\hat{Y}_j$ and the sample mean $\bar{Y}_j$ for each distinct level $X_j$.
  Lack-of-Fit: $SS(LF) = \sum_{j=1}^{c} n_j (\bar{Y}_j - \hat{Y}_j)^2$, $\quad df_{LF} = c - 2$
  Pure Error: $SS(PE) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2$, $\quad df_{PE} = n - c$
  Test statistic: $F_{LOF} = \dfrac{SS(LF)/(c-2)}{SS(PE)/(n-c)} = \dfrac{MS(LF)}{MS(PE)} \sim F_{c-2,\, n-c}$
  Reject $H_0$ if $F_{LOF} \geq F_{1-\alpha;\, c-2,\, n-c}$
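
To illustrate the procedure, here is a Python sketch (not part of the original slides) of the lack-of-fit F-test for hypothetical data with replicate observations at distinct x levels; the LSD data have no replicates, so this is for illustration only.

    # Sketch of the lack-of-fit F-test for data with replicates at distinct x levels.
    import numpy as np
    from scipy import stats

    def lack_of_fit_test(x, y):
        x, y = np.asarray(x, float), np.asarray(y, float)
        n = len(y)
        # simple linear regression fit
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        levels = np.unique(x)
        c = len(levels)
        ss_lf = ss_pe = 0.0
        for xj in levels:
            yj = y[x == xj]                              # responses at this level
            ybar_j, yhat_j = yj.mean(), b0 + b1 * xj
            ss_lf += len(yj) * (ybar_j - yhat_j) ** 2    # lack-of-fit component
            ss_pe += np.sum((yj - ybar_j) ** 2)          # pure-error component
        F_lof = (ss_lf / (c - 2)) / (ss_pe / (n - c))
        p = stats.f.sf(F_lof, dfn=c - 2, dfd=n - c)
        return F_lof, p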
