Generalized Linear Models and Logistic Regression

Generalized Linear Models (GLMs) are a class of linear models consisting of random, systematic, and link function components. The random component identifies the dependent variable and its probability distribution; the systematic component specifies the explanatory variables; and the link function connects the mean of the response to a linear function of the explanatory variables. Logistic regression, a type of GLM, models the probability of a particular outcome and is commonly used for dichotomous response variables, while other GLMs accommodate counts and other response distributions.

  • GLMs
  • Logistic Regression
  • Linear Models
  • Probability Modeling
  • Data Analysis





  1. Generalized Linear Models

  2. Generalized Linear Models (GLM)
     General class of linear models made up of 3 components: Random, Systematic, and Link Function.
     Random component: identifies the dependent variable (Y) and its probability distribution.
     Systematic component: identifies the set of explanatory variables (X1, ..., Xp).
     Link function: identifies a function of the mean that is a linear function of the explanatory variables:
       g(μ) = β0 + β1X1 + ... + βpXp

  3. Random Component
     - Conditionally normally distributed response with constant standard deviation: the regression models we have fit so far.
     - Binary outcomes (Success or Failure): the random component has a Binomial distribution and the model is called Logistic Regression.
     - Count data (number of events in a fixed area and/or length of time): the random component has a Poisson distribution and the model is called Poisson Regression.
     - When count data have V(Y) > E(Y), the model fit can be Negative Binomial Regression.
     - Continuous data with a skewed distribution and variation that increases with the mean can be modeled with a Gamma distribution.

  4. Common Link Functions
     - Identity link (form used in normal regression models): g(μ) = μ
     - Log link (often used when μ cannot be negative, as when data are Poisson or Gamma): g(μ) = log(μ)
     - Logit link (used when μ is bounded between 0 and 1, as when data are binary or proportions): g(μ) = log(μ / (1 − μ))
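The three link functions above, and the inverse of the logit, can be sketched in a few lines of Python (function names here are my own, not from the slides):

```python
import math

def identity(mu):
    # Identity link: g(mu) = mu (normal regression).
    return mu

def log_link(mu):
    # Log link: g(mu) = log(mu), for means that cannot be negative.
    return math.log(mu)

def logit(mu):
    # Logit link: g(mu) = log(mu / (1 - mu)), for means in (0, 1).
    return math.log(mu / (1 - mu))

def inv_logit(eta):
    # Inverse of the logit link: maps any real eta back into (0, 1).
    return 1 / (1 + math.exp(-eta))
```

Note that `inv_logit` is exactly the logistic curve used throughout the rest of the deck.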

  5. Logistic Regression
     - Dichotomous response variable with numeric and/or categorical explanatory variable(s).
     - Goal: model the probability of a particular outcome as a function of the predictor variable(s).
     - Problem: probabilities are bounded between 0 and 1.
     - Distribution of responses: Binomial.
     - Data can be grouped or ungrouped:
       - Grouped: multiple observations observed at each of m distinct levels of the predictors.
       - Ungrouped: each observation treated as an individual case (as when the predictor(s) are continuous).
     - Link function: g(π) = log(π / (1 − π))

  6. Logistic Regression with 1 Predictor
     - Response: presence/absence of a characteristic.
     - Predictor: numeric variable x observed for each case.
     - Model: π(x) = probability of presence at predictor level x:
       π(x) = e^(β0 + β1x) / (1 + e^(β0 + β1x))
     - β1 = 0: P(Presence) is the same at each level of x.
     - β1 > 0: P(Presence) increases as x increases.
     - β1 < 0: P(Presence) decreases as x increases.
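A minimal sketch of the one-predictor model π(x), using arbitrary illustrative coefficients:

```python
import math

def pi(x, b0, b1):
    """P(presence) at predictor level x under the one-predictor logistic model."""
    eta = b0 + b1 * x
    return math.exp(eta) / (1 + math.exp(eta))

# Illustrative coefficients (made up, not from the slides):
# with b1 > 0 the probability rises with x; with b1 = 0 it is flat.
```

Evaluating `pi` over a grid of x values traces out the familiar S-shaped logistic curve, always staying inside (0, 1).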

  7. Model and Estimation
     Grouped case: m distinct levels x1, ..., xm with binomial data (y1, n1), ..., (ym, nm):
       Yi ~ B(ni, πi),  πi = e^(β0 + β1xi) / (1 + e^(β0 + β1xi)),  i = 1, ..., m
       Li(β0, β1) = p(yi) = C(ni, yi) πi^yi (1 − πi)^(ni − yi)
       Likelihood function: L(β0, β1) = Π_{i=1..m} Li(β0, β1)
       Log-likelihood: l(β0, β1) = ln L(β0, β1) = Σ_{i=1..m} [ln C(ni, yi) + yi ln πi + (ni − yi) ln(1 − πi)]
     Choose β̂0, β̂1 that maximize l(β0, β1) at value l(β̂0, β̂1). Predicted values:
       ŷi = ni π̂i = ni e^(β̂0 + β̂1xi) / (1 + e^(β̂0 + β̂1xi))
     Ungrouped case: each observation (may) have its own distinct level: m = n; ni = 1; yi = 1 or 0:
       Yi ~ B(1, πi),  Li(β0, β1) = p(yi) = πi^yi (1 − πi)^(1 − yi),  i = 1, ..., n
       Likelihood function: L(β0, β1) = Π_{i=1..n} Li(β0, β1)
       Log-likelihood: l(β0, β1) = ln L(β0, β1) = Σ_{i=1..n} [yi ln πi + (1 − yi) ln(1 − πi)]
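The "choose β0, β1 to maximize the log-likelihood" step is usually done by software via Newton-Raphson. A sketch for the ungrouped case, assuming a single numeric predictor (the solver and helper names are mine):

```python
import math

def fit_logistic(xs, ys, iters=25):
    """Maximize the ungrouped logistic log-likelihood by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        # Gradient and (negative) Hessian of l(b0, b1).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            w = p * (1 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton update: (b0, b1) += H^{-1} * gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def loglik(xs, ys, b0, b1):
    """l(b0, b1) = sum of y*ln(pi) + (1 - y)*ln(1 - pi) over the cases."""
    l = 0.0
    for x, y in zip(xs, ys):
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        l += y * math.log(p) + (1 - y) * math.log(1 - p)
    return l
```

This sketch assumes the data are not perfectly separated (otherwise the MLE does not exist and the iterations diverge), which is also why real software reports warnings in that case.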

  8. Model/Estimation: Null and Saturated Cases
     Based on grouped data, with the clear adjustment for ungrouped data.
     Null model (no association between y and x): β1 = 0, so π1 = ... = πm = π, with y = y1 + ... + ym and n = n1 + ... + nm:
       Yi ~ B(ni, π),  π = e^β0 / (1 + e^β0)
       Log-likelihood: l(β0) = ln L(β0) = Σ ln C(ni, yi) + y ln π + (n − y) ln(1 − π)
     Choose β̂0 that maximizes l(β0) at value l0 = l(β̂0). Note: π̂ = e^β̂0 / (1 + e^β̂0) = y/n.
       Predicted value for yi: ŷi = ni π̂ = ni y / n
     Saturated model: model has as many parameters as cells (m): π̂i = yi / ni:
       ls = Σ_{i=1..m} [ln C(ni, yi) + yi ln(yi/ni) + (ni − yi) ln((ni − yi)/ni)],  where 0 ln(0) ≡ 0
       Predicted value for yi: ŷi = yi

  9. Logistic Regression with 1 Predictor: Wald Test
     β0 and β1 are unknown parameters and must be estimated using statistical software such as SPSS, SAS, R, or STATA (or in a matrix language). Primary interest lies in estimating and testing hypotheses regarding β1.
     Large-sample test (Wald test): H0: β1 = 0 vs HA: β1 ≠ 0
       TS: X²obs = (β̂1 / SE(β̂1))²   RR: X²obs ≥ χ²(α; 1)   P-value: P(χ²(1) ≥ X²obs)
     Note: some software packages perform this as an equivalent Z-test:
       TS: zobs = β̂1 / SE(β̂1)   RR: |zobs| ≥ z(α/2)   P-value: 2 P(Z ≥ |zobs|)
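Both forms of the Wald test can be computed directly from the estimate and its standard error. A sketch (the inputs are illustrative stand-ins for software output; the chi-square(1) p-value is obtained through the standard normal, since χ²(1) is the square of a standard normal):

```python
import math

def wald_test(b1_hat, se):
    """Return (z statistic, chi-square statistic, two-sided p-value)."""
    z = b1_hat / se
    x2 = z * z
    # Two-sided p-value: 2*P(Z >= |z|) using the error function.
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, x2, p
```

For example, an estimate of 1.0 with SE 0.5 gives z = 2, X² = 4, and a p-value of about 0.046, so the equivalence of the two forms is visible directly.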

  10. Odds Ratio
     Interpretation of the regression coefficient (β1):
     - In linear regression, the slope coefficient is the change in the mean response as x increases by 1 unit.
     - In logistic regression, we can show that:
       odds(x + 1) / odds(x) = e^β1,  where odds(x) = π(x) / (1 − π(x))
     Thus e^β1 represents the (multiplicative) change in the odds of the outcome when x increases by 1 unit.
     - If β1 = 0, the odds and probability are the same at all x levels (e^β1 = 1).
     - If β1 > 0, the odds and probability increase as x increases (e^β1 > 1).
     - If β1 < 0, the odds and probability decrease as x increases (e^β1 < 1).
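The identity odds(x + 1)/odds(x) = e^β1 can be checked numerically with arbitrary made-up coefficients:

```python
import math

def odds(x, b0, b1):
    # odds(x) = pi(x) / (1 - pi(x)) under the logistic model.
    p = 1 / (1 + math.exp(-(b0 + b1 * x)))
    return p / (1 - p)

b0, b1 = -0.7, 0.4   # illustrative coefficients, not from the slides
ratio = odds(3.0, b0, b1) / odds(2.0, b0, b1)
# ratio equals math.exp(b1) up to rounding, no matter which x we start from.
```

Because the ratio is the same at every x, e^β1 is a single-number summary of the predictor's effect, unlike the change in probability, which depends on where x sits on the curve.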

  11. 95% Confidence Interval for the Odds Ratio
     Step 1: Construct a 95% CI for β1:
       β̂1 ± 1.96 SE(β̂1) = (β̂1 − 1.96 SE(β̂1), β̂1 + 1.96 SE(β̂1))
     Step 2: Raise e ≈ 2.718 to the lower and upper bounds of the CI:
       (e^(β̂1 − 1.96 SE(β̂1)), e^(β̂1 + 1.96 SE(β̂1)))
     - If the entire interval is above 1, conclude a positive association.
     - If the entire interval is below 1, conclude a negative association.
     - If the interval contains 1, cannot conclude there is an association.
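The two steps translate directly into code; the estimate and SE here are illustrative stand-ins for fitted values:

```python
import math

def odds_ratio_ci(b1_hat, se, z=1.96):
    """Exponentiate the endpoints of the 95% CI for beta1."""
    lo, hi = b1_hat - z * se, b1_hat + z * se
    return math.exp(lo), math.exp(hi)

# Illustrative inputs: b1_hat = 0.5, SE = 0.2.
lo, hi = odds_ratio_ci(0.5, 0.2)
# Here the whole interval sits above 1 -> conclude a positive association.
```

Note the interval is built on the β scale first and exponentiated after, so it is asymmetric around e^β̂1.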

  12. Likelihood Ratio Test / Deviance
     The Wald test is like the t-test in linear regression; the likelihood ratio test is similar to the F-tests in linear regression.
     Goal: test H0: β1 = 0 versus HA: β1 ≠ 0 for the model
       π(x) = e^(β0 + β1x) / (1 + e^(β0 + β1x)),  i.e.  ln[π(x) / (1 − π(x))] = β0 + β1x
     Step 1: Fit the null model ln[π(x) / (1 − π(x))] = β0 and obtain the log-likelihood l0.
     Step 2: Fit the full model ln[π(x) / (1 − π(x))] = β0 + β1x and obtain the log-likelihood l1.
     LR test statistic: X²LR = −2(l0 − l1)   Rejection region: X²LR ≥ χ²(α; 1)   P-value: P(χ²(1) ≥ X²LR)
     The Deviance of a given model M is −2 times the difference between its log-likelihood and the log-likelihood of the saturated model:
       DM = −2(lM − ls),  where lM and ls are the log-likelihoods of model M and the saturated model
     - Null deviance: D0 = −2(l0 − ls)
     - Residual deviance: DM = −2(lM − ls)
     - Degrees of freedom: grouped data, m − #parameters; ungrouped data, n − #parameters.
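A sketch of the LR test from the two fitted log-likelihoods (illustrative values of the kind reported by fitting software), with the χ²(1) p-value again computed through the standard normal:

```python
import math

def lr_test(l0, l1):
    """X2_LR = -2*(l0 - l1), with a chi-square(1 df) p-value."""
    x2 = -2 * (l0 - l1)
    # P(chi-square(1) >= x2) = 1 - erf(sqrt(x2 / 2)).
    p = 1 - math.erf(math.sqrt(x2 / 2))
    return x2, p
```

With l0 = −10 and l1 = −8, the statistic is 4.0 and the p-value is about 0.046, matching the Wald example numerically even though the two tests need not agree in general.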

  13. Residuals and Goodness-of-Fit Statistics
     Pearson residuals:
       ePi = (yi − ni π̂i) / sqrt(ni π̂i (1 − π̂i)),  i = 1, ..., m
     Pearson's chi-square statistic:
       X²P = Σ_{i=1..m} (ePi)² = Σ (yi − ni π̂i)² / (ni π̂i (1 − π̂i))   Degrees of freedom: m − #parameters
     Deviance residuals (with ŷi = ni π̂i and 0 ln(0) ≡ 0):
       eDi = sgn(yi − ŷi) sqrt( 2 [ yi ln(yi / ŷi) + (ni − yi) ln((ni − yi) / (ni − ŷi)) ] )
     Deviance chi-square statistic (Residual Deviance in R):
       X²D = Σ_{i=1..m} (eDi)²   Degrees of freedom: m − #parameters
     Hosmer-Lemeshow test for ungrouped data:
     1) Group cases into g derived groups based on their predicted probabilities (often 10 groups are used).
     2) Let ni = size of group i, oi = # of successes in group i, π̄i = average predicted probability in group i.
       X²HL = Σ_{i=1..g} (oi − ni π̄i)² / (ni π̄i (1 − π̄i))   Degrees of freedom: g − 2
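Both residual types can be sketched for grouped binomial data; the counts and fitted probabilities below are made up for illustration:

```python
import math

def pearson_resid(y, n, p):
    """Pearson residual for a group with y successes out of n, fitted prob p."""
    return (y - n * p) / math.sqrt(n * p * (1 - p))

def deviance_resid(y, n, p):
    """Deviance residual, with the 0*ln(0) = 0 convention."""
    yhat = n * p
    t1 = y * math.log(y / yhat) if y > 0 else 0.0
    t2 = (n - y) * math.log((n - y) / (n - yhat)) if n - y > 0 else 0.0
    d2 = max(0.0, 2 * (t1 + t2))
    return math.copysign(math.sqrt(d2), y - yhat)
```

Summing the squared residuals over the m groups gives X²P and X²D respectively; when a group's observed count equals its fitted count, both residuals are exactly zero.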

  14. Overdispersion with Grouped Data
     When the Pearson chi-square statistic is large relative to its degrees of freedom, there is evidence that the variance of the group counts Yi is larger than expected under the binomial distribution. Adjustments can be made to the tests and confidence intervals for regression coefficients obtained from standard software packages:
       φ̂ = X²P / (m − #parameters)
       SE*(β̂) = sqrt(φ̂) SE(β̂)
       F* = [(D0 − DM) / (df0 − dfM)] / φ̂ ~ F(df0 − dfM, m − #parameters)

  15. Pseudo-R² Statistics
     McFadden: based on the log-likelihoods of the current (lM) and null (l0) models:
       R²McF = 1 − lM / l0   Adjusted version: 1 − (lM − #parameters) / l0
     Cox & Snell: based on the likelihoods for the current (LM) and null (L0) models:
       R²CS = 1 − (L0 / LM)^(2/m),  where L = e^l and l = log-likelihood (typically printed by software packages)
     Nagelkerke: similar to Cox & Snell, but can be as large as 1:
       R²N = [1 − (L0 / LM)^(2/m)] / [1 − L0^(2/m)]
     Efron: similar to the least squares regression method:
       R²E = 1 − Σ_{i=1..m} (yi − π̂i)² / Σ_{i=1..m} (yi − ȳ)²,  where ȳ = (1/m) Σ yi
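The likelihood-based measures can be computed straight from the null and model log-likelihoods. A sketch with illustrative log-likelihood values (the Cox & Snell and Nagelkerke forms work on the likelihood scale via L = e^l):

```python
import math

def pseudo_r2(l0, lm, m):
    """McFadden, Cox & Snell, and Nagelkerke pseudo-R^2 from log-likelihoods.

    l0, lm: null-model and current-model log-likelihoods; m: # observations.
    """
    mcfadden = 1 - lm / l0
    # (L0/LM)^(2/m) = exp(2*(l0 - lm)/m), computed on the log scale.
    cox_snell = 1 - math.exp(2 * (l0 - lm) / m)
    nagelkerke = cox_snell / (1 - math.exp(2 * l0 / m))
    return mcfadden, cox_snell, nagelkerke
```

Working on the log scale avoids exponentiating the raw likelihoods, which underflow for all but tiny data sets; Nagelkerke's rescaling guarantees the maximum attainable value is 1.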

  16. Multiple Logistic Regression
     Extension to more than one predictor variable (either numeric or dummy variables). With p predictors, the model is written:
       π = e^(β0 + β1x1 + ... + βpxp) / (1 + e^(β0 + β1x1 + ... + βpxp))
     Adjusted odds ratio for raising xi by 1 unit, holding all other predictors constant:
       ORi = e^βi
     Many models have nominal/ordinal predictors and make wide use of dummy variables.

  17. Testing Regression Coefficients
     Testing the overall model: H0: β1 = ... = βp = 0 vs HA: not all βi = 0
       TS: X²obs = −2(ln L0 − ln L1) = −2(l0 − l1)
       RR: X²obs ≥ χ²(α; p)   P-value: P(χ²(p) ≥ X²obs)
     L0 and L1 are the values of the maximized likelihood function under the null and full models, computed by statistical software packages. This logic can also be used to compare full and reduced models based on subsets of predictors. Testing for individual terms is done as in the model with a single predictor.

  18. Poisson Regression
     - Generally used to model count data.
     - Distribution: Poisson (restriction: E(Y) = V(Y)).
     - Link function: can be the identity link, but typically the log link is used:
       g(μ) = ln(μ) = β0 + β1x1 + ... + βpxp  ⇒  μ = e^(β0 + β1x1 + ... + βpxp)
     - Tests are conducted as in logistic regression (based on a different likelihood function).
     - When the mean and variance are not equal (overdispersion), the Poisson distribution is often replaced with the Negative Binomial distribution.

  19. Model and Estimation
     Yi ~ P(μi),  ln(μi) = β0 + β1xi1 + ... + βpxip  ⇒  μi = e^(β0 + β1xi1 + ... + βpxip)
     Likelihood function:
       L(β0, ..., βp) = Π_{i=1..n} p(yi) = Π_{i=1..n} e^(−μi) μi^yi / yi!
     Log-likelihood:
       l(β0, ..., βp) = ln L(β0, ..., βp) = Σ_{i=1..n} [yi ln(μi) − μi − ln(yi!)]
     Choose β̂0, ..., β̂p that maximize l(β0, ..., βp) at value
       l(β̂0, ..., β̂p) = Σ_{i=1..n} [yi(β̂0 + β̂1xi1 + ... + β̂pxip) − e^(β̂0 + β̂1xi1 + ... + β̂pxip) − ln(yi!)]
     Null model: μ̂ = ȳ:
       l0 = n ȳ ln(ȳ) − n ȳ − Σ ln(yi!)
     Saturated model: μ̂i = yi:
       ls = Σ yi ln(yi) − n ȳ − Σ ln(yi!)
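The Poisson log-likelihood above can be sketched for a single made-up predictor; `math.lgamma(y + 1)` supplies ln(y!) without overflow:

```python
import math

def poisson_loglik(xs, ys, b0, b1):
    """l(b0, b1) = sum of y_i*ln(mu_i) - mu_i - ln(y_i!), log-link Poisson."""
    l = 0.0
    for x, y in zip(xs, ys):
        mu = math.exp(b0 + b1 * x)
        l += y * math.log(mu) - mu - math.lgamma(y + 1)  # lgamma(y+1) = ln(y!)
    return l
```

As a quick check of the null-model result, an intercept-only fit is maximized at μ̂ = ȳ, i.e. at b0 = ln(ȳ): nudging the intercept away from that value lowers the log-likelihood.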

  20. Residuals and Goodness-of-Fit Statistics
     Pearson residuals (with μ̂i = e^(β̂0 + β̂1xi1 + ... + β̂pxip)):
       ePi = (yi − μ̂i) / sqrt(μ̂i),  i = 1, ..., n
     Pearson's chi-square statistic:
       X²P = Σ_{i=1..n} (ePi)² = Σ (yi − μ̂i)² / μ̂i   Degrees of freedom: n − #parameters
     Deviance residuals (with 0 ln(0) ≡ 0):
       eDi = sgn(yi − μ̂i) [ 2 ( yi ln(yi / μ̂i) − (yi − μ̂i) ) ]^(1/2)
     Deviance chi-square statistic (Residual Deviance in R):
       X²D = Σ_{i=1..n} (eDi)²   Degrees of freedom: n − #parameters
     Goodness-of-fit test for ungrouped data:
     1) Group cases into g derived groups based on their predicted values (often 10 groups are used).
     2) Let ni = size of group i, oi = # of outcomes in group i, μ̄i = average predicted value in group i.
       X²GOF = Σ_{i=1..g} (oi − ni μ̄i)² / (ni μ̄i)   Degrees of freedom: g − p′, where p′ = #parameters
