Binary Choice Models in Microeconometric Modeling


Explore binary choice models in microeconometric modeling through concepts such as random utility, maximum likelihood, probit, logit, and more. The central proposition of the utility-based approach is that observed choices partially reveal underlying preferences. The running example models a binary choice between two alternatives, whether or not to visit the doctor, based on net utility and observed data.

  • Binary Models
  • Microeconometrics
  • Modeling
  • Random Utility
  • Maximum Likelihood


Presentation Transcript


  1. 1/93: Topic 2.1 Binary Choice Models
     Microeconometric Modeling
     William Greene, Stern School of Business, New York University, New York, NY, USA
     2.1 Binary Choice Models

  2. 2/93: Topic 2.1 Binary Choice Models
     Concepts: Random Utility, Maximum Likelihood, Parametric Model, Partial Effect, Average Partial Effect, Odds Ratio, Linear Probability Model, Cluster Correction, Pseudo R squared, Likelihood Ratio, Wald, LM, Decomposition of Effect, Exclusion Restrictions, Incoherent Model, Nonparametric Regression, Klein and Spady Model
     Models: Probit, Logit, Bivariate Probit, Recursive Bivariate Probit, Multivariate Probit, Sample Selection, Panel Probit

  3. 3/93: Topic 2.1 Binary Choice Models Central Proposition: A Utility Based Approach
     Observed outcomes partially reveal underlying preferences.
     There exists an underlying preference scale defined over alternatives, U*(choices).
     Revelation of preferences between two choices labeled 0 and 1 reveals the ranking of the underlying utility:
     U*(choice 1) > U*(choice 0)  =>  Choose 1
     U*(choice 1) < U*(choice 0)  =>  Choose 0
     Net utility = U = U*(choice 1) - U*(choice 0);  U > 0 => choice 1

  4. 4/93: Topic 2.1 Binary Choice Models Binary Outcome: Visit Doctor
     In the 1984 wave of the GSOEP, 1611 of 3874 individuals visited the doctor at least once.

  5. 5/93: Topic 2.1 Binary Choice Models A Random Utility Model for the Binary Choice
     Yes or No decision: visit or not visit the doctor.
     Model: net utility of visiting at least once. Net utility depends on observables and unobservables.
     Random utility: Udoctor = net utility = U*visit - U*not visit
     Udoctor = β0 + β1 Age + β2 Income + β3 Sex + ε
     Choose to visit at least once if net utility is positive.
     Observed data: X = Age, Income, Sex;  y = 1 if choose visit (Udoctor > 0), 0 if not.

  6. 6/93: Topic 2.1 Binary Choice Models Modeling the Binary Choice Between the Two Alternatives
     Net utility: Udoctor = U*visit - U*not visit
     Udoctor = β0 + β1 Age + β2 Income + β3 Sex + ε
     Chooses to visit: Udoctor > 0
     β0 + β1 Age + β2 Income + β3 Sex + ε > 0
     Choosing to visit is a random outcome because of ε:
     ε > -(β0 + β1 Age + β2 Income + β3 Sex)

  7. 7/93: Topic 2.1 Binary Choice Models Probability Model for Choice Between Two Alternatives
     People with the same (Age, Income, Sex) will make different choices because ε is random.
     We can model the probability that the random event "visits the doctor" will occur. The probability is governed by ε, the random part of the utility function.
     Event DOCTOR = 1 occurs if ε > -(β0 + β1 Age + β2 Income + β3 Sex)
     We model the probability of this event.
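
For illustration, here is a minimal Python sketch of the random-utility mechanism just described: hypothetical (Age, Income, Sex) data are simulated, a random ε is drawn, and DOCTOR = 1 is recorded whenever net utility is positive. The coefficient values and distributions are arbitrary placeholders, not estimates from the GSOEP data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical covariates (placeholders, not the GSOEP data)
age = rng.uniform(25, 65, n)
income = rng.uniform(0.0, 1.0, n)
female = rng.integers(0, 2, n)

# Arbitrary utility parameters chosen for the illustration
b0, b1, b2, b3 = -1.0, 0.02, 0.5, 0.4

# Net utility U = b0 + b1*Age + b2*Income + b3*Sex + eps, with a logistic eps
eps = rng.logistic(0.0, 1.0, n)
u_net = b0 + b1 * age + b2 * income + b3 * female + eps

# Observed outcome: DOCTOR = 1 exactly when net utility is positive
doctor = (u_net > 0).astype(int)
print("Share choosing to visit:", doctor.mean())
```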

  8. 8/93: Topic 2.1 Binary Choice Models An Application
     27,326 observations in the GSOEP sample
     1 to 7 years, panel; 7,293 households observed
     We use the 1994 wave: 3,337 household observations

  9. 9/93: Topic 2.1 Binary Choice Models An Econometric Model
     Choose to visit iff Udoctor > 0
     Udoctor = β0 + β1 Age + β2 Income + β3 Sex + ε
     Udoctor > 0  <=>  ε > -(β0 + β1 Age + β2 Income + β3 Sex)
                  <=>  -ε < β0 + β1 Age + β2 Income + β3 Sex
     Probability model: for any person observed by the analyst,
     Prob(doctor = 1) = Prob(-ε < β0 + β1 Age + β2 Income + β3 Sex)
     Note the relationship between the unobserved ε and the observed outcome DOCTOR.

  10. 10/93: Topic 2.1 Binary Choice Models
     Index = β0 + β1 Age + β2 Income + β3 Sex
     Probability = a function of the Index: P(Doctor = 1) = F(Index)
     Internally consistent probabilities:
     (1) (Coherence)    0 < Probability < 1
     (2) (Monotonicity) Probability increases with the Index.

  11. 11/93: Topic 2.1 Binary Choice Models A Fully Parametric Model
     Index function: U = β′x + ε
     Observation mechanism: y = 1[U > 0]
     Distribution: ε ~ f(ε); Normal, Logistic, ...
     Maximum likelihood estimation: maxβ logL = Σi log Prob(Yi = yi | xi)
     We will focus on parametric models. We examine the linear probability model in passing.

  12. 12/93: Topic 2.1 Binary Choice Models Parametric Model Estimation
     How to estimate β0, β1, β2, β3? The technique of maximum likelihood.
     L = Prob[y = 0 | x]^(1-y) × Prob[y = 1 | x]^y
     Prob[doctor = 1] = Prob[ε > -(β0 + β1 Age + β2 Income + β3 Sex)]
     Prob[doctor = 0] = 1 - Prob[doctor = 1]
     Requires a model for the probability.
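
One way to see the maximum likelihood mechanics is to code the binary choice log likelihood directly and hand it to a numerical optimizer. The sketch below fits a logit to the simulated doctor data from the earlier snippet (the arrays age, income, female, doctor are carried over; for a probit, norm.cdf would replace the logistic CDF).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic CDF

# Reuses age, income, female, doctor from the simulation sketch above
X = np.column_stack([np.ones_like(age), age, income, female])
y = doctor

def neg_loglik(beta):
    p = np.clip(expit(X @ beta), 1e-12, 1 - 1e-12)   # Prob[y = 1 | x]
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(neg_loglik, x0=np.zeros(X.shape[1]), method="BFGS")
beta_hat = res.x
print("Estimated coefficients:", beta_hat)
print("Maximized log likelihood:", -res.fun)
```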

  13. 13/93: Topic 2.1 Binary Choice Models Completing the Model: F(ε)
     The distribution:
     Normal: PROBIT, natural for behavior
     Logistic: LOGIT, allows thicker tails
     Gompertz: EXTREME VALUE, asymmetric
     Others
     Does it matter? Yes: large differences in estimates. Not much: quantities of interest are more stable.

  14. 14/93: Topic 2.1 Binary Choice Models Estimated Binary Choice Models for Three Distributions Log-L(0) = log likelihood for a model that has only a constant term. Ignore the t ratios for now.

  15. 15/93: Topic 2.1 Binary Choice Models Partial Effects in Probability Models
     Prob[Outcome] = some F(β0 + β1 Income + ...)
     Partial effect = ∂F(β0 + β1 Income + ...)/∂x  (derivative)
     Partial effects are derivatives, and the result varies with the model:
     Logit:          ∂F(·)/∂x = β × Prob × (1 - Prob)
     Probit:         ∂F(·)/∂x = β × normal density
     Extreme Value:  ∂F(·)/∂x = β × Prob × (-log Prob)
     Scaling usually erases model differences.

  16. 16/93: Topic 2.1 Binary Choice Models Partial effect for the logit model
     Prob(doctor = 1) = Λ(β0 + β1 Age + β2 Income + β3 Sex)
                      = exp(β0 + β1 Age + β2 Income + β3 Sex) / [1 + exp(β0 + β1 Age + β2 Income + β3 Sex)]
     The derivative with respect to one of the variables, xk, is
     ∂Prob(doctor = 1)/∂xk = Λ(β′x) [1 - Λ(β′x)] βk
     (1) A multiple of the coefficient, not the coefficient itself.
     (2) A function of all of the coefficients and variables.
     (3) Evaluated using the data and model parts after the model is estimated.
     Similar computations apply for other models such as probit.
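
The derivative formula above translates directly into code: the partial effect of each variable is the coefficient scaled by Λ(β′x)[1 - Λ(β′x)], evaluated at chosen values of x. A minimal sketch, reusing X and beta_hat from the estimation snippet above:

```python
import numpy as np
from scipy.special import expit

# Partial effects of the logit model, evaluated at the sample means of x
x_bar = X.mean(axis=0)
lam = expit(x_bar @ beta_hat)                      # Lambda(beta'x) at the means
partial_at_means = lam * (1.0 - lam) * beta_hat    # a multiple of the coefficients
print("Partial effects at the means:", partial_at_means)
```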

  17. 17/93: Topic 2.1 Binary Choice Models Estimated Partial Effects for Three Models (Standard errors to be considered later)

  18. 18/93: Topic 2.1 Binary Choice Models Partial Effect for a Dummy Variable, Computed Using Means of Other Variables
     Prob[yi = 1 | xi, di] = F(β′xi + γdi), where d is a dummy variable such as Sex in our doctor model.
     For the probit model, Prob[yi = 1 | xi, di] = Φ(β′x + γd), Φ = the normal CDF.
     Partial effect of d:
     Prob[yi = 1 | x̄i, di = 1] - Prob[yi = 1 | x̄i, di = 0] = Φ(β′x̄ + γ) - Φ(β′x̄)
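
For a dummy regressor the partial effect is computed as a difference in probabilities rather than a derivative. A sketch continuing the simulated logit example (the female dummy is the last column of X):

```python
import numpy as np
from scipy.special import expit

x_bar = X.mean(axis=0)

# Evaluate the probability at the means with the dummy set to 1 and then to 0
x_d1, x_d0 = x_bar.copy(), x_bar.copy()
x_d1[3], x_d0[3] = 1.0, 0.0                  # column 3 of X is the female dummy

effect_dummy = expit(x_d1 @ beta_hat) - expit(x_d0 @ beta_hat)
print("Partial effect of the dummy at the means of the other variables:", effect_dummy)
```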

  19. 19/93: Topic 2.1 Binary Choice Models Partial Effect Dummy Variable

  20. 20/93: Topic 2.1 Binary Choice Models Computing Partial Effects
     Compute at the data means (PEA): simple; inference is well defined; not realistic for some variables, such as Sex.
     Average the individual effects (APE): more appropriate; asymptotic standard errors are slightly more complicated.

  21. 21/93: Topic 2.1 Binary Choice Models Partial Effects
     Probability: Pi = F(β′xi)
     Partial effect: di = ∂F(β′xi)/∂xi = f(β′xi) β
     Partial effect at the means: f(β′x̄) β
     Average partial effect: (1/n) Σi di = [(1/n) Σi f(β′xi)] β
     Both are estimates of δ = E[di] under certain assumptions.
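
The distinction between the partial effect at the means (PEA) and the average partial effect (APE) comes down to where the density f(β′x) is evaluated: once at x̄, or averaged over the sample. A sketch continuing the same simulated logit example:

```python
import numpy as np
from scipy.special import expit

lam_i = expit(X @ beta_hat)                  # Lambda(beta'x_i) for each observation
f_i = lam_i * (1.0 - lam_i)                  # logit density at each index value

# PEA: density evaluated once, at the means of the regressors
lam_bar = expit(X.mean(axis=0) @ beta_hat)
pea = lam_bar * (1.0 - lam_bar) * beta_hat

# APE: average the individual scale factors, then multiply the coefficients
ape = f_i.mean() * beta_hat

print("Partial effects at the means:", pea)
print("Average partial effects:    ", ape)
```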

  22. 22/93: Topic 2.1 Binary Choice Models
     The two approaches often give similar answers, though sometimes the results differ substantially.
     (Tables compared: Average Partial Effects vs. Partial Effects at Data Means.)

  23. 23/93: Topic 2.1 Binary Choice Models APE vs. Partial Effects at the Mean
     Delta method for the average partial effect:
     Est.Var[Average Partial Effect] = Ḡ Var[β̂] Ḡ′, where Ḡ = (1/N) Σi Gi and Gi = ∂(partial effect)i / ∂β̂′
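
One way to obtain delta-method standard errors for the APE is to differentiate the APE with respect to β numerically and sandwich an estimate of Var[β̂] between those gradients. A minimal sketch, continuing the simulated logit example, with Var[β̂] taken as the inverse of the logit Hessian; this ignores, as the slide does, the sampling variation in the x's themselves:

```python
import numpy as np
from scipy.special import expit

def ape(beta):
    """Average partial effects of the logit model as a function of beta."""
    lam = expit(X @ beta)
    return np.mean(lam * (1.0 - lam)) * beta

# Estimated Var[beta_hat]: inverse of the (expected = actual) logit Hessian
lam_hat = expit(X @ beta_hat)
V = np.linalg.inv((X * (lam_hat * (1.0 - lam_hat))[:, None]).T @ X)

# G = d APE / d beta', by central finite differences
k = len(beta_hat)
G = np.zeros((k, k))
h = 1e-6
for j in range(k):
    step = np.zeros(k)
    step[j] = h
    G[:, j] = (ape(beta_hat + step) - ape(beta_hat - step)) / (2.0 * h)

var_ape = G @ V @ G.T                        # delta method
print("APE:", ape(beta_hat))
print("Delta-method standard errors:", np.sqrt(np.diag(var_ape)))
```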

  24. 24/93: Topic 2.1 Binary Choice Models

  25. 25/93: Topic 2.1 Binary Choice Models

  26. 26/93: Topic 2.1 Binary Choice Models

  27. 27/93: Topic 2.1 Binary Choice Models How Well Does the Model Fit the Data?
     There is no R squared for a probability model:
     Least squares for linear models is computed to maximize R2.
     There are no residuals or sums of squares in a binary choice model.
     The model is not computed to optimize the fit of the model to the data.
     How can we measure the fit of the model to the data?
     Fit measures computed from the log likelihood: Pseudo R squared = 1 - logL/logL0, also called the likelihood ratio index.
     Direct assessment of the effectiveness of the model at predicting the outcome.

  28. 28/93: Topic 2.1 Binary Choice Models Pseudo R2 = Likelihood Ratio Index
     Pseudo R2 = 1 - (logL for the model) / (logL for a model with only a constant term)
     The prediction of the model is Fi = F(β̂′xi) = estimated Prob(y = 1 | xi).
     logL = Σi [ yi log F(β̂′xi) + (1 - yi) log(1 - F(β̂′xi)) ]
     Using only the constant term, logL0 = N [ ȳ log F(α̂) + (1 - ȳ) log(1 - F(α̂)) ], with F(α̂) = ȳ.
     Since log F(·) < 0 and log[1 - F(·)] < 0, logL0 < 0. The log likelihood for the model is larger, but also < 0.
     LRI = 1 - logL/logL0. Since logL0 < logL < 0, 0 < LRI < 1.
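
The likelihood ratio index needs only the two log likelihoods. A sketch, reusing y and the fitted optimizer result res from the estimation snippet above:

```python
import numpy as np

loglik_model = -res.fun                      # logL for the fitted model (sketch above)

# logL for a constant-only model: fitted probability equals the sample mean of y
ybar = y.mean()
loglik_0 = len(y) * (ybar * np.log(ybar) + (1.0 - ybar) * np.log(1.0 - ybar))

pseudo_r2 = 1.0 - loglik_model / loglik_0    # likelihood ratio index
print("Pseudo R-squared:", pseudo_r2)
```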

  29. 29/93: Topic 2.1 Binary Choice Models Fit Measures Based on Predictions
     Computation: use the model to compute predicted probabilities P = F(a + b1 Age + b2 Income + b3 Female + ...)
     Use a rule to compute predicted y = 0 or 1: predict y = 1 if P is "large enough"; generally use 0.5 (more likely than not): ŷ = 1 if P > P* = 0.5
     Fit measure compares predictions to actuals: count successes and failures.

  30. 30/93: Topic 2.1 Binary Choice Models Cramer Fit Measure
     F̂ = predicted probability
     Cramer measure = (Mean F̂ | y = 1) - (Mean F̂ | y = 0)
                    = (1/N1) Σi yi F̂i - (1/N0) Σi (1 - yi) F̂i
     (reward for correct predictions minus penalty for incorrect predictions)
     +----------------------------------------+
     | Fit Measures Based on Model Predictions|
     | Efron                          = .04825|
     | Ben Akiva and Lerman           = .57139|
     | Veall and Zimmerman            = .08365|
     | Cramer                         = .04771|
     +----------------------------------------+
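
Both the 0.5 prediction rule and the Cramer measure are easy to reproduce from the fitted probabilities. A sketch, again reusing X, y and beta_hat from the simulated example:

```python
import numpy as np
from scipy.special import expit

p_hat = expit(X @ beta_hat)                  # fitted probabilities F_i

# 0.5 rule: predict y = 1 when the fitted probability exceeds 0.5
y_pred = (p_hat > 0.5).astype(int)
print("Share predicted correctly:", (y_pred == y).mean())

# Cramer measure: mean fitted probability among the 1s minus among the 0s
cramer = p_hat[y == 1].mean() - p_hat[y == 0].mean()
print("Cramer fit measure:", cramer)
```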

  31. 31/93: Topic 2.1 Binary Choice Models Hypothesis Tests
     We consider nested models and parametric tests.
     Test statistics based on the usual 3 strategies:
     Wald statistics: use the unrestricted model.
     Likelihood ratio statistics: based on comparing the two models.
     Lagrange multiplier: based on the restricted model.
     Test statistics require the log likelihood and/or the first and second derivatives of logL.

  32. 32/93: Topic 2.1 Binary Choice Models
     Computing test statistics requires the log likelihood and/or standard errors based on the Hessian of logL.
     Logit: Λi = exp(β′xi)/[1 + exp(β′xi)]
       gi = yi - Λi   (gi is a "generalized residual")
       Hi = Λi (1 - Λi);  E[Hi] = Hi (does not involve yi)
     Probit: qi = 2yi - 1,  λi = qi φ(qi β′xi)/Φ(qi β′xi)
       gi = λi
       Hi = λi (λi + β′xi);  E[Hi] = φ(β′xi)^2 / [Φ(β′xi)(1 - Φ(β′xi))]
     Estimators: based on Hi, E[Hi] and gi, all functions evaluated at β̂:
       Actual Hessian:   Est.Asy.Var[β̂] = [Σi Hi xi xi′]^(-1)
       Expected Hessian: Est.Asy.Var[β̂] = [Σi E[Hi] xi xi′]^(-1)
       BHHH:             Est.Asy.Var[β̂] = [Σi gi^2 xi xi′]^(-1)
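
For the logit the actual and expected Hessians coincide, and BHHH uses the outer products of the generalized residuals times the regressors. A sketch, continuing the simulated logit example:

```python
import numpy as np
from scipy.special import expit

lam = expit(X @ beta_hat)
g = (y - lam)[:, None] * X                   # g_i x_i = (y_i - Lambda_i) x_i

# Actual (= expected) Hessian for the logit: -sum_i Lambda_i (1 - Lambda_i) x_i x_i'
H = -(X * (lam * (1.0 - lam))[:, None]).T @ X

var_hessian = np.linalg.inv(-H)              # based on the (expected) Hessian
var_bhhh = np.linalg.inv(g.T @ g)            # BHHH: outer products of the gradients

print("SEs (Hessian):", np.sqrt(np.diag(var_hessian)))
print("SEs (BHHH):   ", np.sqrt(np.diag(var_bhhh)))
```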

  33. 33/93: Topic 2.1 Binary Choice Models Robust Covariance Matrix
     (Robust to the model specification? Latent heterogeneity? Correlation across observations? Not always clear.)
     "Robust" covariance matrix: V = A B A
     A = negative inverse of the second derivatives matrix
       = [-Σi ∂²logProbi/∂β̂∂β̂′]^(-1)   (estimate of E[-∂²logL/∂β∂β′]^(-1))
     B = matrix sum of outer products of first derivatives
       = Σi (∂logProbi/∂β̂)(∂logProbi/∂β̂)′   (estimate of E[(∂logL/∂β)(∂logL/∂β)′])
     For a logit model,
       A = [Σi Pi (1 - Pi) xi xi′]^(-1)
       B = Σi (yi - Pi)^2 xi xi′ = Σi ei^2 xi xi′
     (Resembles the White estimator in the linear model case.)
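
The "robust" matrix is just the sandwich A B A built from the two pieces computed in the previous sketch (A from the Hessian, B from the outer products of the gradients):

```python
import numpy as np

# Reuses var_hessian and g from the covariance sketch above
A = var_hessian                              # negative inverse of the Hessian
B = g.T @ g                                  # sum of outer products of first derivatives
V_robust = A @ B @ A                         # sandwich ("robust") covariance matrix

print("Robust SEs:", np.sqrt(np.diag(V_robust)))
```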

  34. 34/93: Topic 2.1 Binary Choice Models Robust Covariance Matrix for Logit Model
     Doesn't change much. The model is well specified.
     --------+--------------------------------------------------------------------
             |                  Standard            Prob.       95% Confidence
       DOCTOR| Coefficient         Error        z   |z|>Z*          Interval
     --------+--------------------------------------------------------------------
             |Conventional Standard Errors
     Constant|    1.86428***      .67793      2.75   .0060      .53557   3.19299
          AGE|    -.10209***      .03056     -3.34   .0008     -.16199   -.04219
      AGE^2.0|     .00154***      .00034      4.56   .0000      .00088    .00220
       INCOME|     .51206         .74600       .69   .4925     -.95008   1.97420
             |Interaction AGE*INCOME
     _ntrct02|    -.01843         .01691     -1.09   .2756     -.05157    .01470
       FEMALE|     .65366***      .07588      8.61   .0000      .50494    .80237
     --------+--------------------------------------------------------------------
             |Robust Standard Errors
     Constant|    1.86428***      .68518      2.72   .0065      .52135   3.20721
          AGE|    -.10209***      .03118     -3.27   .0011     -.16321   -.04098
      AGE^2.0|     .00154***      .00035      4.44   .0000      .00086    .00222
       INCOME|     .51206         .75171       .68   .4958     -.96127   1.98539
             |Interaction AGE*INCOME
     _ntrct02|    -.01843         .01705     -1.08   .2796     -.05185    .01498
       FEMALE|     .65366***      .07594      8.61   .0000      .50483    .80249
     --------+--------------------------------------------------------------------

  35. 35/93: Topic 2.1 Binary Choice Models Base Model for Hypothesis Tests
     ----------------------------------------------------------------------
     Binary Logit Model for Binary Choice
     Dependent variable               DOCTOR
     Log likelihood function     -2085.92452
     Restricted log likelihood   -2169.26982
     Chi squared [ 5 d.f.]         166.69058
     Significance level               .00000
     McFadden Pseudo R-squared      .0384209
     Estimation based on N =   3377, K =   6
     Information Criteria: Normalization=1/N
                        Normalized   Unnormalized
     AIC                   1.23892     4183.84905
     --------+-------------------------------------------------------------
     Variable| Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
     --------+-------------------------------------------------------------
             |Characteristics in numerator of Prob[Y = 1]
     Constant|    1.86428***       .67793       2.750     .0060
          AGE|    -.10209***       .03056      -3.341     .0008     42.6266
        AGESQ|     .00154***       .00034       4.556     .0000     1951.22
       INCOME|     .51206          .74600        .686     .4925      .44476
      AGE_INC|    -.01843          .01691      -1.090     .2756     19.0288
       FEMALE|     .65366***       .07588       8.615     .0000      .46343
     --------+-------------------------------------------------------------
     H0: Age is not a significant determinant of Prob(Doctor = 1)
     H0: β2 = β3 = β5 = 0

  36. 36/93: Topic 2.1 Binary Choice Models Likelihood Ratio Test
     Null hypothesis restricts the parameter vector. Alternative relaxes the restriction.
     Test statistic: Chi-squared = 2 (logL|Unrestricted model - logL|Restrictions) > 0
     Degrees of freedom = number of restrictions.

  37. 37/93: Topic 2.1 Binary Choice Models LR Test of H0: β2 = β3 = β5 = 0
     UNRESTRICTED MODEL
     Binary Logit Model for Binary Choice
     Dependent variable               DOCTOR
     Log likelihood function     -2085.92452
     Restricted log likelihood   -2169.26982
     Chi squared [ 5 d.f.]         166.69058
     Significance level               .00000
     McFadden Pseudo R-squared      .0384209
     Estimation based on N =   3377, K =   6
     Information Criteria: Normalization=1/N
                        Normalized   Unnormalized
     AIC                   1.23892     4183.84905
     RESTRICTED MODEL
     Binary Logit Model for Binary Choice
     Dependent variable               DOCTOR
     Log likelihood function     -2124.06568
     Restricted log likelihood   -2169.26982
     Chi squared [ 2 d.f.]          90.40827
     Significance level               .00000
     McFadden Pseudo R-squared      .0208384
     Estimation based on N =   3377, K =   3
     Information Criteria: Normalization=1/N
                        Normalized   Unnormalized
     AIC                   1.25974     4254.13136
     Chi squared[3] = 2[-2085.92452 - (-2124.06568)] = 76.28232
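
The LR statistic is simply twice the difference of the two maximized log likelihoods, referred to a chi-squared distribution with as many degrees of freedom as restrictions. A sketch on the simulated data, testing the exclusion of age and income from the logit fitted earlier:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import chi2

def logit_loglik(Xmat):
    """Maximized logit log likelihood for a given regressor matrix."""
    def nll(b):
        p = np.clip(expit(Xmat @ b), 1e-12, 1 - 1e-12)
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    return -minimize(nll, np.zeros(Xmat.shape[1]), method="BFGS").fun

loglik_u = logit_loglik(X)                   # unrestricted: constant, age, income, female
loglik_r = logit_loglik(X[:, [0, 3]])        # restricted: age and income excluded

lr = 2.0 * (loglik_u - loglik_r)             # always positive
df = 2                                       # number of restrictions imposed
print("LR statistic:", lr, " p-value:", chi2.sf(lr, df))
```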

  38. 38/93: Topic 2.1 Binary Choice Models Wald Test of H0: β2 = β3 = β5 = 0
     Unrestricted parameter vector is estimated.
     Discrepancy: q = Rb - m is computed (or r(b, m) if nonlinear).
     Variance of the discrepancy is estimated: Var[q] = R V R′.
     Wald statistic is q′[Var(q)]^(-1) q = q′[R V R′]^(-1) q.
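
The Wald statistic uses only the unrestricted estimates. A sketch testing the same two exclusion restrictions on the simulated logit, reusing beta_hat and the covariance estimate var_hessian from the earlier sketches:

```python
import numpy as np
from scipy.stats import chi2

# H0: the age and income coefficients are zero (columns 1 and 2 of X)
R = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
m = np.zeros(2)

q = R @ beta_hat - m                         # discrepancy
var_q = R @ var_hessian @ R.T                # R V R'
wald = q @ np.linalg.solve(var_q, q)         # q' [R V R']^{-1} q

print("Wald statistic:", wald, " p-value:", chi2.sf(wald, len(m)))
```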

  39. 39/93: Topic 2.1 Binary Choice Models Lagrange Multiplier Test of H0: β2 = β3 = β5 = 0
     Restricted model is estimated.
     Derivatives of the unrestricted model and variances of the derivatives are computed at the restricted estimates.
     A Wald test of whether the derivatives are zero tests the restrictions.
     Usually hard to compute: difficult to program the derivatives and their variances.

  40. 40/93: Topic 2.1 Binary Choice Models LM Test for a Logit Model
     Compute b0 (subject to the restrictions, e.g., with zeros in the appropriate positions).
     Compute Pi(b0) for each observation.
     Compute ei(b0) = yi - Pi(b0).
     Compute gi(b0) = xi ei(b0) using the full xi vector.
     LM = [Σi gi(b0)]′ [Σi gi(b0) gi(b0)′]^(-1) [Σi gi(b0)]
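
The recipe above can be followed line by line: estimate the restricted coefficients, form the generalized residuals at those values, and build the score statistic from the full regressor matrix. A sketch, again for the simulated data and the same two exclusion restrictions (b0 has zeros in the age and income positions):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import chi2

# Step 1: restricted estimates (constant and female only), zeros in the other positions
Xr = X[:, [0, 3]]
def nll_r(b):
    p = np.clip(expit(Xr @ b), 1e-12, 1 - 1e-12)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
br = minimize(nll_r, np.zeros(2), method="BFGS").x
b0 = np.array([br[0], 0.0, 0.0, br[1]])      # restricted vector written in full positions

# Steps 2-4: P_i(b0), e_i(b0) = y_i - P_i(b0), g_i(b0) = x_i e_i(b0)
p0 = expit(X @ b0)
g = X * (y - p0)[:, None]

# Step 5: LM = [sum g_i]' [sum g_i g_i']^{-1} [sum g_i]
s = g.sum(axis=0)
lm = s @ np.linalg.solve(g.T @ g, s)
print("LM statistic:", lm, " p-value:", chi2.sf(lm, 2))
```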

  41. 41/93: Topic 2.1 Binary Choice Models

  42. 42/93: Topic 2.1 Binary Choice Models Application: Health Care Usage
     German Health Care Usage Data: 7,293 individuals, varying numbers of periods.
     Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set: there are altogether 27,326 observations. The number of observations per individual ranges from 1 to 7 (frequencies are: 1=1525, 2=1079, 3=825, 4=926, 5=1051, 6=1000, 7=887). The variable NUMOBS below tells how many observations there are for each person; it is repeated in each row of the data for the person.
     Variables in the file are:
     DOCTOR   = 1(Number of doctor visits > 0)
     HOSPITAL = 1(Number of hospital visits > 0)
     HSAT     = health satisfaction, coded 0 (low) - 10 (high)
     DOCVIS   = number of doctor visits in last three months
     HOSPVIS  = number of hospital visits in last calendar year
     PUBLIC   = insured in public health insurance = 1; otherwise = 0
     ADDON    = insured by add-on insurance = 1; otherwise = 0
     HHNINC   = household nominal monthly net income in German marks / 10000 (4 observations with income = 0 were dropped)
     HHKIDS   = children under age 16 in the household = 1; otherwise = 0
     EDUC     = years of schooling
     AGE      = age in years
     MARRIED  = marital status

  43. 43/93: Topic 2.1 Binary Choice Models The Bivariate Probit Model
     y1* = β1′x1 + ε1,  y1 = 1(y1* > 0)
     y2* = β2′x2 + ε2,  y2 = 1(y2* > 0)
     (ε1, ε2) ~ N[ (0, 0), (1, ρ; ρ, 1) ]
     The variables in x1 and x2 may be the same or different. There is no need for each equation to have its "own variable."
     (The equations can be fit one at a time. Use FIML for (1) efficiency and (2) to get the estimate of ρ.)

  44. 44/93: Topic 2.1 Binary Choice Models ML Estimation of the Bivariate Probit Model
     logL = Σi log Φ2[ (2yi1 - 1) β1′xi1, (2yi2 - 1) β2′xi2, (2yi1 - 1)(2yi2 - 1) ρ ]
          = Σi log Φ2[ qi1 β1′xi1, qi2 β2′xi2, qi1 qi2 ρ ]
     Note: qij = (2yij - 1) = -1 if yij = 0 and +1 if yij = 1.
     Φ2 = bivariate normal CDF; must be computed using quadrature.
     Maximized with respect to β1, β2 and ρ.
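
The bivariate probit log likelihood can be evaluated directly with a bivariate normal CDF routine; FIML estimation then hands this function to an optimizer. A minimal sketch of the log-likelihood evaluation, with hypothetical arrays X1, X2, y1, y2 and parameters beta1, beta2, rho (scipy's multivariate_normal supplies Φ2):

```python
import numpy as np
from scipy.stats import multivariate_normal

def bivariate_probit_loglik(beta1, beta2, rho, X1, X2, y1, y2):
    """log L = sum_i log Phi2(q_i1 b1'x_i1, q_i2 b2'x_i2, q_i1 q_i2 rho)."""
    q1 = 2.0 * y1 - 1.0
    q2 = 2.0 * y2 - 1.0
    w1 = q1 * (X1 @ beta1)
    w2 = q2 * (X2 @ beta2)
    r = q1 * q2 * rho
    loglik = 0.0
    for a, b, rr in zip(w1, w2, r):          # Phi2 evaluated observation by observation
        cov = np.array([[1.0, rr], [rr, 1.0]])
        loglik += np.log(multivariate_normal.cdf([a, b], mean=[0.0, 0.0], cov=cov))
    return loglik
```

In practice the negative of this function would be passed to an optimizer over (β1, β2, ρ), with ρ kept inside (-1, 1), for example by optimizing over atanh(ρ).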

  45. 45/93: Topic 2.1 Binary Choice Models Application to Health Care Data
     x1 = one, age, female, educ, married, working
     x2 = one, age, female, hhninc, hhkids
     BivariateProbit ; lhs = doctor, hospital ; rh1 = x1 ; rh2 = x2 ; marginal effects $

  46. 46/93: Topic 2.1 Binary Choice Models Parameter Estimates
     ----------------------------------------------------------------------
     FIML Estimates of Bivariate Probit Model
     Dependent variable          DOCTOR HOSPITAL
     Log likelihood function        -25323.63074
     Estimation based on N =  27326, K = 12
     --------+-------------------------------------------------------------
     Variable| Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
     --------+-------------------------------------------------------------
             |Index equation for DOCTOR
     Constant|    -.20664***       .05832      -3.543     .0004
          AGE|     .01402***       .00074      18.948     .0000     43.5257
       FEMALE|     .32453***       .01733      18.722     .0000      .47877
         EDUC|    -.01438***       .00342      -4.209     .0000     11.3206
      MARRIED|     .00224          .01856        .121     .9040      .75862
      WORKING|    -.08356***       .01891      -4.419     .0000      .67705
             |Index equation for HOSPITAL
     Constant|   -1.62738***       .05430     -29.972     .0000
          AGE|     .00509***       .00100       5.075     .0000     43.5257
       FEMALE|     .12143***       .02153       5.641     .0000      .47877
       HHNINC|    -.03147          .05452       -.577     .5638      .35208
       HHKIDS|    -.00505          .02387       -.212     .8323      .40273
             |Disturbance correlation
     RHO(1,2)|     .29611***       .01393      21.253     .0000
     --------+-------------------------------------------------------------

  47. 47/93: Topic 2.1 Binary Choice Models Marginal Effects
     What are the marginal effects? Effect of what on what? Two equation model: what is the conditional mean? Possible margins:
     Derivatives of the joint probability = Φ2(β1′xi1, β2′xi2, ρ)
     Partials of E[yij | xij] = Φ(βj′xij)  (univariate probability)
     Partials of E[yi1 | xi1, xi2, yi2 = 1] = P(yi1 = 1, yi2 = 1)/Prob[yi2 = 1]
     Note: marginal effects involve both sets of regressors. If there are common variables, there are two effects in the derivative that are added. (See Appendix for formulations.)

  48. 48/93: Topic 2.1 Binary Choice Models Marginal Effects: Decomposition

  49. 49/93: Topic 2.1 Binary Choice Models Direct Effects: Derivatives of E[y1|x1,x2,y2=1] wrt x1
     +-------------------------------------------+
     | Partial derivatives of E[y1|y2=1] with    |
     | respect to the vector of characteristics. |
     | They are computed at the means of the Xs. |
     | Effect shown is total of 4 parts above.   |
     | Estimate of E[y1|y2=1] = .819898          |
     | Observations used for means are All Obs.  |
     | These are the direct marginal effects.    |
     +-------------------------------------------+
     +---------+--------------+----------------+--------+---------+----------+
     |Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
     +---------+--------------+----------------+--------+---------+----------+
     AGE          .00382760      .00022088       17.329   .0000   43.5256898
     FEMALE       .08857260      .00519658       17.044   .0000    .47877479
     EDUC        -.00392413      .00093911       -4.179   .0000   11.3206310
     MARRIED      .00061108      .00506488         .121   .9040    .75861817
     WORKING     -.02280671      .00518908       -4.395   .0000    .67704750
     HHNINC       .000000        ......(Fixed Parameter).......    .35208362
     HHKIDS       .000000        ......(Fixed Parameter).......    .40273000

  50. 50/93: Topic 2.1 Binary Choice Models Indirect Effects: Derivatives of E[y1|x1,x2,y2=1] wrt x2
     +-------------------------------------------+
     | Partial derivatives of E[y1|y2=1] with    |
     | respect to the vector of characteristics. |
     | They are computed at the means of the Xs. |
     | Effect shown is total of 4 parts above.   |
     | Estimate of E[y1|y2=1] = .819898          |
     | Observations used for means are All Obs.  |
     | These are the indirect marginal effects.  |
     +-------------------------------------------+
     +---------+--------------+----------------+--------+---------+----------+
     |Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
     +---------+--------------+----------------+--------+---------+----------+
     AGE         -.00035034      .697563D-04      -5.022   .0000  43.5256898
     FEMALE      -.00835397      .00150062        -5.567   .0000   .47877479
     EDUC         .000000        ......(Fixed Parameter).......   11.3206310
     MARRIED      .000000        ......(Fixed Parameter).......    .75861817
     WORKING      .000000        ......(Fixed Parameter).......    .67704750
     HHNINC       .00216510      .00374879          .578   .5636   .35208362
     HHKIDS       .00034768      .00164160          .212   .8323   .40273000
