
Nonlinear Models in Econometric Panel Data Analysis
Explore the concepts of nonlinear models in econometric panel data analysis, including estimation theory, model definition, parameter estimation, and characteristics of estimators. Delve into the complexities of nonlinear modeling and the implications for understanding economic phenomena.
Presentation Transcript
Part 14: Nonlinear Models [1/80] Econometric Analysis of Panel Data William Greene Department of Economics University of South Florida
Part 14: Nonlinear Models [2/80] Nonlinear Models
Nonlinear Models
Estimation Theory for Nonlinear Models
  Estimators
  Properties
  M Estimation
    Nonlinear Least Squares
    Maximum Likelihood Estimation
    GMM Estimation
    Minimum Distance Estimation
    Minimum Chi-square Estimation
Computation: Nonlinear Optimization
  Nonlinear Least Squares
  Newton-like Algorithms; Gradient Methods
(Background: JW, Chapters 12-14; Greene, Chapters 12-14, App. E)
Part 14: Nonlinear Models [3/80] What is a Model?
Purely verbal description of a phenomenon
Unconditional characteristics of a population
Conditional moments: E[g(y)|x]: median, mean, variance, quantile, correlations, probabilities
Conditional probabilities and densities
Conditional means and regressions
Fully parametric and semiparametric specifications
Parametric specification: known up to a parameter
Parameter spaces
Conditional means: E[y|x] = m(x, β)
Part 14: Nonlinear Models [4/80] What is a Nonlinear Model?
Model: E[g(y)|x] = m(x, β)
Objective: learn about β from y, X; usually, estimate β
Linear model: closed form estimator, b = h(y, X)
Nonlinear model: any model that is not linear
  Not with respect to m(x, β): e.g., y = exp(β′x + ε)
  With respect to the estimator: implicitly defined, h(y, X, b) = 0; e.g., E[y|x] = exp(β′x)
Part 14: Nonlinear Models [5/80] What is an Estimator?
Point estimation: θ̂ = f(data | model)
Interval estimation: I(θ̂) = θ̂ ± sampling variability
Set estimation: some subset of R^K
Classical: θ̂ as above
Bayesian: θ̂ = E[θ | data, prior f(θ)] = expectation from the posterior density
  Interval: I(θ̂) = narrowest interval from the posterior density containing the specified probability (mass)
Part 14: Nonlinear Models [6/80] Parameters
Model parameters: features of the population
The true parameter(s): β
Example: f(y_i | x_i) = (1/θ_i) exp(−y_i/θ_i), θ_i = exp(β′x_i)
Model parameters: β
Conditional mean: E[y_i | x_i] = θ_i = exp(β′x_i)
Slopes of interest: δ_i = ∂E[y_i | x_i]/∂x_i = β exp(β′x_i) = β θ_i
Part 14: Nonlinear Models [7/80] M Estimation
Classical estimation method:
θ̂ = argmin (1/n) Σ_i q(data_i, θ)
Example: nonlinear least squares
θ̂ = argmin (1/n) Σ_i [y_i − E(y_i | x_i, θ)]²
Example: maximum likelihood
θ̂ = argmin −(1/n) Σ_i ln f(y_i | x_i, θ)
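Both criteria above can be minimized numerically. A minimal Python sketch (not from the slides: data are simulated, and all names are illustrative), fitting the exponential conditional-mean model both ways:

```python
import numpy as np
from scipy.optimize import minimize

# Simulated exponential-regression data: E[y|x] = exp(beta'x)
rng = np.random.default_rng(0)
n = 5000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -0.3])
y = rng.exponential(np.exp(x @ beta_true))

def q_nls(b):
    # M criterion for nonlinear least squares: (1/n) sum (y - m(x,b))^2
    return np.mean((y - np.exp(x @ b)) ** 2)

def q_mle(b):
    # M criterion for ML: -(1/n) sum ln f(y|x,b) for the exponential density
    # -ln f = ln(theta) + y/theta = b'x + y*exp(-b'x)
    return np.mean(x @ b + y * np.exp(-(x @ b)))

b_nls = minimize(q_nls, np.zeros(2)).x
b_mle = minimize(q_mle, np.zeros(2)).x
print(b_nls, b_mle)   # both should be near beta_true
```

Both M estimators are consistent for β here; they differ in efficiency, a point the later MLE slides return to.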
Part 14: Nonlinear Models [8/80] An Analogy Principle for M Estimation
The estimator θ̂ minimizes q̄ = (1/n) Σ_i q(data_i, θ)
Assumed (this is part of the model): the true parameter θ₀ minimizes q* = E[q(data_i, θ)]
E.g., E[(y − m(x, θ))²] is minimized at θ₀
The weak law of large numbers: q̄ = (1/n) Σ_i q(data_i, θ) →p q* = E[q(data_i, θ)]
Part 14: Nonlinear Models [9/80] Estimation
q̄ = (1/n) Σ_i q(data_i, θ) →p q* = E[q(data_i, θ)]
The estimator θ̂ minimizes q̄; the true parameter θ₀ minimizes q*
q̄ →p q*. Does this imply θ̂ →p θ₀? Yes, if ...
Part 14: Nonlinear Models [10/80] (1) The Parameters Are Identified
Uniqueness: if θ₁ ≠ θ₀, then m(x, θ₁) ≠ m(x, θ₀) for some x
Examples when this does not occur:
(1) Multicollinearity, generally
(2) Need for normalization: E[y|x] = m(β′x / σ)
(3) Indeterminacy: e.g., m(x, w, β) = β₁ + β₂x + β₃w^β₄ carries no information about β₄ when β₃ = 0
Part 14: Nonlinear Models [11/80] (2) Continuity of the Criterion
q(data_i, θ) is
(a) Continuous in θ for all data_i and all θ
(b) Continuously differentiable: first derivatives are also continuous
(c) Twice differentiable: second derivatives must be nonzero, though they need not be continuous functions of θ (e.g., linear LS)
Part 14: Nonlinear Models [12/80] Consistency
q̄ = (1/n) Σ_i q(data_i, θ) →p q* = E[q(data_i, θ)]
The estimator θ̂ minimizes q̄; the true parameter θ₀ minimizes q*
q̄ →p q*. Does this imply θ̂ →p θ₀?
Yes. Consistency follows from identification and continuity, with the other assumptions.
Part 14: Nonlinear Models [13/80] Asymptotic Normality of M Estimators
First order conditions: (1/n) Σ_i ∂q(data_i, θ̂)/∂θ̂ = 0, i.e.,
ḡ(data, θ̂) = (1/n) Σ_i g(data_i, θ̂) = 0, where g(data_i, θ) = ∂q(data_i, θ)/∂θ
For any θ, ḡ(data, θ) is the mean of a random sample. We apply the Lindeberg-Feller CLT to assert the limiting normal distribution of √n ḡ(data, θ₀).
This implies a limiting normal distribution of √n (θ̂ − θ₀): the limiting mean is 0, and the limiting variance is to be obtained. The asymptotic distribution is then obtained by the usual means. (Proof sketched in Appendix.)
Part 14: Nonlinear Models [14/80] Estimating the Asymptotic Variance
Asy.Var[θ̂] = [H̄(θ₀)]⁻¹ Var[ḡ(data, θ₀)] [H̄(θ₀)]⁻¹
Estimate H̄(θ₀) with (1/n) Σ_i ∂²q(data_i, θ̂)/∂θ̂ ∂θ̂′
Estimate Var[ḡ(data, θ₀)] with (1/n) (1/n) Σ_i g_i(θ̂) g_i(θ̂)′
E.g., if this is linear least squares: q_i = (1/2)(y_i − β′x_i)², estimated by (1/2)(y_i − b′x_i)²
(1/n) Σ_i ∂²q_i/∂θ̂ ∂θ̂′ = (X′X/n)
(1/n)(1/n) Σ_i g_i g_i′ = (1/n²) Σ_i e_i² x_i x_i′
The estimated asymptotic variance is the White estimator.
Part 14: Nonlinear Models [15/80] Nonlinear Least Squares: Gauss-Marquardt Algorithm
q_i = (1/2)(y_i − m(x_i, β))², half the squared deviation from the conditional mean
g_i = −(y_i − m(x_i, β)) x_i⁰, where x_i⁰ = ∂m(x_i, β)/∂β = 'pseudo-regressors'
H_i = (x_i⁰)(x_i⁰)′ − (y_i − m(x_i, β)) ∂x_i⁰/∂β′; E[H_i] = (x_i⁰)(x_i⁰)′
x_i⁰ is one row of the pseudo-regressor matrix X⁰
Algorithm: β(k+1) = β(k) + [X⁰′X⁰]⁻¹ X⁰′e⁰
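The iteration above is a linearized regression of the residuals on the pseudo-regressors. A Python sketch for the exponential conditional mean, where x_i⁰ = θ_i x_i (simulated data; names are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -0.3])
y = rng.exponential(np.exp(x @ beta_true))     # E[y|x] = exp(beta'x)

b = np.array([np.log(y.mean()), 0.0])          # start: a0 = ln(ybar), slopes 0
for it in range(100):
    m = np.exp(x @ b)                          # m(x_i, b)
    x0 = m[:, None] * x                        # pseudo-regressors dm/db
    e = y - m                                  # residuals e0
    step = np.linalg.solve(x0.T @ x0, x0.T @ e)   # [X0'X0]^-1 X0'e0
    b = b + step
    if np.max(np.abs(step)) < 1e-10:           # stop when the update is negligible
        break
print(b)   # should be near beta_true
```

Starting the intercept at ln(ȳ), as on the application slides, usually puts Gauss-Newton in the convergent region for this model.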
Part 14: Nonlinear Models [16/80] Application - Income
German Health Care Usage Data: 7,293 individuals, varying numbers of periods. Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals and 27,326 observations in total; the number of observations per individual ranges from 1 to 7 (frequencies: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987). The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice.
Variables used here:
HHNINC = household nominal monthly net income in German marks / 10,000 (4 observations with income = 0 were dropped)
HHKIDS = 1 if children under age 16 in the household; 0 otherwise
EDUC = years of schooling
AGE = age in years
Part 14: Nonlinear Models [17/80] Income Data
[Figure: kernel density estimate for INCOME (HHNINC), density plotted over income from 0 to 5]
Part 14: Nonlinear Models [18/80] Exponential Model
f(HHNINC_i | Age, Educ, Married) = (1/θ_i) exp(−HHNINC_i/θ_i)
E[HHNINC_i | Age, Educ, Married] = θ_i = exp(a₀ + a₁Educ + a₂Married + a₃Age)
Starting values for the iterations: E[y | nothing else] = exp(a₀)
Start: a₀ = ln(mean of HHNINC), a₁ = a₂ = a₃ = 0
Part 14: Nonlinear Models [19/80] Conventional Variance Estimator
Est.Var[b] = [Σ_i (y_i − m(x_i, b))² / (n − #parameters)] (X⁰′X⁰)⁻¹
Part 14: Nonlinear Models [20/80] Variance Estimator for the M Estimator
q_i = (1/2)[y_i − exp(β′x_i)]² = (1/2)(y_i − θ_i)²
g_i = −(y_i − θ_i) θ_i x_i = −e_i θ_i x_i
H_i ≈ θ_i² x_i x_i′; E[H_i] = θ_i² x_i x_i′
Estimator is [Σ_i θ̂_i² x_i x_i′]⁻¹ [Σ_i e_i² θ̂_i² x_i x_i′] [Σ_i θ̂_i² x_i x_i′]⁻¹
This is the White estimator. See JW, p. 359.
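The sandwich can be assembled directly from the NLS solution. A Python sketch on simulated data (names are illustrative), computing both this robust estimator and the conventional one from the previous slide:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 4000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -0.3])
y = rng.exponential(np.exp(x @ beta_true))        # Var[y|x] = theta^2: heteroskedastic

# NLS estimate of beta
b = minimize(lambda c: np.mean((y - np.exp(x @ c)) ** 2), np.zeros(2)).x

th = np.exp(x @ b)
x0 = th[:, None] * x                              # pseudo-regressors theta_i * x_i
e = y - th
H = x0.T @ x0                                     # sum theta_i^2 x_i x_i'
S = (x0 * e[:, None]).T @ (x0 * e[:, None])       # sum e_i^2 theta_i^2 x_i x_i'
Hinv = np.linalg.inv(H)
V_white = Hinv @ S @ Hinv                         # robust (White) estimator
V_conv = (e @ e) / (n - 2) * Hinv                 # conventional estimator
se_white = np.sqrt(np.diag(V_white))
se_conv = np.sqrt(np.diag(V_conv))
print(se_white, se_conv)
```

With exponential data the conditional variance is θ_i², so the two standard errors differ noticeably.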
Part 14: Nonlinear Models [21/80] Computing NLS
Reject; hhninc=0$
Calc ; b0=log(xbr(hhninc))$
Nlsq ; lhs = hhninc
     ; fcn = exp(a0+a1*educ+a2*married+a3*age)
     ; start = b0, 0, 0, 0
     ; labels = a0,a1,a2,a3$
Name ; x = one,educ,married,age$
Create; thetai = exp(x'b) ; ei = hhninc-thetai ; gi = ei*thetai ; hi = thetai*thetai$
Matrix; varM = <x'[hi] x> * x'[gi^2]x * <x'[hi] x> $
Matrix; stat(b,varm,x)$
Part 14: Nonlinear Models [22/80] Iterations
Convergence criterion: 'gradient' = e⁰′X⁰(X⁰′X⁰)⁻¹X⁰′e⁰
Part 14: Nonlinear Models [23/80] NLS Estimates with Different Variance Estimators
Part 14: Nonlinear Models [24/80] Hypothesis Tests for M Estimation
Null hypothesis: c(θ) = 0 for some set of J functions
(1) c(θ) continuous
(2) differentiable; Jacobian R(θ) = ∂c(θ)/∂θ′, J × K
(3) functionally independent: rank R(θ) = J
Example: linear restrictions, c(θ) = Rθ − q = 0
Part 14: Nonlinear Models [25/80] Wald Test
Wald: given θ̂, V = Est.Asy.Var[θ̂],
W = Wald distance = c(θ̂)′ {R(θ̂) V R(θ̂)′}⁻¹ c(θ̂) →d chi-squared[J]
Part 14: Nonlinear Models [26/80] Change in the Criterion Function
q̄ = (1/n) Σ_i q(data_i, θ) →p q* = E[q(data_i, θ)]
The estimator θ̂ minimizes q̄; the estimator θ̂₀ minimizes q̄ subject to the restrictions c(θ) = 0
q̄₀ ≥ q̄; 2n(q̄₀ − q̄) →d chi-squared[J]
Part 14: Nonlinear Models [27/80] Score Test
LM statistic: derivative of the objective function
Score vector = ḡ(data, θ) = (1/n) Σ_i ∂q(data_i, θ)/∂θ
Without restrictions, ḡ(data, θ̂) = 0
With the null hypothesis c(θ) = 0 imposed, ḡ(data, θ̂₀) is generally not equal to 0. Is it close, within sampling variability?
Wald distance: LM = [ḡ(data, θ̂₀)]′ {Var[ḡ(data, θ̂₀)]}⁻¹ [ḡ(data, θ̂₀)] →d chi-squared[J]
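The three statistics of the preceding slides can be computed side by side. A Python sketch for the exponential NLS model, testing that the slope is zero on simulated data where the slope is truly nonzero, so all three should reject (the change-in-criterion statistic is computed exactly as on the later application slide, as 2n(q̄₀ − q̄)):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

rng = np.random.default_rng(3)
n = 3000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.exponential(np.exp(x @ np.array([0.5, -0.3])))

qbar = lambda b: np.mean((y - np.exp(x @ b)) ** 2)   # NLS criterion
b_u = minimize(qbar, np.zeros(2)).x                  # unrestricted estimate
b_r = np.array([np.log(y.mean()), 0.0])              # restricted: slope = 0

def g_H_S(b):
    th = np.exp(x @ b)
    e = y - th
    g = -(e * th)[:, None] * x                       # g_i = -e_i theta_i x_i
    H = (th[:, None] * x).T @ (th[:, None] * x)      # sum theta_i^2 x_i x_i'
    return g, H, g.T @ g                             # S = sum g_i g_i'

# Wald: squared distance of the unrestricted slope from 0, robust variance
g, H, S = g_H_S(b_u)
V = np.linalg.inv(H) @ S @ np.linalg.inv(H)
wald = b_u[1] ** 2 / V[1, 1]

# Change in criterion: 2n(qbar0 - qbar)
cm = 2 * n * (qbar(b_r) - qbar(b_u))

# LM: score at the restricted estimate
g0, _, S0 = g_H_S(b_r)
gsum = g0.sum(axis=0)
lm = gsum @ np.linalg.inv(S0) @ gsum

crit = chi2.ppf(0.95, df=1)
print(wald, cm, lm, crit)
```

All three are asymptotically equivalent under the null; in finite samples their values differ, as the application slides below show for the income data.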
Part 14: Nonlinear Models [28/80] Exponential Model
f(HHNINC_i | Age, Educ, Married) = (1/θ_i) exp(−HHNINC_i/θ_i), θ_i = exp(a₀ + a₁Educ + a₂Married + a₃Age)
Test H₀: a₁ = a₂ = a₃ = 0
Part 14: Nonlinear Models [29/80] Wald Test
Calc ; b0=log(xbr(hhninc))$
Nlsq ; lhs = hhninc
     ; fcn = exp(a0+a1*educ+a2*married+a3*age)
     ; start = b0, 0, 0, 0
     ; labels = a0,a1,a2,a3$
Matrix ; List ; R = [0,1,0,0 / 0,0,1,0 / 0,0,0,1]
       ; c = R*b ; Vc = R*Varb*R' ; Wald = c'<VC>c $

Matrix R has 3 rows and 4 columns.
 0.00000  1.00000  0.00000  0.00000
 0.00000  0.00000  1.00000  0.00000
 0.00000  0.00000  0.00000  1.00000
Matrix C has 3 rows and 1 columns.
 0.05471
 0.23761
 0.00081
Matrix VC has 3 rows and 3 columns.
 .1053686D-05  .4530603D-06  .3649631D-07
 .4530603D-06  .5859546D-04 -.3565863D-06
 .3649631D-07 -.3565863D-06  .6940296D-07
Matrix WALD = 3627.17514
Part 14: Nonlinear Models [30/80] Change in Function
Calc ; b0 = log(xbr(hhninc)) $
Nlsq ; lhs = hhninc ; labels = a0,a1,a2,a3
     ; start = b0,0,0,0
     ; fcn = exp(a0+a1*educ+a2*married+a3*age)$
Calc ; qbar = sumsqdev/n $
Nlsq ; lhs = hhninc ; labels = a0,a1,a2,a3
     ; start = b0,0,0,0 ; fix = a1,a2,a3
     ; fcn = exp(a0+a1*educ+a2*married+a3*age)$
Calc ; qbar0 = sumsqdev/n $
Calc ; cm = 2*n*(qbar0 - qbar) $
(Sumsqdev = 763.767; Sumsqdev_0 = 854.682)
2n(q̄₀ − q̄) = 2(854.682 − 763.767) = 181.83
Part 14: Nonlinear Models [31/80] Constrained Estimation
(The unconstrained sum of squares was 763.767.)
Part 14: Nonlinear Models [32/80] LM Test
Function: q_i = (1/2)[y_i − exp(a₀ + a₁Educ + ...)]² = (1/2)e_i²
Derivative: g_i = −e_i θ_i x_i
LM statistic: LM = (Σ_i g_i)′ [Σ_i g_i g_i′]⁻¹ (Σ_i g_i)
All evaluated at a₀ = log(ȳ), a₁ = a₂ = a₃ = 0
Part 14: Nonlinear Models [33/80] LM Test
Namelist; x = one,educ,married,age$
Nlsq ; lhs = hhninc ; labels = a0,a1,a2,a3
     ; start = b0,0,0,0 ; fix = a1,a2,a3
     ; fcn = exp(a0+a1*educ+a2*married+a3*age)$
Create ; thetai = exp(x'b)$
Create ; ei = hhninc - thetai$
Create ; gi = ei*thetai ; gi2 = gi*gi $
Matrix ; list ; LM = gi'x * <x'[gi2]x> * x'gi $

Matrix LM = 1915.03286
Part 14: Nonlinear Models [34/80] Maximum Likelihood Estimation
Fully parametric estimation: the density of y_i is fully specified
The likelihood function = the joint density of the observed random variables
Example: density for the exponential model
f(y_i | x_i) = (1/θ_i) exp(−y_i/θ_i), θ_i = exp(β′x_i)
E[y_i | x_i] = θ_i, Var[y_i | x_i] = θ_i²
The NLS (M) estimator examined earlier operated only on E[y_i | x_i] = θ_i.
Part 14: Nonlinear Models [35/80] The Likelihood Function
Likelihood = f(y₁, ..., y_n | x₁, ..., x_n)
By independence, L(β | data) = Π_i (1/θ_i) exp(−y_i/θ_i), θ_i = exp(β′x_i)
The MLE, β̂_MLE, maximizes the likelihood function. lnL(β | data) is a monotonic function of L; therefore the MLE maximizes the log likelihood function
lnL(β | data) = Σ_i [−ln θ_i − y_i/θ_i]
Part 14: Nonlinear Models [36/80] Consistency and Asymptotic Normality of the MLE
Conditions are identical to those for M estimation
Terms in the proofs are the log density and its derivatives
Nothing new is needed: the law of large numbers and the Lindeberg-Feller central limit theorem apply to the derivatives of the log likelihood.
Part 14: Nonlinear Models [37/80] Asymptotic Variance of the MLE
Based on the results for M estimation:
Asy.Var[β̂_MLE] = {−E[Hessian]}⁻¹ {Var[first derivative]} {−E[Hessian]}⁻¹
= {−E[∂²lnL/∂β∂β′]}⁻¹ Var[∂lnL/∂β] {−E[∂²lnL/∂β∂β′]}⁻¹
Part 14: Nonlinear Models [38/80] The Information Matrix Equality
Fundamental result for MLE: the variance of the first derivative equals the negative of the expected second derivative, the information matrix:
Var[∂lnL/∂β] = −E[∂²lnL/∂β∂β′]
Therefore
Asy.Var[β̂_MLE] = {−E[∂²lnL/∂β∂β′]}⁻¹ Var[∂lnL/∂β] {−E[∂²lnL/∂β∂β′]}⁻¹ = {−E[∂²lnL/∂β∂β′]}⁻¹
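The equality can be checked by simulation for the exponential model of these slides: at the true β, the sample variance of the score g_i = ((y_i/θ_i) − 1)x_i should match the sample average of −H_i = (y_i/θ_i)x_i x_i′ (a Python sketch; data simulated, names illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.5, -0.3])
th = np.exp(x @ beta)
y = rng.exponential(th)

# ln f_i = -ln(theta_i) - y_i/theta_i with theta_i = exp(beta'x_i)
g = ((y / th) - 1)[:, None] * x             # score contributions g_i
var_g = g.T @ g / n                         # Var[g_i] (E[g_i] = 0 at the true beta)
neg_EH = (x * (y / th)[:, None]).T @ x / n  # average of -H_i = (y/theta) x_i x_i'
print(var_g, neg_EH)                        # both estimate E[x x'] for this model
```

Here both matrices converge to E[x x′], which is the identity for this simulated design, illustrating why the single inverse suffices when the density is correctly specified.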
Part 14: Nonlinear Models [39/80] Three Variance Estimators Negative inverse of expected second derivatives matrix. (Usually not known) Negative inverse of actual second derivatives matrix. Inverse of variance of first derivatives
Part 14: Nonlinear Models [40/80] Asymptotic Efficiency M estimator based on the conditional mean is semiparametric. Not necessarily efficient. MLE is fully parametric. It is efficient among all consistent and asymptotically normal estimators when the density is as specified. This is the Cramer-Rao bound. Note the implied comparison to nonlinear least squares for the exponential regression model.
Part 14: Nonlinear Models [41/80] Invariance
Useful property of MLE: if γ = g(β) is a continuous function of β, the MLE of γ is g(β̂_MLE)
E.g., in the exponential FE model, the MLE of θ_i = exp(−α_i) is exp(−α̂_i,MLE)
Part 14: Nonlinear Models [42/80] Log Likelihood Function
f(y_i | x_i) = (1/θ_i) exp(−y_i/θ_i), θ_i = exp(β′x_i)
L(β | data) = Π_i (1/θ_i) exp(−y_i/θ_i)
The MLE maximizes the likelihood function; logL(β | data) is a monotonic function, so the MLE, β̂_MLE, maximizes the log likelihood function
logL(β | data) = Σ_i [−log θ_i − y_i/θ_i]
Part 14: Nonlinear Models [43/80] Application: Exponential Regression - MLE and NLS
MLE assumes E[y|x] = exp(−β′x). Note the sign reversal.
Part 14: Nonlinear Models [44/80] Variance Estimators
lnL = Σ_i [−ln θ_i − y_i/θ_i], θ_i = exp(β′x_i)
∂lnL/∂β = Σ_i [(y_i/θ_i) − 1] x_i = Σ_i g_i
Note: E[y_i | x_i] = θ_i, so E[g_i] = 0
∂²lnL/∂β∂β′ = Σ_i H_i = −Σ_i (y_i/θ_i) x_i x_i′
Σ_i E[H_i] = −Σ_i x_i x_i′ = −X′X (known for this particular model)
Part 14: Nonlinear Models [45/80] Three Variance Estimators
Berndt-Hall-Hall-Hausman (BHHH): [Σ_i g_i g_i′]⁻¹ = [Σ_i ((y_i/θ_i) − 1)² x_i x_i′]⁻¹
Based on actual second derivatives: [−Σ_i H_i]⁻¹ = [Σ_i (y_i/θ_i) x_i x_i′]⁻¹
Based on expected second derivatives: [−Σ_i E[H_i]]⁻¹ = [Σ_i x_i x_i′]⁻¹ = (X′X)⁻¹
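A numeric comparison of the three matrices at the MLE, for simulated data from the exponential model (Python sketch; names illustrative). When the model is correctly specified, all three estimate the same asymptotic variance:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 5000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -0.3])
y = rng.exponential(np.exp(x @ beta_true))

# -(1/n) lnL for theta_i = exp(b'x_i): -ln f_i = b'x_i + y_i exp(-b'x_i)
negll = lambda b: np.mean(x @ b + y * np.exp(-(x @ b)))
b = minimize(negll, np.zeros(2)).x

r = y / np.exp(x @ b)                                        # y_i/theta_i at the MLE
V_bhhh = np.linalg.inv((x * ((r - 1) ** 2)[:, None]).T @ x)  # [sum g_i g_i']^-1
V_actual = np.linalg.inv((x * r[:, None]).T @ x)             # [sum (y/theta) x x']^-1
V_expected = np.linalg.inv(x.T @ x)                          # (X'X)^-1
print(np.diag(V_bhhh), np.diag(V_actual), np.diag(V_expected))
```

In finite samples the three differ somewhat (BHHH is typically the noisiest), but they agree to first order.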
Part 14: Nonlinear Models [46/80] Robust (?) Estimator
[−Σ_i H_i]⁻¹ [Σ_i g_i g_i′] [−Σ_i H_i]⁻¹
= [Σ_i (y_i/θ_i) x_i x_i′]⁻¹ [Σ_i ((y_i/θ_i) − 1)² x_i x_i′] [Σ_i (y_i/θ_i) x_i x_i′]⁻¹
CLUSTER: replace the center matrix with
Σ_c (Σ_{i in c} g_ic)(Σ_{i in c} g_ic)′ = Σ_c [Σ_{i in c} x_ic ((y_ic/θ_ic) − 1)][Σ_{i in c} x_ic ((y_ic/θ_ic) − 1)]′
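Clustering changes only the middle matrix: scores are summed within clusters before taking outer products. A Python sketch (the grouped data-generating process, with a shared multiplicative within-cluster shock, is an assumption for illustration only):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
C, T = 500, 5                                     # 500 clusters of 5 observations
cid = np.repeat(np.arange(C), T)                  # cluster id for each observation
n = C * T
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.5, -0.3])
u = rng.gamma(2.0, 0.5, size=C)[cid]              # shared cluster shock, E[u] = 1
y = rng.exponential(np.exp(x @ beta)) * u         # correlated within clusters

negll = lambda b: np.mean(x @ b + y * np.exp(-(x @ b)))
b = minimize(negll, np.zeros(2)).x                # pseudo-MLE, still consistent

r = y / np.exp(x @ b)
g = (r - 1)[:, None] * x                          # score contributions g_i
Hinv = np.linalg.inv((x * r[:, None]).T @ x)      # [-sum H_i]^-1

S_plain = g.T @ g                                 # sum g_i g_i'
gc = np.zeros((C, 2))
np.add.at(gc, cid, g)                             # sum scores within each cluster
S_clust = gc.T @ gc                               # sum_c (sum_i g_ic)(sum_i g_ic)'

V_plain = Hinv @ S_plain @ Hinv
V_clust = Hinv @ S_clust @ Hinv
print(np.sqrt(np.diag(V_plain)), np.sqrt(np.diag(V_clust)))
```

With positive within-cluster score correlation, the clustered variance is inflated relative to the observation-level sandwich, which parallels the comment on the next slide's output.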
Part 14: Nonlinear Models [47/80] Variance Estimators
Loglinear ; Lhs = hhninc ; Rhs = x ; Model = Exponential $
Create ; thetai=exp(x'b) ; hi=hhninc*thetai ; gi2=(hi-1)^2$
Matrix ; he=<x'x> ; ha=<x'[hi]x> ; bhhh=<x'[gi2]x>$
Matrix ; stat(b,ha) ; stat(b,he) ; stat(b,bhhh)$
Part 14: Nonlinear Models [48/80] Robust Standard Errors Exponential (Loglinear) Regression Model --------+-------------------------------------------------------------------- | Clustered Prob. 95% Confidence INCOME| Coefficient Std.Error z |z|>Z* Interval --------+-------------------------------------------------------------------- |Parameters in conditional mean function............................. Constant| -1.82539*** .02113 -86.37 .0000 -1.86681 -1.78397 EDUC| .05544*** .00126 43.90 .0000 .05296 .05791 MARRIED| .23666*** .00833 28.40 .0000 .22033 .25299 AGE| -.00087*** .00027 -3.20 .0014 -.00141 -.00034 --------+-------------------------------------------------------------------- NOTICE: The standard errors go down... Standard errors clustered on Fixed (27326 clusters)
Part 14: Nonlinear Models [49/80] Hypothesis Tests Trinity of tests for nested hypotheses Wald Likelihood ratio Lagrange multiplier All as defined for the M estimators
Part 14: Nonlinear Models [50/80] Example - Exponential vs. Gamma
Gamma distribution: f(y_i | x_i, λ, P) = [λ_i^P / Γ(P)] exp(−λ_i y_i) y_i^(P−1)
Exponential: P = 1; the gamma family with P > 1 generalizes it.
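Since the exponential is the gamma with P = 1, the two models can be compared with a likelihood ratio test. A Python sketch on simulated exponential data, so the test should not reject P = 1 (the parameterization λ_i = exp(−β′x_i) follows the sign-reversed MLE form used earlier; all names are illustrative):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln
from scipy.stats import chi2

rng = np.random.default_rng(7)
n = 3000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.5, -0.3])
y = rng.exponential(np.exp(x @ beta))            # exponential = gamma with P = 1

def negll_gamma(p):                              # ln f = P ln(lam) - ln G(P) - lam*y + (P-1) ln y
    b, P = p[:2], np.exp(p[2])                   # P > 0 enforced via exp
    lam = np.exp(-(x @ b))
    return -np.sum(P * np.log(lam) - gammaln(P) - lam * y + (P - 1) * np.log(y))

def negll_exp(b):                                # the gamma log likelihood at P = 1
    lam = np.exp(-(x @ b))
    return -np.sum(np.log(lam) - lam * y)

fit_g = minimize(negll_gamma, np.zeros(3))
fit_e = minimize(negll_exp, np.zeros(2))
lr = 2.0 * (fit_e.fun - fit_g.fun)               # LR statistic, ~chi2[1] under P = 1
P_hat = np.exp(fit_g.x[2])
print(P_hat, lr, chi2.ppf(0.95, 1))
```

Because the models are nested, the LR statistic is nonnegative up to optimizer tolerance, and here P̂ should be close to 1.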