Dear all,
I have a question about how to obtain the asymptotic variance of the maximum likelihood estimator of the error covariance matrix in a multivariate regression, in a simple i.i.d. setting.
In theory it should follow from the inverse of the Hessian or of the squared "score" (e.g., Hayashi, 2000, p. 475, Proposition 7.9; similar results appear in many other textbooks), but my calculation gives a result different from what Stata computes. I also found that my old edition of Greene's textbook (4th edition, p. 625, equation 15-55) gives a formula that differs from my calculation, and Stata's result differs from that formula as well.
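To fix ideas, here is a sketch of the setup I have in mind (the notation is mine): with i.i.d. errors \epsilon_i \sim N(0, \Sigma), the per-observation log likelihood is

\ell_i(\beta, \Sigma) = -\tfrac{1}{2}\ln|2\pi\Sigma| - \tfrac{1}{2}\epsilon_i'\Sigma^{-1}\epsilon_i,

and, by the information matrix equality, the asymptotic variance of the ML estimator should be obtainable either as (-E[\partial^2\ell_i/\partial\theta\,\partial\theta'])^{-1} or as (E[s_i s_i'])^{-1}, where s_i = \partial\ell_i/\partial\theta and \theta collects the mean parameters and the distinct elements of \Sigma.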
I wrote some code (please find it below) to compare the three in a bivariate regression. The differences between my result and Stata's are constant factors (2 and 4) that depend on which element of the variance matrix is involved, but not on the data. Greene's formula is closer to Stata's result, but its [2,2] element differs by a non-constant factor that depends on the data.
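Concretely, write \Sigma = \begin{pmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{12} & \sigma_{22}\end{pmatrix} and stack the distinct elements as (\sigma_{11}, \sigma_{12}, \sigma_{22})'. Transcribing the two matrices built in the code below (with n the number of observations): Greene's (15-55), as I implement it by deleting the duplicated row and column of \Sigma \otimes \Sigma, is

V_G = \frac{2}{n-2}\begin{pmatrix} \sigma_{11}^2 & \sigma_{11}\sigma_{12} & \sigma_{12}^2 \\ \sigma_{11}\sigma_{12} & \sigma_{11}\sigma_{22} & \sigma_{12}\sigma_{22} \\ \sigma_{12}^2 & \sigma_{12}\sigma_{22} & \sigma_{22}^2 \end{pmatrix},

while the formula I derived from the squared score (via a symbolic math tool) is

V_F = \frac{2}{n-2}\begin{pmatrix} \sigma_{11}^2 & 2\sigma_{11}\sigma_{12} & \sigma_{12}^2 \\ 2\sigma_{11}\sigma_{12} & 2\sigma_{11}\sigma_{22}+2\sigma_{12}^2 & 2\sigma_{12}\sigma_{22} \\ \sigma_{12}^2 & 2\sigma_{12}\sigma_{22} & \sigma_{22}^2 \end{pmatrix}.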
Could anyone help me find the exact formula that Stata uses, and show me how to derive it?
Any reference to a textbook or a paper would also be highly appreciated. Hayashi (2000) states that "We will not be concerned with the asymptotic normality of the ML estimator of \Omega_o", i.e., of the variance estimator. My edition of Greene's textbook notes only that "It would be rarely used, but if needed", gives no derivation, and the entire passage, including formula (15-55), is dropped from newer editions. The lack of detailed references is my key constraint.
I've spent more than a week trying to resolve this issue and would appreciate any reactions or comments.
Thank you!
Futoshi
Code:
// ml_variance
discard
clear all
clear mata
clear matrix
set more off

webuse m1gdp
qui su
local nobs = r(N)
replace ln_gdp = ln_gdp*100
replace ln_m1 = ln_m1*100

// Change data randomly to see how formulas work
gen double v1 = invnormal(uniform())
gen double v2 = invnormal(uniform())
replace ln_gdp = 100 + v1*1.5 + v2*3
replace ln_m1 = 100 - v1*1.0 + v2*2

// Basic var with two lags.
var d.ln_gdp d.ln_m1
mat Vx = e(Sigma)

// Use ml to implement the bivariate VAR
capture program drop my_var
program my_var
    version 14
    args lnfj mu1 mu2 lns1 lns2 s12
    tempvar res1 res2
    quietly gen double `res1' = $ML_y1 - `mu1'
    quietly gen double `res2' = $ML_y2 - `mu2'
    // Bivariate normal log density; variances are parameterized as exp(lns1), exp(lns2)
    quietly replace `lnfj' = -((`res1')^2/exp(`lns1') + ///
        (-2*((`s12')/sqrt(exp(`lns1')*exp(`lns2'))))*(`res1'*`res2')/sqrt(exp(`lns1')*exp(`lns2')) + ///
        (`res2')^2/exp(`lns2'))/(2*(1-((`s12')/sqrt(exp(`lns1')*exp(`lns2')))^2)) - ///
        ln(2*_pi*sqrt(exp(`lns1')*exp(`lns2')-(`s12')^2))
end

ml model lf my_var (d.ln_gdp=L.d.ln_gdp L2.d.ln_gdp L.d.ln_m1 L2.d.ln_m1) ///
    (d.ln_m1=L.d.ln_gdp L2.d.ln_gdp L.d.ln_m1 L2.d.ln_m1) /lns1 /lns2 /s12, ///
    diparm(lns1, exp label("s1")) diparm(lns2, exp label("s2"))
ml maximize

// Recover (s1, s12, s2) from (lns1, s12, lns2); the delta-method VCE is posted to e(V)
nlcom (exp(_b[lns1:_cons])) ///
      (_b[s12:_cons]) ///
      (exp(_b[lns2:_cons])) ///
      , post
mat Vsmln = e(V)

// Greene's formula (4th edition, p.625, equation 15-55)
mat Vx_kron_Vx = Vx#Vx
mat Vsg4 = Vx_kron_Vx/(`nobs'-2)*2
mat Vsg = (Vsg4[1..2, 1..2], Vsg4[1..2, 4] \ Vsg4[4, 1..2], Vsg4[4,4])

// A formula derived from the inverse of the squared "score" evaluated at the
// true parameter values, calculated in a symbolic math tool.
mat Vsf = ( Vx_kron_Vx[1, 1],   2*Vx_kron_Vx[1, 2],                    Vx_kron_Vx[1, 4] \ ///
            2*Vx_kron_Vx[2, 1], 2*Vx_kron_Vx[2, 2]+2*Vx_kron_Vx[1, 4], 2*Vx_kron_Vx[2, 4] \ ///
            Vx_kron_Vx[4, 1],   2*Vx_kron_Vx[4, 2],                    Vx_kron_Vx[4, 4] )
mat Vsf = Vsf/(`nobs'-2)*2

mat list Vsmln
mat list Vsg
mat list Vsf
di `nobs'

// Ratios between Greene's formula and Stata results
di Vsg[1,1]/Vsmln[1,1]
di Vsg[2,1]/Vsmln[2,1]
di Vsg[2,2]/Vsmln[2,2]
di Vsg[3,1]/Vsmln[3,1]
di Vsg[3,2]/Vsmln[3,2]
di Vsg[3,3]/Vsmln[3,3]

// Ratios between a formula derived by me and Stata results
di Vsf[1,1]/Vsmln[1,1]
di Vsf[2,1]/Vsmln[2,1]
di Vsf[2,2]/Vsmln[2,2]
di Vsf[3,1]/Vsmln[3,1]
di Vsf[3,2]/Vsmln[3,2]
di Vsf[3,3]/Vsmln[3,3]

exit
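For anyone checking the my_var evaluator above: it codes the standard bivariate normal log density. Writing \sigma_{11} = exp(lns1), \sigma_{22} = exp(lns2), \sigma_{12} = s12, and \rho = \sigma_{12}/\sqrt{\sigma_{11}\sigma_{22}}, the per-observation log likelihood it evaluates is

\ell_j = -\frac{1}{2(1-\rho^2)}\left(\frac{r_1^2}{\sigma_{11}} - \frac{2\rho\, r_1 r_2}{\sqrt{\sigma_{11}\sigma_{22}}} + \frac{r_2^2}{\sigma_{22}}\right) - \ln\!\left(2\pi\sqrt{\sigma_{11}\sigma_{22} - \sigma_{12}^2}\right),

where r_1 and r_2 are the residuals res1 and res2.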