Spatdiag & Estat Moran Conflicting results

Harry O'Rahilly

Join Date: Feb 2018

Posts: 12
#1

Spatdiag & Estat Moran Conflicting results

12 Jun 2023, 07:14

Hello fellow stata users,

I am conducting a battery of tests on spatial data.

Two of which are:
spatdiag, authored by Maurizio Pisati

estat moran, from SSC

Having run these on my data the results for spatdiag Spatial error test for the Lagrange Multiplier produces the same result as does the estat moran test for Moran's I.

The result for the spatdiag the Spatial error test for Moran's I statistic is not the same as the estat moran test for Moran's I.

On closer inspection, the method and formula for estat moran is found here https://www.stata.com/manuals/spestatmoran.pdf on page 5, snapshot below:

Equation1

The Lagrange Multiplier Error formula is given by Anselin L. et al. (Regional Science and Urban Economics 26 (1996) 77-104) as being

Equation 2

Where T is term denoted in the denominator "tr" above in Equation 1 relating to Moran's I.
These are the same formula.

On the other hand, Moran's I is given as being as follows by Anselin, L. (Spatial econometrics methods and models (Studies in operational regional science) DOI 10.1007/978-94-015-7799-1)

Equation 3

This is not the same as Equation 1 for Moran's I. It could be the case that Pisati's code for Moran's I is based on Equation 3.

My questions are therefore:
Am I correct that the estat moran test for Moran's I is actually an LM error test?

If so, why is this the case?

If so, what are the implications?

If so, is Pisati's code under spatdiag the correct formulation for obtaining a Moran's I statistic

I would be grateful for your help. Thank you all,

Harry
Tags: Moran's I, spatdiag, spatial econometric
Harry O'Rahilly

Join Date: Feb 2018

Posts: 12
#2

12 Jun 2023, 15:27

In case the images non-visible

Estat Moran

The Lagrange error formula

Anselin's Moran's I Statistic formula
Comment

Harry O'Rahilly

Join Date: Feb 2018
Posts: 12

17 Jun 2023, 10:03

Hello

It may be of additional help in understanding my query to see code and results below.
I make use of the Texas data set provided in https://www.stata.com/manuals/sp.pdf

As you can see each post estimation test requires its own matrix with which to conduct calculations.

As raised in the first post of this chain, any help or observations on why Estat Moran and Spatdiag produce different Moran's I statistics (yet the Langrage Error in Spatdiag is the same as the Moran's I in Estat Moran) would be helpful.

Thank you

Harry

DO File

Code:

spshape2dta tl_2016_us_county
use    tl_2016_us_county, clear

generate long    fips = real(STATEFP + COUNTYFP)
bysort fips:   assert _N==1
assert fips != .
spset fips,   modify replace
 
spset, modify   coordsys(latlong, miles)
 
save, replace
 
//copy https://www.stata-press.com/data/r18/texas_ue.dta
 
use texas_ue, clear
 
merge 1:1 fips   using tl_2016_us_county
 
keep if _merge==3
 
drop _merge
 
rename NAME countyname
 
drop STATEFP COUNTYFP COUNTYNS GEOID
drop NAMELSAD LSAD CLASSFP MTFCC CSAFP
drop CBSAFP METDIVFP FUNCSTAT
drop ALAND AWATER INTPTLAT INTPTLON
save, replace
 
spmatrix create contiguity W
 
 
spmatrix export W                        using W_1_TX.csv, replace
 
 
clear
import delimited "C:\VV\W_1_TX.csv", delimiter(" ") varnames(1)
 
drop v1
save "W_1_TX", replace
 
spatwmat            using W_1_TX.dta, name (W_1_TX_1) 
 
use texas_ue, clear
 
regress unemployment college
 
estat moran, errorlag(W)
spatdiag, weights (W_1_TX_1)

¶

OUTPUT

Code:

. regress unemployment college

      Source |       SS           df       MS      Number of obs   =       254
-------------+----------------------------------   F(1, 252)       =     57.92
       Model |  139.314746         1  139.314746   Prob > F        =    0.0000
    Residual |  606.129539       252  2.40527595   R-squared       =    0.1869
-------------+----------------------------------   Adj R-squared   =    0.1837
       Total |  745.444285       253   2.9464201   Root MSE        =    1.5509

------------------------------------------------------------------------------
unemployment |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     college |  -.1008791   .0132552    -7.61   0.000    -.1269842   -.0747741
       _cons |   6.542796   .2571722    25.44   0.000     6.036316    7.049277
------------------------------------------------------------------------------

.  estat moran, errorlag(W)

Moran test for spatial dependence
         Ho: error is i.i.d. 
         Errorlags:  W

         chi2(1)      =    94.06
         Prob > chi2  =   0.0000

. 
end of do-file

. do "C:\VV\STD4ce0_000000.tmp"

. spatdiag,                               weights                 (W_1_TX_1) 


Diagnostic tests for spatial dependence in OLS regression


Fitted model
------------------------------------------------------------
unemployment = college
------------------------------------------------------------

Weights matrix
------------------------------------------------------------
Name: W_1_TX_1
Type: Imported (non-binary)
Row-standardized: No
------------------------------------------------------------

Diagnostics
------------------------------------------------------------
Test                           |  Statistic    df   p-value
-------------------------------+----------------------------
Spatial error:                 |
  Moran's I                    |    11.488      1    0.000
  Lagrange multiplier          |    94.057      1    0.000
  Robust Lagrange multiplier   |    75.283      1    0.000
                               |
Spatial lag:                   |
  Lagrange multiplier          |    21.331      1    0.000
  Robust Lagrange multiplier   |     2.557      1    0.110
------------------------------------------------------------

Comment

Sebastian Kripfganz

Join Date: May 2014

Posts: 2562
#4

19 Jun 2023, 04:13

The Burridge (1980) LM formula (equation (9) in Anselin et al. (1996)) equals the square of the formula in the Stata manual. Note that T = tr((W'=W)*W).

Taking the square seems a bit odd because the numerator is already a quadratic form. I would trust that the official estat moran command does what it claims in the documentation. A definite answer can only be given by coding the formulas yourself (which requires a bit of effort but should not be too difficult). If you find an answer, I would be interested in hearing about it.

https://www.kripfganz.de/stata/
Comment
Harry O'Rahilly

Join Date: Feb 2018

Posts: 12
#5

19 Jun 2023, 15:46

Hello Sebastian

Thank you for your reply.

I see, so LM [Burridge] = Moran's I²[Stata Manual]

I'll pursue this further, but if you you are in a position to offer an explanation as to why in the example above Moran's I is 11.488 (in Spatdiag) and 94.06 in (Estat Moran) I would be grateful.
Likewise, I'll keep you posted on how I get on.

Harry
Comment
Alexander Koplenig

Join Date: Jul 2014

Posts: 39
#6

13 Feb 2024, 08:47

Right after the formula, the Stata manual says that

Under appropriate assumptions, it follows from Kelejian and Prucha (2001) that I ∼ N(0, 1)
and I² ∼ χ²(1)

estat moran reports the χ² statistic. So apart from a scaling factor, the assumption that

LM [Burridge] = Moran's I²[Stata Manual]

is correct, see Section 5 of Anselin, Luc. 2003. Spatial Econometrics. In Badi H. Baltagi (ed.), A Companion to Theoretical Econometrics, 310–330. 1st edn. Wiley. https://doi.org/10.1002/9780470996249.ch15.

Ali
Comment

Alexander Koplenig

Join Date: Jul 2014
Posts: 39

14 Feb 2024, 02:47

On closer inspection, and after reading this: https://www.msarrias.com/uploads/3/7...9/lecture6.pdf, it's more accurate to say that Stata reports the square of the modified Moran's I, as suggested by Kelijan and Prucha (2001):

Code:

 . *Generate example data
. *--------------------------------------------
. /*
> # Example y
> y = [100, 100, -100, -100, 0, 0]
>
> # Example spatial weight matrix
> W = ([[0, 1, 0, 0, 0, 0],
>               [1, 0, 0, 0, 0, 0],
>               [0, 0, 0, 1, 0, 0],
>               [0, 0, 1, 0, 0, 0],
>               [0, 0, 0, 0, 0, 1],
>               [0, 0, 0, 0, 1, 0]])
>
>
> */
. *--------------------------------------------
. clear
 
. set obs 6
Number of observations (_N) was 0, now 6.
 
.
. gen ID = _n
 
.
. gen y = 100
 
. replace y = 100 in 2
(0 real changes made)
 
. replace y = -100 in 3
(1 real change made)
 
. replace y = -100 in 4
(1 real change made)
 
. replace y = 0 in 5
(1 real change made)
 
. replace y = 0 in 6
(1 real change made)
 
.
.
. gen w1 = 0
 
. replace w1 = 1 in 2
(1 real change made)
 
. replace w1 = 0 in 3
(0 real changes made)
 
. replace w1 = 0 in 4
(0 real changes made)
 
. replace w1 = 0 in 5
(0 real changes made)
 
. replace w1 = 0 in 6
(0 real changes made)
 
.
. gen w2 = 1
 
. replace w2 = 0 in 2
(1 real change made)
 
. replace w2 = 0 in 3
(1 real change made)
 
. replace w2 = 0 in 4
(1 real change made)
 
. replace w2 = 0 in 5
(1 real change made)
 
. replace w2 = 0 in 6
(1 real change made)
 
.
. gen w3 = 0
 
. replace w3 = 0 in 2
(0 real changes made)
 
. replace w3 = 0 in 3
(0 real changes made)
 
. replace w3 = 1 in 4
(1 real change made)
 
. replace w3 = 0 in 5
(0 real changes made)
 
. replace w3 = 0 in 6
(0 real changes made)
 
.
. gen w4 = 0
 
. replace w4 = 0 in 2
(0 real changes made)
 
. replace w4 = 1 in 3
(1 real change made)
 
. replace w4 = 0 in 4
(0 real changes made)
 
. replace w4 = 0 in 5
(0 real changes made)
 
. replace w4 = 0 in 6
(0 real changes made)
 
.
. gen w5 = 0
 
. replace w5 = 0 in 2
(0 real changes made)
 
. replace w5 = 0 in 3
(0 real changes made)
 
. replace w5 = 0 in 4
(0 real changes made)
 
. replace w5 = 0 in 5
(0 real changes made)
 
. replace w5 = 1 in 6
(1 real change made)
 
.
. gen w6 = 0
 
. replace w6 = 0 in 2
(0 real changes made)
 
. replace w6 = 0 in 3
(0 real changes made)
 
. replace w6 = 0 in 4
(0 real changes made)
 
. replace w6 = 1 in 5
(1 real change made)
 
. replace w6 = 0 in 6
(0 real changes made)
 
.
. *--------------------------------------------
. *Spset + generate spatial weight matrix
. *--------------------------------------------
. spset ID
 
      Sp dataset: <current>
Linked shapefile: <none>
            Data: Cross sectional
 Spatial-unit ID: _ID (equal to ID)
     Coordinates: <none>
 
.
. spmatrix fromdata W = w*, normalize(none)  replace
 
.
. regress y
 
      Source |       SS           df       MS      Number of obs   =         6
-------------+----------------------------------   F(0, 5)         =      0.00
       Model |           0         0           .   Prob > F        =         .
    Residual |       40000         5        8000   R-squared       =    0.0000
-------------+----------------------------------   Adj R-squared   =    0.0000
       Total |       40000         5        8000   Root MSE        =    89.443
 
------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |          0   36.51484     0.00   1.000    -93.86438    93.86438
------------------------------------------------------------------------------
 
.
. predict residuals, resid
 
.
. tempfile TEMP
 
. save `TEMP', replace
(file /tmp/St97545.000001 not found)
file /tmp/St97545.000001 saved as .dta format
 
.
. *--------------------------------------------
. **Compute in Mata:
. *original Moran's I
. *modified Moran's I due to Kelejian and Prucha (2001)
. *--------------------------------------------
.
. spmatrix matafromsp Cmat v = W
 
. putmata residuals, replace
(1 vector posted)
 
.
. *Original I = (N / (sum_i sum_j Wij)) * (residuals' * W * residuals) / (residuals' * residuals)
. *Kelejian and Prucha (2001) I = (u'*W*u)/(sigma^2*[tr{(W'+W)*W}]^.5) where u is the residuals, W is the spatial weight matrix, w is the residuals, and sigma^2 = u'u/n
. *Kelejian and Prucha (2001) I = (residuals' * W * residuals)/ [(residuals' * residuals/length(n) * sqrt(trace((W' + W) * W))]
. mata {
>     n = length(residuals)
>     res_tr_res = residuals' * residuals
>    
>     I_original = n / (sum(Cmat)) * (residuals' * Cmat * residuals) / res_tr_res
>     I_KP = (residuals' * Cmat * residuals) / ((res_tr_res/n)* sqrt(trace((Cmat'+ Cmat) * Cmat)))
>
>     st_numscalar("I_original", I_original)
>     st_numscalar("I_KP", I_KP)
>
> }
 
. *--------------------------------------------
. *Estimate spatial diagnostics
. *--------------------------------------------
. use `TEMP', clear
 
.
. regress y
 
      Source |       SS           df       MS      Number of obs   =         6
-------------+----------------------------------   F(0, 5)         =      0.00
       Model |           0         0           .   Prob > F        =         .
    Residual |       40000         5        8000   R-squared       =    0.0000
-------------+----------------------------------   Adj R-squared   =    0.0000
       Total |       40000         5        8000   Root MSE        =    89.443
 
------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |          0   36.51484     0.00   1.000    -93.86438    93.86438
------------------------------------------------------------------------------
 
.
. *In Stata
. estat moran, errorlag(W)
 
Moran test for spatial dependence
         H0: Error terms are i.i.d.
         Errorlags:  W
 
         chi2(1)      =     3.00
         Prob > chi2  =   0.0833
 
.
. *From Mata:
. *Original I
. display scalar(I_original)
1
 
. *Kelejian and Prucha (2001) I
. display scalar(I_KP)
1.7320508
 
. *I_KP squared
. display scalar(I_KP)^2
3
. *chi2(1) test
. display 1-chi2(1,scalar(I_KP)^2)
.08326452

Best,
Ali

Last edited by Alexander Koplenig; 14 Feb 2024, 02:51.

Comment

Harry O'Rahilly

Join Date: Feb 2018

Posts: 12
#8

09 Apr 2024, 23:32

Thank you Ali. This is very helpful.
Comment

Announcement