Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Spatdiag & Estat Moran Conflicting results

    Hello fellow stata users,

    I am conducting a battery of tests on spatial data.

    Two of which are:
    1. spatdiag, authored by Maurizio Pisati
    2. estat moran, from SSC
    Having run these on my data the results for spatdiag Spatial error test for the Lagrange Multiplier produces the same result as does the estat moran test for Moran's I.

    The result for the spatdiag the Spatial error test for Moran's I statistic is not the same as the estat moran test for Moran's I.

    On closer inspection, the method and formula for estat moran is found here https://www.stata.com/manuals/spestatmoran.pdf on page 5, snapshot below:

    Equation1


    The Lagrange Multiplier Error formula is given by Anselin L. et al. (Regional Science and Urban Economics 26 (1996) 77-104) as being

    Equation 2

    Where T is term denoted in the denominator "tr" above in Equation 1 relating to Moran's I.
    These are the same formula.

    On the other hand, Moran's I is given as being as follows by Anselin, L. (Spatial econometrics methods and models (Studies in operational regional science) DOI 10.1007/978-94-015-7799-1)

    Equation 3


    This is not the same as Equation 1 for Moran's I. It could be the case that Pisati's code for Moran's I is based on Equation 3.

    My questions are therefore:
    1. Am I correct that the estat moran test for Moran's I is actually an LM error test?
    2. If so, why is this the case?
    3. If so, what are the implications?
    4. If so, is Pisati's code under spatdiag the correct formulation for obtaining a Moran's I statistic
    I would be grateful for your help. Thank you all,

    Harry

  • #2
    In case the images non-visible

    Estat Moran
    Click image for larger version

Name:	2023 06 12 Estat Moran Formula v01.png
Views:	1
Size:	164.7 KB
ID:	1716924

    The Lagrange error formula

    Click image for larger version

Name:	2023 06 12 LM error Formula v02.png
Views:	1
Size:	26.3 KB
ID:	1716925

    Anselin's Moran's I Statistic formula

    Click image for larger version

Name:	2023 06 12 Moran's I Formula Anselin v02.png
Views:	1
Size:	75.0 KB
ID:	1716926

    Comment


    • #3
      Hello

      It may be of additional help in understanding my query to see code and results below.
      I make use of the Texas data set provided in https://www.stata.com/manuals/sp.pdf

      As you can see each post estimation test requires its own matrix with which to conduct calculations.

      As raised in the first post of this chain, any help or observations on why Estat Moran and Spatdiag produce different Moran's I statistics (yet the Langrage Error in Spatdiag is the same as the Moran's I in Estat Moran) would be helpful.

      Thank you

      Harry

      DO File

      Code:
      spshape2dta tl_2016_us_county
      use    tl_2016_us_county, clear
      
      generate long    fips = real(STATEFP + COUNTYFP)
      bysort fips:   assert _N==1
      assert fips != .
      spset fips,   modify replace
       
      spset, modify   coordsys(latlong, miles)
       
      save, replace
       
      //copy https://www.stata-press.com/data/r18/texas_ue.dta
       
      use texas_ue, clear
       
      merge 1:1 fips   using tl_2016_us_county
       
      keep if _merge==3
       
      drop _merge
       
      rename NAME countyname
       
      drop STATEFP COUNTYFP COUNTYNS GEOID
      drop NAMELSAD LSAD CLASSFP MTFCC CSAFP
      drop CBSAFP METDIVFP FUNCSTAT
      drop ALAND AWATER INTPTLAT INTPTLON
      save, replace
       
      spmatrix create contiguity W
       
       
      spmatrix export W                        using W_1_TX.csv, replace
       
       
      clear
      import delimited "C:\VV\W_1_TX.csv", delimiter(" ") varnames(1)
       
      drop v1
      save "W_1_TX", replace
       
      spatwmat            using W_1_TX.dta, name (W_1_TX_1) 
       
      use texas_ue, clear
       
      regress unemployment college
       
      estat moran, errorlag(W)
      spatdiag, weights (W_1_TX_1)


      OUTPUT

      Code:
      . regress unemployment college
      
            Source |       SS           df       MS      Number of obs   =       254
      -------------+----------------------------------   F(1, 252)       =     57.92
             Model |  139.314746         1  139.314746   Prob > F        =    0.0000
          Residual |  606.129539       252  2.40527595   R-squared       =    0.1869
      -------------+----------------------------------   Adj R-squared   =    0.1837
             Total |  745.444285       253   2.9464201   Root MSE        =    1.5509
      
      ------------------------------------------------------------------------------
      unemployment |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           college |  -.1008791   .0132552    -7.61   0.000    -.1269842   -.0747741
             _cons |   6.542796   .2571722    25.44   0.000     6.036316    7.049277
      ------------------------------------------------------------------------------
      
      .  estat moran, errorlag(W)
      
      Moran test for spatial dependence
               Ho: error is i.i.d. 
               Errorlags:  W
      
               chi2(1)      =    94.06
               Prob > chi2  =   0.0000
      
      . 
      end of do-file
      
      . do "C:\VV\STD4ce0_000000.tmp"
      
      . spatdiag,                               weights                 (W_1_TX_1) 
      
      
      Diagnostic tests for spatial dependence in OLS regression
      
      
      Fitted model
      ------------------------------------------------------------
      unemployment = college
      ------------------------------------------------------------
      
      Weights matrix
      ------------------------------------------------------------
      Name: W_1_TX_1
      Type: Imported (non-binary)
      Row-standardized: No
      ------------------------------------------------------------
      
      Diagnostics
      ------------------------------------------------------------
      Test                           |  Statistic    df   p-value
      -------------------------------+----------------------------
      Spatial error:                 |
        Moran's I                    |    11.488      1    0.000
        Lagrange multiplier          |    94.057      1    0.000
        Robust Lagrange multiplier   |    75.283      1    0.000
                                     |
      Spatial lag:                   |
        Lagrange multiplier          |    21.331      1    0.000
        Robust Lagrange multiplier   |     2.557      1    0.110
      ------------------------------------------------------------

      Comment


      • #4
        The Burridge (1980) LM formula (equation (9) in Anselin et al. (1996)) equals the square of the formula in the Stata manual. Note that T = tr((W'=W)*W).

        Taking the square seems a bit odd because the numerator is already a quadratic form. I would trust that the official estat moran command does what it claims in the documentation. A definite answer can only be given by coding the formulas yourself (which requires a bit of effort but should not be too difficult). If you find an answer, I would be interested in hearing about it.
        https://www.kripfganz.de/stata/

        Comment


        • #5
          Hello Sebastian

          Thank you for your reply.

          I see, so LM [Burridge] = Moran's I2 [Stata Manual]

          I'll pursue this further, but if you you are in a position to offer an explanation as to why in the example above Moran's I is 11.488 (in Spatdiag) and 94.06 in (Estat Moran) I would be grateful.
          Likewise, I'll keep you posted on how I get on.

          Harry

          Comment


          • #6
            Right after the formula, the Stata manual says that
            Under appropriate assumptions, it follows from Kelejian and Prucha (2001) that IN(0, 1)
            and I2χ2(1)
            estat moran reports the χ2 statistic. So apart from a scaling factor, the assumption that
            LM [Burridge] = Moran's I2 [Stata Manual]
            is correct, see Section 5 of Anselin, Luc. 2003. Spatial Econometrics. In Badi H. Baltagi (ed.), A Companion to Theoretical Econometrics, 310–330. 1st edn. Wiley. https://doi.org/10.1002/9780470996249.ch15.

            Ali

            Comment


            • #7
              On closer inspection, and after reading this: https://www.msarrias.com/uploads/3/7...9/lecture6.pdf, it's more accurate to say that Stata reports the square of the modified Moran's I, as suggested by Kelijan and Prucha (2001):
              Code:
               . *Generate example data
              . *--------------------------------------------
              . /*
              > # Example y
              > y = [100, 100, -100, -100, 0, 0]
              >
              > # Example spatial weight matrix
              > W = ([[0, 1, 0, 0, 0, 0],
              >               [1, 0, 0, 0, 0, 0],
              >               [0, 0, 0, 1, 0, 0],
              >               [0, 0, 1, 0, 0, 0],
              >               [0, 0, 0, 0, 0, 1],
              >               [0, 0, 0, 0, 1, 0]])
              >
              >
              > */
              . *--------------------------------------------
              . clear
               
              . set obs 6
              Number of observations (_N) was 0, now 6.
               
              .
              . gen ID = _n
               
              .
              . gen y = 100
               
              . replace y = 100 in 2
              (0 real changes made)
               
              . replace y = -100 in 3
              (1 real change made)
               
              . replace y = -100 in 4
              (1 real change made)
               
              . replace y = 0 in 5
              (1 real change made)
               
              . replace y = 0 in 6
              (1 real change made)
               
              .
              .
              . gen w1 = 0
               
              . replace w1 = 1 in 2
              (1 real change made)
               
              . replace w1 = 0 in 3
              (0 real changes made)
               
              . replace w1 = 0 in 4
              (0 real changes made)
               
              . replace w1 = 0 in 5
              (0 real changes made)
               
              . replace w1 = 0 in 6
              (0 real changes made)
               
              .
              . gen w2 = 1
               
              . replace w2 = 0 in 2
              (1 real change made)
               
              . replace w2 = 0 in 3
              (1 real change made)
               
              . replace w2 = 0 in 4
              (1 real change made)
               
              . replace w2 = 0 in 5
              (1 real change made)
               
              . replace w2 = 0 in 6
              (1 real change made)
               
              .
              . gen w3 = 0
               
              . replace w3 = 0 in 2
              (0 real changes made)
               
              . replace w3 = 0 in 3
              (0 real changes made)
               
              . replace w3 = 1 in 4
              (1 real change made)
               
              . replace w3 = 0 in 5
              (0 real changes made)
               
              . replace w3 = 0 in 6
              (0 real changes made)
               
              .
              . gen w4 = 0
               
              . replace w4 = 0 in 2
              (0 real changes made)
               
              . replace w4 = 1 in 3
              (1 real change made)
               
              . replace w4 = 0 in 4
              (0 real changes made)
               
              . replace w4 = 0 in 5
              (0 real changes made)
               
              . replace w4 = 0 in 6
              (0 real changes made)
               
              .
              . gen w5 = 0
               
              . replace w5 = 0 in 2
              (0 real changes made)
               
              . replace w5 = 0 in 3
              (0 real changes made)
               
              . replace w5 = 0 in 4
              (0 real changes made)
               
              . replace w5 = 0 in 5
              (0 real changes made)
               
              . replace w5 = 1 in 6
              (1 real change made)
               
              .
              . gen w6 = 0
               
              . replace w6 = 0 in 2
              (0 real changes made)
               
              . replace w6 = 0 in 3
              (0 real changes made)
               
              . replace w6 = 0 in 4
              (0 real changes made)
               
              . replace w6 = 1 in 5
              (1 real change made)
               
              . replace w6 = 0 in 6
              (0 real changes made)
               
              .
              . *--------------------------------------------
              . *Spset + generate spatial weight matrix
              . *--------------------------------------------
              . spset ID
               
                    Sp dataset: <current>
              Linked shapefile: <none>
                          Data: Cross sectional
               Spatial-unit ID: _ID (equal to ID)
                   Coordinates: <none>
               
              .
              . spmatrix fromdata W = w*, normalize(none)  replace
               
              .
              . regress y
               
                    Source |       SS           df       MS      Number of obs   =         6
              -------------+----------------------------------   F(0, 5)         =      0.00
                     Model |           0         0           .   Prob > F        =         .
                  Residual |       40000         5        8000   R-squared       =    0.0000
              -------------+----------------------------------   Adj R-squared   =    0.0000
                     Total |       40000         5        8000   Root MSE        =    89.443
               
              ------------------------------------------------------------------------------
                         y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                     _cons |          0   36.51484     0.00   1.000    -93.86438    93.86438
              ------------------------------------------------------------------------------
               
              .
              . predict residuals, resid
               
              .
              . tempfile TEMP
               
              . save `TEMP', replace
              (file /tmp/St97545.000001 not found)
              file /tmp/St97545.000001 saved as .dta format
               
              .
              . *--------------------------------------------
              . **Compute in Mata:
              . *original Moran's I
              . *modified Moran's I due to Kelejian and Prucha (2001)
              . *--------------------------------------------
              .
              . spmatrix matafromsp Cmat v = W
               
              . putmata residuals, replace
              (1 vector posted)
               
              .
              . *Original I = (N / (sum_i sum_j Wij)) * (residuals' * W * residuals) / (residuals' * residuals)
              . *Kelejian and Prucha (2001) I = (u'*W*u)/(sigma^2*[tr{(W'+W)*W}]^.5) where u is the residuals, W is the spatial weight matrix, w is the residuals, and sigma^2 = u'u/n
              . *Kelejian and Prucha (2001) I = (residuals' * W * residuals)/ [(residuals' * residuals/length(n) * sqrt(trace((W' + W) * W))]
              . mata {
              >     n = length(residuals)
              >     res_tr_res = residuals' * residuals
              >    
              >     I_original = n / (sum(Cmat)) * (residuals' * Cmat * residuals) / res_tr_res
              >     I_KP = (residuals' * Cmat * residuals) / ((res_tr_res/n)* sqrt(trace((Cmat'+ Cmat) * Cmat)))
              >
              >     st_numscalar("I_original", I_original)
              >     st_numscalar("I_KP", I_KP)
              >
              > }
               
              . *--------------------------------------------
              . *Estimate spatial diagnostics
              . *--------------------------------------------
              . use `TEMP', clear
               
              .
              . regress y
               
                    Source |       SS           df       MS      Number of obs   =         6
              -------------+----------------------------------   F(0, 5)         =      0.00
                     Model |           0         0           .   Prob > F        =         .
                  Residual |       40000         5        8000   R-squared       =    0.0000
              -------------+----------------------------------   Adj R-squared   =    0.0000
                     Total |       40000         5        8000   Root MSE        =    89.443
               
              ------------------------------------------------------------------------------
                         y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                     _cons |          0   36.51484     0.00   1.000    -93.86438    93.86438
              ------------------------------------------------------------------------------
               
              .
              . *In Stata
              . estat moran, errorlag(W)
               
              Moran test for spatial dependence
                       H0: Error terms are i.i.d.
                       Errorlags:  W
               
                       chi2(1)      =     3.00
                       Prob > chi2  =   0.0833
               
              .
              . *From Mata:
              . *Original I
              . display scalar(I_original)
              1
               
              . *Kelejian and Prucha (2001) I
              . display scalar(I_KP)
              1.7320508
               
              . *I_KP squared
              . display scalar(I_KP)^2
              3
              . *chi2(1) test
              . display 1-chi2(1,scalar(I_KP)^2)
              .08326452
              Best,
              Ali
              Last edited by Alexander Koplenig; 14 Feb 2024, 02:51.

              Comment


              • #8
                Thank you Ali. This is very helpful.

                Comment

                Working...
                X