  • Spatial Autoregressive models for large datasets using GS2SLS

    Hello, I am working with Indian village-level data and want to test for spatial dependence in my dependent variable, which is binary. The dataset is large, with more than 235,000 villages. I create a first-order spatial lag using
    Code:
     spgen depvar, lat(vill_lat) lon(vill_lon) swm(pow 2) dist(20) duni(km) large
    following Kondo (2018).

    I can include the spatial lag in an OLS regression, and its coefficient is highly significant; the same holds in a logit model. However, the spatial lag of the dependent variable is endogenous, so the OLS estimates are inconsistent. Kondo (2018) suggests that, to overcome this endogeneity, the spatial econometric model be estimated by maximum likelihood, instrumental variables (IV), or the generalized method of moments. Others suggest spatial autoregressive models with spatial autoregressive disturbances (SARAR models), estimated by the generalized spatial two-stage least squares (GS2SLS) procedure proposed by Kelejian and Prucha (1998).
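
    To make the setup concrete, here is the SARAR specification as I understand it. This is only a sketch in my own notation, which is an assumption and not taken from my data or commands: W and M are n x n spatial weights matrices (possibly the same one), lambda and rho are scalar spatial parameters, u has mean zero, X is treated as exogenous, and H denotes the instrument matrix.

    ```latex
    \begin{aligned}
    y &= \lambda W y + X\beta + u, \qquad u = \rho M u + \varepsilon,\\
    y &= (I - \lambda W)^{-1}(X\beta + u)
       \;\Longrightarrow\;
       \operatorname{E}\!\left[(Wy)\,u'\right] = W (I - \lambda W)^{-1}\operatorname{E}[u u'],\\
    H &= \text{linearly independent columns of } \left(X,\; WX,\; W^{2}X\right).
    \end{aligned}
    ```

    The second line shows why Wy is correlated with the error in general, which is the source of the OLS inconsistency. The third line gives the instruments for Wy that, as far as I understand, the two-stage least squares step relies on. With a binary y, this would amount to a linear probability version of the model.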

    The standard Stata commands such as
    Code:
     spregress y x, gs2sls dvarlag(Wy)
    do not work, because I generated my spatial lag with the spgen command described above.

    I tried the command
    Code:
     spregress y lagy xlist, gs2sls
    It returns error 3900.

    Can someone suggest Stata commands or manuals that would help with a dataset this large? How can I estimate the models mentioned above using the spatial lag I have already defined, and how should I deal with the endogeneity concern? I also want to include a spatial lag of one of my independent variables of interest. Which models should I estimate in that case?