Spatial Autoregressive models for large datasets using GS2SLS

Ajay Saharan

Join Date: Jan 2024

Posts: 12
#1


15 Feb 2024, 07:22

Hello, I am working with Indian village-level data and want to test for spatial dependence in my dependent variable, which is binary. The dataset is large, with more than 235,000 villages. I create a first-order spatial lag using

Code:

spgen depvar, lat(vill_lat) lon(vill_lon) swm(pow 2) dist(20) duni(km) large

following Kondo (2018).

I can include the spatial lag in an OLS regression, where its coefficient is highly significant, and the same holds in a logit model. However, the spatial lag of the dependent variable is endogenous, so the OLS estimates are inconsistent. Kondo (2018) suggests that, to overcome this endogeneity, the spatial econometric model should be estimated by maximum likelihood, instrumental variables (IV), or the generalized method of moments. Others suggest estimating spatial autoregressive models with spatial autoregressive disturbances (SARAR models) by generalized spatial two-stage least squares (GS2SLS), as proposed by Kelejian and Prucha (1998).
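For reference, this is how I understand the SARAR model that GS2SLS targets. The notation is my own shorthand, not taken from Kondo (2018): W and M are n x n spatial weights matrices (often the same matrix), and λ and ρ are the spatial autoregressive parameters for the dependent variable and the disturbance.

```latex
\begin{aligned}
y &= \lambda W y + X\beta + u, \\
u &= \rho M u + \varepsilon .
\end{aligned}
```

Because Wy is correlated with u, the first step of GS2SLS instruments Wy with the linearly independent columns of X, WX, W²X, and so on, rather than treating the lag as an ordinary regressor.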

The standard Stata commands such as

Code:

spregress y x, gs2sls dvarlag(Wy)

do not work, because I generated my spatial lag with the spgen command described above.

I also tried the command

Code:

spregress y lagy xlist, gs2sls

It returns an error "3900".

Can someone suggest Stata commands or manuals that would help with a dataset this large? How can I estimate the models mentioned above using the spatial lag I have already generated? How should I deal with the endogeneity concern? I also want to include a spatial lag of one of my independent variables of interest; which model should I estimate in that case?
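To make the question concrete, a manual spatial 2SLS along the following lines might be one option. This is only a sketch under assumptions, not a tested solution. The names y, x1, and x2 are placeholders. I assume spgen names its output splag1_<var>_p when swm(pow #) is used, which should be confirmed with describe. I also assume that applying spgen to a first-order lag gives the second-order lag under the same row-standardized weights.

```stata
* Sketch only: y, x1, x2 are placeholder variable names.
* Assumption: spgen output is named splag1_<var>_p for swm(pow #).

* First-order spatial lags W*y, W*x1, W*x2 (same weights as above)
foreach v in y x1 x2 {
    spgen `v', lat(vill_lat) lon(vill_lon) swm(pow 2) dist(20) duni(km) large
}
describe splag1_*   // confirm the generated names before going on

* Second-order lags W^2*x1 and W^2*x2: apply the same W to the first-order lags
foreach v in splag1_x1_p splag1_x2_p {
    spgen `v', lat(vill_lat) lon(vill_lon) swm(pow 2) dist(20) duni(km) large
}

* Linear probability model by 2SLS: W*y is treated as endogenous,
* W*x1 enters as a regressor, and the excluded instruments are
* W*x2, W^2*x1, and W^2*x2
ivregress 2sls y x1 x2 splag1_x1_p ///
    (splag1_y_p = splag1_x2_p splag1_splag1_x1_p_p splag1_splag1_x2_p_p), ///
    vce(robust)

* First-stage diagnostics for instrument strength
estat firststage
```

This covers only the IV step for the lagged dependent variable. It does not model spatially autocorrelated disturbances, and with a binary outcome it is a linear probability model rather than a spatial logit.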