Error using spgenerate

Lom Newton

Join Date: Jun 2019
Posts: 46

08 Aug 2020, 12:35

Hello Statalisters,

I am trying to create a spatial lag variable manually using the spgenerate command. I first create my spatial weighting matrix and then use it to multiply the variable I want to spatially lag. Below is my code:

Code:

**xtset the data.
xtset ID YEAR

save mypanel, replace

**Spset data
use mypanel
spset ID, coord(lon lat)

**Set the coordinate units, if necessary
spset, modify coordsys(latlong)

spset, modify coordsys(latlong, miles)


**Save the data
save, replace


***Create inverse-distance weighting matrix W for point locations within a 0.5-mile radius

 spmatrix create idistance W if YEAR == 2009, vtruncate(2) normalize(row)

 spmatrix summarize W

Weighting matrix  W
---------------------------------------
           Type |            idistance
  Normalization |                  row
      Dimension |          2105 x 2105
Elements        |
   minimum      |                    0
   minimum > 0  |             .0572685
   mean         |             .0003566
   max          |                    1
---------------------------------------

Now to spatially lag a variable, fee:

Code:

spgenerate Wfee = W*fee

But I got the following error

Code:

_IDs in weighting matrix W do not match _IDs in estimation sample
    There are places in W not in estimation sample and places in estimation sample not in W05.

Is this because of the "islands"? If not, how do I resolve this?

Any help is much appreciated.

Thank you.

Tags: spgenerate, spset

Lom Newton

Join Date: Jun 2019

Posts: 46
#2

09 Aug 2020, 20:36

Dear Statalisters,

Does anyone have an idea how to resolve my error, please?
Andrew Musau

Join Date: Oct 2014

Posts: 10194
#3

10 Aug 2020, 05:44

FAQ 12 advises you to present a reproducible example to increase your chances of obtaining helpful replies. This will probably be the issue, but I cannot guarantee it because I am not able to test the proposed solution and I am unwilling to create a reproducible example on your behalf.

spmatrix create idistance W if YEAR == 2009, vtruncate(2) normalize(row)

Here you create a matrix using a condition, i.e., "if YEAR == 2009". Thereafter, you instruct Stata to generate a variable using the full set of observations:

Now to spatially lag a variable, fee:
Code:
spgenerate Wfee = W*fee

Then it makes sense that Stata complains:

_IDs in weighting matrix W do not match _IDs in estimation sample
There are places in W not in estimation sample and places in estimation sample not in W05.

Depending on whether spgenerate allows the -if- qualifier, you may do the following:

Code:

spgenerate Wfee = W*fee if YEAR==2009

or alternatively

Code:

preserve
keep if YEAR==2009
spgenerate Wfee = W*fee

Then save the variable and identifiers, restore, and merge back into the full dataset.
Lom Newton

Join Date: Jun 2019

Posts: 46
#4

12 Aug 2020, 22:04

Thank you, Andrew Musau. Actually, this is a strongly balanced panel dataset, and the spset variable, ID, uniquely identifies each panel unit across the years.

Code:

// Verify that unit and YEAR jointly identify the observations
assert ID!=.
assert YEAR!=.
bysort ID YEAR: assert _N==1

Using the -if- qualifier produces missing values for the Wfee variable for all other years except 2009. And for the suggested alternative, I am not sure how that works though. My data is somewhat private. But I will try to see how I can generate a reproducible sample so I can share it here to see if you or any other person can help me out on this.

Thanks.

Andrew Musau

Join Date: Oct 2014
Posts: 10194

13 Aug 2020, 01:03

Using the -if- qualifier produces missing values for the Wfee variable for all other years except 2009.

Why are you surprised by this? It's exactly what you asked for by using the -if- qualifier.

But I will try to see how I can generate a reproducible sample so I can share it here to see if you or any other person can help me out on this.

I will save you the work. The following replicates your error and shows that my suggestion in #3 is right on the money. There is no evidence that you tried what was suggested.

Code:

copy https://www.stata-press.com/data/r16/homicide1990.dta .
copy https://www.stata-press.com/data/r16/homicide1990_shp.dta .
use homicide1990, clear
spset
spmatrix create contiguity W if _n<=500
spgenerate W_gini = W*gini
spgenerate W_gini = W*gini if _n<=500

Res.:

Code:

. copy https://www.stata-press.com/data/r16/homicide1990.dta .

. copy https://www.stata-press.com/data/r16/homicide1990_shp.dta .

.
. use homicide1990, clear
(S.Messner et al.(2000), U.S southern county homicide rates in 1990)

.
. spset
  Sp dataset homicide1990.dta
                data:  cross sectional
     spatial-unit id:  _ID
         coordinates:  _CX, _CY (planar)
    linked shapefile:  homicide1990_shp.dta

.
. spmatrix create contiguity W if _n<=500
  weighting matrix in W contains 2 islands

.
. spgenerate W_gini = W*gini
_IDs in weighting matrix W do not match _IDs in estimation sample
    There are places in W not in estimation sample and places in estimation sample not in W.
r(459);

.
. spgenerate W_gini = W*gini if _n<=500

Last edited by Andrew Musau; 13 Aug 2020, 01:06.

Lom Newton

Join Date: Jun 2019

Posts: 46
#6

13 Aug 2020, 22:18

Hello Andrew Musau, I think you may be missing the point that I am dealing with panel data. With a single cross-section, your suggestion would have resolved the error.
Andrew Musau

Join Date: Oct 2014

Posts: 10194
#7

14 Aug 2020, 03:03

Does that matter? Provide a data example for any further input from me.

Lom Newton

Join Date: Jun 2019
Posts: 46

12 Sep 2020, 20:52

Hello Andrew Musau, so I tried using the panel version of homicide data you used. After a number of tries, below is what I came up with.

Code:

. copy https://www.stata-press.com/data/r16/homicide_1960_1990.dta .

. copy https://www.stata-press.com/data/r16/homicide_1960_1990_shp.dta .


. use homicide_1960_1990
(S.Messner et al.(2000), U.S southern county homicide rate in 1960-1990)


. xtset _ID year
       panel variable:  _ID (strongly balanced)
        time variable:  year, 1960 to 1990, but with gaps
                delta:  1 unit

. spset
  Sp dataset homicide_1960_1990.dta
                data:  panel
     spatial-unit id:  _ID
             time id:  year (see xtset)
         coordinates:  _CX, _CY (planar)
    linked shapefile:  homicide_1960_1990_shp.dta


. spmatrix create contiguity W if year == 1990


. spgenerate Wue = W*unemployment
_IDs in weighting matrix W do not match _IDs in estimation sample
    There are places in W not in estimation sample and places in estimation sample not in W.
r(459);

** So I decided to create the lag variable for each year

. spgenerate W_ue60 = W*unemployment if year == 1960

. 
. spgenerate W_ue70 = W*unemployment if year == 1970

. 
. spgenerate W_ue80 = W*unemployment if year == 1980

. 
. 
. spgenerate W_ue90 = W*unemployment if year == 1990

*** Then I generate W_ue as zeros and fill it in as follows:

. gen W_ue = 0

. replace W_ue = W_ue60 if year == 1960
(1,412 real changes made)


. replace W_ue = W_ue70 if year == 1970
(1,412 real changes made)

. replace W_ue = W_ue80 if year == 1980
(1,412 real changes made)

. replace W_ue = W_ue90 if year == 1990
(1,412 real changes made)

. drop W_ue60 W_ue70 W_ue80 W_ue90
.

Now, while this achieves what I wanted in creating the spatial lag W_ue manually, it is extremely laborious and inefficient, since I am manually lagging about 15 variables in my dataset. Is it possible to do this in a more efficient manner?

Thanks.

Andrew Musau

Join Date: Oct 2014
Posts: 10194

13 Sep 2020, 08:34

Code:

use homicide_1960_1990
xtset _ID year
spset
spmatrix create contiguity W if year == 1990
levelsof year, local(years)
foreach year in `years'{
    spgenerate XYZ`year' = W*unemployment if year==`year'
}
egen W_ue= rowmax(XYZ*)
drop XYZ*
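
Since about 15 variables need lagging, the same idea can be wrapped in an outer loop over variables. This is only a sketch: the local macro lagvars and the tmpW prefix are hypothetical names, the variable list is a placeholder to replace with your own, and it assumes W and the spset panel are set up as above.

```stata
* Sketch only: lagvars is a placeholder list; substitute your own variables.
local lagvars unemployment gini
levelsof year, local(years)
foreach v of local lagvars {
    * One temporary lag per year; each is missing outside its own year.
    foreach y of local years {
        spgenerate tmpW`y' = W*`v' if year == `y'
    }
    * rowmax() ignores missing values, so it picks up the single
    * nonmissing yearly lag in each observation.
    egen W_`v' = rowmax(tmpW*)
    drop tmpW*
}
```

Each pass leaves one new variable named W_ plus the original variable name (for example, W_unemployment), so original names must be short enough to stay within Stata's 32-character limit.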

Lom Newton

Join Date: Jun 2019

Posts: 46
#10

13 Sep 2020, 13:11

Thanks Andrew Musau.