The keep command doesn't apply to the estimation process

Anat Tchetchik

19 Oct 2024, 10:07

Hi all,
I have run an xtabond2 model with bootstrapping (using Stata 17). Since some of my panels have small T, I have limited my sample to panels with a minimum of 8 observations on the relevant variables (using egen to count observations, plus a filter). Finally, I ask Stata to:

Code:

keep if filter_obs_8==1

However, when I run the model, the output reports:

Obs per group: min = 2

How can this be possible?
Below is the code:

Code:

egen obs_count_ERP = count(l_ERP_interp), by(Country1)
egen obs_count_l_ERP = count(l_l_ERP_interp), by(Country1)
egen obs_count_Corr_hp = count(Corr_hp_interp), by(Country1)
egen obs_count_WGI = count(WGI_interp), by(Country1)
egen obs_count_GPI = count(GPI_interp), by(Country1)
egen obs_count_Inflation = count(l_Inflation_interp), by(Country1)
egen obs_count_GDPcapita = count(l_GDPcapita_interp), by(Country1)
egen obs_count_GDPgrw = count(GDPgrw_interp), by(Country1)

gen filter_obs_8 = (obs_count_ERP >=8 & obs_count_l_ERP>=8 & obs_count_GDPgrw >=8 &  obs_count_Corr_hp >= 8 & obs_count_WGI >= 8 & obs_count_GPI >= 8 & obs_count_Inflation >= 8 & obs_count_GDPcapita >=8)

* Keep only observations with no missing values in the model variables
generate sample=1-missing(l_ERP_interp, l_l_ERP_interp ,l_Inflation_interp ,l_GDPcapita_interp, GPI_interp, Corr_hp_interp,  GDPgrw_interp, WGI_interp)

keep if sample
generate sample1= sample if filter_obs_8==1
keep if sample1

xtset Country1 year
generate newid = Country1
xtset newid year

bootstrap _b, rep(50) seed(12345) cluster(Country1) idcluster(newid):xtabond2  l_ERP_interp l_l_ERP_interp l_GDPcapita_interp Corr_hp GPI_interp l_Inflation_interp GDPgrw_interp WGI_interp i.year, gmm( l_l_ERP_interp , lag(2 4) collapse) gmm(l_GDPcapita_interp GDPgrw_interp l_Inflation_interp WGI_interp , lag(1 2) collapse) iv( GPI_interp Corr_hp i.year, equation(level)) robust

[Attached image: Picture1.png]

Clyde Schechter

#2

19 Oct 2024, 10:40

The logic underlying your code is faulty.

Code:

gen filter_obs_8 = (obs_count_ERP >=8 & obs_count_l_ERP>=8 & obs_count_GDPgrw >=8 & obs_count_Corr_hp >= 8 & obs_count_WGI >= 8 & obs_count_GPI >= 8 & obs_count_Inflation >= 8 & obs_count_GDPcapita >=8)

identifies countries that have 8 observations with non-missing values of ERP, and 8 with non-missing values of I_ERP, and 8 with non-missing values of GDPgrw, etc. But it doesn't follow that the accepted countries will have 8 observations with non-missing values for all of those variables at once. The 8 observations with non-missing values of ERP might have missing values of I_ERP, and the 8 observations with non-missing values of I_ERP might have missing values of ERP, and so on. It is entirely possible that a country has no observations at all with complete non-missing data on all of these variables even though, for each variable separately, you can find 8 observations that have non-missing values.

I would replace this with something like:

Code:

egen byte mcount = rowmiss(ERP I_ERP GDPgrw CORRhp WGI GPI Inflation GDPcapita)
by country, sort: egen complete_data_obs = total(mcount == 0)
by country: keep if complete_data_obs >= 8
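To make the distinction concrete, here is a minimal sketch on made-up data. The variables x and y, the single country, and the 14 years are all hypothetical and are not taken from the data in #1. Each variable has 8 non-missing values, so a per-variable filter would accept the country, yet only 2 observations are complete on both.

```stata
clear
set obs 14
gen country = 1
gen year = 2000 + _n

* Hypothetical data: x is observed in rows 1-8, y in rows 7-14,
* so the two variables are jointly observed only in rows 7 and 8
gen x = cond(_n <= 8, _n, .)
gen y = cond(_n >= 7, _n, .)

* Per-variable counts, in the style of the filter in #1
egen n_x = count(x), by(country)
egen n_y = count(y), by(country)

* Count of observations that are complete on both variables
egen byte mcount = rowmiss(x y)
by country, sort: egen complete_obs = total(mcount == 0)

* n_x and n_y are both 8, while complete_obs is 2
list country n_x n_y complete_obs in 1
```

An estimation command uses only the jointly complete observations, so in this toy case it would see 2 observations for the country, not 8.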

Andrew Musau

19 Oct 2024, 10:52

Another way to identify complete observations:

Code:

quietly regress ERP I_ERP GDPgrw CORRhp WGI GPI Inflation GDPcapita
gen sample = e(sample)
* drop incomplete observations first, so that _N counts only complete ones
keep if sample
bys country: keep if _N >= 8
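A separate detail may also be worth checking, assuming the code in #1 was run exactly as posted: sample1 is created with an if qualifier, so it is missing wherever filter_obs_8 is not 1, and Stata treats a missing value in an if expression as true. In that case keep if sample1 would drop nothing. The sketch below uses three made-up observations (hypothetical values, not the original data) to show the behavior.

```stata
clear
input sample filter_obs_8
1 1
1 0
1 0
end

* sample1 is 1 where the filter holds and missing (.) elsewhere
generate sample1 = sample if filter_obs_8 == 1

* A bare variable in an if expression is true for any nonzero value,
* and missing counts as nonzero, so this counts all 3 observations
count if sample1

* An explicit comparison excludes the missing values and counts 1
count if sample1 == 1
```

Writing the condition explicitly, as in keep if sample1 == 1 or keep if filter_obs_8 == 1, avoids relying on how missing values are evaluated.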

Anat Tchetchik

#4

19 Oct 2024, 11:01

Thank you very much, Andrew Musau and Clyde Schechter!