ppml_panel_sg --- large number of non-missing observations dropped

Sultan Orazbayev

Join Date: Nov 2016

Posts: 4
#1


10 Nov 2016, 12:55

Hello,

I am using user-written ppml_panel_sg , which is just what I need for my gravity analysis.

However, in the process of calculation:

Code:

ppml_panel_sg flow var1, exporter(id_orig) importer(id_dest) year(year) cluster(id_dyad) olsguess

I get about a quarter of my sample dropped, even though none of the data in my file is missing.

Code:

X obs. dropped because they belong to groups with all zeros or missing values

Can someone give me an example/explanation of how I can identify these dropped observations? I tried using e(sample), but that doesn't seem to work.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3000
#2

10 Nov 2016, 13:27

Dear Sultan,

I understand that the author will soon update this command to deal with some bugs. Maybe you can contact him directly and ask him to incorporate that feature.

Best wishes,

Joao
Comment
Sultan Orazbayev

Join Date: Nov 2016

Posts: 4
#3

10 Nov 2016, 23:17

Dear Joao, thank you for the suggestion! I contacted the author and will post back with the results.
Comment
Tom Zylkin

Join Date: Nov 2016

Posts: 188
#4

11 Nov 2016, 01:53

Hi Sultan,

Thanks for the email, and I am very glad you are finding ppml_panel_sg useful. There is a simple answer to your question, but since it may not be obvious to others, I will post it here.

What the message you quoted is saying is that there are fixed effects in your specification that are not associated with any positive observations. Thus, technically speaking, it is not possible to estimate these fixed effects.

In your case, your fixed effects are origin- and destination-specific. So the observations that are dropped are all those associated with origins or destinations in your data that have no positive values for "flow".
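For anyone who wants to locate such observations by hand, here is a minimal sketch, not the command's own procedure. It assumes the relevant groups are origin-year and destination-year cells and reuses the variable names from the command in #1; anypos_orig, anypos_dest and no_pos_group are hypothetical helper names.

```stata
* Sketch only: the group definitions (origin-year, destination-year) are an assumption.
* anypos_orig, anypos_dest and no_pos_group are hypothetical helper variables.

* 1 if the origin-year group has at least one strictly positive flow, else 0
egen byte anypos_orig = max(flow > 0 & !missing(flow)), by(id_orig year)

* same check for destination-year groups
egen byte anypos_dest = max(flow > 0 & !missing(flow)), by(id_dest year)

* flag observations that belong to a group with only zeros or missing values
gen byte no_pos_group = (anypos_orig == 0) | (anypos_dest == 0)

count if no_pos_group
list id_orig id_dest year flow if no_pos_group
```

The count need not match the number in the message exactly: the command may check other groups as well (for example country pairs) or repeat the check after dropping observations.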

I really hope that helps! As Joao mentioned, I will be posting an update soon that enables you to use e(sample), among other things I want to address.

Tom
Comment
Sultan Orazbayev

Join Date: Nov 2016

Posts: 4
#5

11 Nov 2016, 03:51

Dear Tom, thank you for this clarification! I will be waiting for the update.
Comment
Tom Zylkin

Join Date: Nov 2016

Posts: 188
#6

17 Nov 2016, 17:02

For those interested, a new update of ppml_panel_sg is now available via ssc. You can install the new version by typing "ssc install ppml_panel_sg, replace". Among other things, it will now allow you to use e(sample).
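As a rough illustration of how e(sample) can then be used to inspect the dropped observations, here is a sketch based on the command from #1. The variable name used is hypothetical.

```stata
* Assumes the updated version from SSC; `used` is a hypothetical variable name.
ssc install ppml_panel_sg, replace
ppml_panel_sg flow var1, exporter(id_orig) importer(id_dest) year(year) cluster(id_dyad) olsguess

* e(sample) is 1 for observations used in the estimation and 0 otherwise
gen byte used = e(sample)
tabulate used

* show the observations that were left out
list id_orig id_dest year flow if !used
```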

Edited to add: the new version should also address the issues mentioned in another thread.

In particular, if you experience either of the following error messages

Code:

selectidx13(): 3001 expected 1 arguments but received 2

or

Code:

variable year is not found

these should be resolved by re-installing the command.

Last edited by Tom Zylkin; 17 Nov 2016, 17:23.
Comment
Tan Li

Join Date: May 2018

Posts: 2
#7

06 May 2018, 21:41

Hello,

Q1. I have similar problems when using ppml_panel_sg. I use Stata 14 and have updated by typing "ssc install ppml_panel_sg, replace".

Code:
ppml_panel_sg y RTA, ex(idi) im(idj) y(year)

I get about half of my sample dropped. The message shows
"1943 obs. dropped because they belong to groups with all zeros or missing values"

However, I checked my data. I have no missing values of y, and no group with all zeros. Could someone explain why some obs. are dropped? Is it random?

Q2. I then used the ppml command to double-check.
I first generated year dummies and country-pair dummies with the following code:

Code:
tab year, gen(year_dum)
tab idij, gen(idij_dum)

Then I ran ppml:

Code:
ppml y RTA year_dum* idij_dum*

In this case, 228 dummy regressors were dropped "to ensure that the estimates exist". Also, more than half of the obs. were dropped. But from my basic understanding of the "log of gravity" paper, shouldn't ppml be able to deal with a large number of zeros?
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3000
#8

07 May 2018, 02:05

Dear Tan Li

I'll let the author of ppml_panel_sg comment on Q1, but certainly the observations are not dropped randomly. Most likely, these are observations that do not contain information about the parameters of interest and are dropped because of that.

On Q2, PPML certainly has no problem dealing with large numbers of zeros. Again, the observations dropped are not informative about the parameters of interest and are dropped because of that. In short, do not worry about these because dropping these observations does not create any problem.

Joao
Comment
Tom Zylkin

Join Date: Nov 2016

Posts: 188
#9

07 May 2018, 05:39

Hi Tan Li,

I can certainly second what Joao says about PPML having no problem dealing with large numbers of zeros! Most likely you have some pairs of countries in your data that never trade with one another. Because you have pair fixed effects in your specifications, the zeros for these pairs can be thought of as perfectly predicted, so they add no information about the parameters of interest, exactly as Joao says above for the other example.
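A quick way to check for such pairs is sketched below. It uses the variable names from #7 (y and idij) and assumes idij identifies country pairs; pair_anypos and pair_tag are hypothetical helper names.

```stata
* Sketch only: assumes idij identifies country pairs, as in the ppml code in #7.
* pair_anypos and pair_tag are hypothetical helper variables.

* 1 if the pair has at least one strictly positive y in any year, else 0
egen byte pair_anypos = max(y > 0 & !missing(y)), by(idij)

* tag one observation per pair so that pairs can be counted
egen byte pair_tag = tag(idij)

count if pair_anypos == 0              // observations in pairs with no positive y
count if pair_tag & pair_anypos == 0   // number of such pairs
```

Comparing the first count with the number of observations reported as dropped shows how much of the drop these pairs account for.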

In addition, the first-order conditions from the estimation indicate that the conditional mean for these observations should be zero, which is not an admissible value (it requires the fixed effects associated with these pairs to go to negative infinity).
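To spell this out, here is a simplified sketch. The notation is an assumption introduced for illustration: the conditional mean is written as \mu_{ijt} = \exp(\alpha_{ij} + x_{ijt}'\beta), where \alpha_{ij} is the pair fixed effect and x_{ijt}'\beta collects everything else.

```latex
% First-order condition of the Poisson pseudo-likelihood with respect to \alpha_{ij}:
\begin{align*}
\sum_{t} \left( y_{ijt} - \mu_{ijt} \right) &= 0,
\qquad \mu_{ijt} = \exp\!\left( \alpha_{ij} + x_{ijt}'\beta \right). \\
\intertext{If $y_{ijt} = 0$ for every $t$, this becomes}
\exp(\alpha_{ij}) \sum_{t} \exp\!\left( x_{ijt}'\beta \right) &= 0 .
\end{align*}
% The sum is strictly positive, so the condition can only hold in the limit
% \alpha_{ij} \to -\infty; no finite value of the fixed effect satisfies it.
```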

I would be happy to look at your data if you don't believe the above explanation is correct. Just let me know.

Regards,
Tom
Comment
Tan Li

Join Date: May 2018

Posts: 2
#10

07 May 2018, 19:56

Dear Joao and Tom, thank you very much for your clarification. I am relieved now that I do not need to worry about those dropped obs.
Comment
