  • ppml_panel_sg --- large number of non-missing observations dropped

    Hello,

    I am using the user-written command ppml_panel_sg, which is just what I need for my gravity analysis.

    However, in the process of calculation:

    Code:
    ppml_panel_sg flow var1, exporter(id_orig) importer(id_dest) year(year) cluster(id_dyad) olsguess
    I get about a quarter of my sample dropped, even though none of the data in my file is missing.

    Code:
    X obs. dropped because they belong to groups with all zeros or missing values
    Can someone give me an example/explanation of how I can identify these dropped observations? I tried using e(sample), but that doesn't seem to work.




  • #2
    Dear Sultan,

    I understand that the author will soon update this command to deal with some bugs. Maybe you can contact him directly and ask him to incorporate that feature.

    Best wishes,

    Joao



    • #3
      Dear Joao, thank you for the suggestion! I contacted the author and will post back with the results.



      • #4
        Hi Sultan,

        Thanks for the email, and I am very glad you are finding ppml_panel_sg useful. There is a simple answer to your question, but since it may not be clear to others, I will post it here.

        What the message you quoted is saying is that there are fixed effects in your specification that are not associated with any positive observations. Thus, technically speaking, it is not actually possible to estimate a fixed effect for these observations.

        In your case, your fixed effects are origin- and destination-specific. So the observations that are dropped are all those associated with origins or destinations in your data that have no positive values for "flow".
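
        A minimal sketch of how such observations might be flagged by hand, using the variable names from the command in #1. It assumes "flow" is non-negative with no missing values, and it assumes the relevant groups are origin-year, destination-year, and pair; that grouping is an assumption about the fixed effects the command uses, not something confirmed in this thread. The names max_orig, max_dest, max_pair, and allzero are illustrative.

        ```stata
        * Largest value of flow within each candidate group
        * (the group definitions are an assumption, see the note above)
        bysort id_orig year: egen double max_orig = max(flow)
        bysort id_dest year: egen double max_dest = max(flow)
        bysort id_dyad:      egen double max_pair = max(flow)

        * With non-negative flows, a group maximum of zero means the group is all zeros
        generate byte allzero = (max_orig == 0) | (max_dest == 0) | (max_pair == 0)

        * How many observations are flagged, and which ones
        tabulate allzero
        list id_orig id_dest year flow if allzero
        ```

        Removing one all-zero group can leave another group with only zeros, so the number flagged here may differ from the number the command reports.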

        I really hope that helps! As Joao mentioned, I will be posting a new update soon that enables you to use e(sample), among other things I want to address.

        Tom



        • #5
          Dear Tom, thank you for this clarification! I will be waiting for the update.



          • #6
            For those interested, a new update of ppml_panel_sg is now available via ssc. You can install the new version by typing "ssc install ppml_panel_sg, replace". Among other things, it will now allow you to use e(sample).
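
            A short sketch of how e(sample) could then be used to list the dropped observations, reusing the command from #1. The variable name "used" is illustrative.

            ```stata
            * Re-run the estimation with the updated command
            ppml_panel_sg flow var1, exporter(id_orig) importer(id_dest) year(year) cluster(id_dyad) olsguess

            * e(sample) equals 1 for observations used in estimation and 0 otherwise
            generate byte used = e(sample)
            tabulate used

            * Inspect the observations that were dropped
            list id_orig id_dest year flow if !used
            ```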

            Edited to add: the new version should also address the issues mentioned in another thread.

            In particular, if you experience either of the following error messages

            Code:
            selectidx13(): 3001 expected 1 arguments but received 2
            or

            Code:
            variable year is not found
            these should be resolved by re-installing the command.
            Last edited by Tom Zylkin; 17 Nov 2016, 17:23.



            • #7
              Hello,

              Q1. I have similar problems when using ppml_panel_sg. I use Stata 14 and have updated by typing "ssc install ppml_panel_sg, replace".

              Code:
              ppml_panel_sg y RTA, ex(idi) im(idj) y(year)

              I get about half of my sample dropped. The message shows
              "1943 obs. dropped because they belong to groups with all zeros or missing values"

              However, I checked my data. I have no missing values of y, and no group with all zeros. Could someone explain why some obs. are dropped? Is it random?

              Q2. I then use the ppml command to double check.
              I first generate year dummies and country-pair dummies with the following code:

              tab year, gen(year_dum)
              tab idij, gen(idij_dum)

              Then I use the ppml code:

              ppml y RTA year_dum* idij_dum*

              In this case, 228 dummy regressors were dropped to ensure that the estimates exist, and more than half of the observations were dropped as well. But from my basic understanding of the "Log of Gravity" paper, shouldn't ppml be able to deal with a large number of zeros?




              • #8
                Dear Tan Li

                I'll let the author of ppml_panel_sg comment on Q1, but certainly the observations are not dropped randomly. Most likely, these are observations that do not contain information about the parameters of interest and are dropped because of that.

                On Q2, PPML certainly has no problem dealing with large numbers of zeros. Again, the observations dropped are not informative about the parameters of interest and are dropped because of that. In short, do not worry about these because dropping these observations does not create any problem.

                Joao



                • #9
                  Hi Tan Li,

                  I can certainly second what Joao says about PPML having no problem dealing with large numbers of zeros! Most likely you have some pairs of countries that never trade with one another in your data. Because you have pair fixed effects in your specification, the zeros for these pairs can be thought of as perfectly predicted, such that they add no information about the parameters of interest, exactly as Joao says above for the other example.

                  In addition, the first-order conditions from the estimation indicate the conditional mean for these observations should be zero, which is not an admissible value (it requires the fixed effects associated with these pairs to go to negative infinity.)
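
                  The argument can be written out as a short sketch. The notation is assumed for illustration and does not come from the command's documentation: $y_{ijt}$ is the flow for pair $ij$ in year $t$, $x_{ijt}$ collects the other regressors and fixed effects, and $\alpha_{ij}$ is the pair fixed effect.

                  ```latex
                  % Poisson conditional mean with a pair fixed effect
                  \mu_{ijt} = \exp\left(x_{ijt}'\beta + \alpha_{ij}\right)

                  % First-order condition with respect to \alpha_{ij}
                  \sum_{t} \left(y_{ijt} - \mu_{ijt}\right) = 0

                  % If y_{ijt} = 0 for every t, the condition becomes
                  e^{\alpha_{ij}} \sum_{t} \exp\left(x_{ijt}'\beta\right) = 0

                  % The sum is strictly positive, so this holds only in the limit
                  \alpha_{ij} \to -\infty
                  ```

                  Since no finite value of $\alpha_{ij}$ satisfies the condition, these observations cannot contribute to the estimates of the remaining parameters.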

                  I would be happy to look at your data if you don't believe the above explanation is correct. Just let me know.

                  Regards,
                  Tom



                  • #10
                    Dear Joao and Tom, thank you very much for your clarification. I am relieved now that I do not need to worry about those dropped obs.
