  • Problems with ppmlhdfe in a Gravity Model for Migration

    Hi! I'm an undergraduate student currently working on my final thesis. The thesis uses a gravity model of migration to examine whether pull and push factors differ depending on whether the destination country is a developed or a developing economy, with only developing economies as origin countries. There are 120 origin countries and only 2 destinations, the United States and Qatar, which I use as representative developed and developing destinations because they receive the largest quantity of migration.

    When I estimated the model with ppml, it told me that the variables were too big, so I used logs, as my advisor also suggested. The variables I am currently using are TotalStockMigration as the dependent variable and, as independent variables, origin GDP, destination GDP, common religion, RTA, distance between the capitals, and the entry cost to start a business in the destination country. I also created pair fixed effects and individual fixed effects to use with ppmlhdfe.

    The main problem is that when I run the following:
    ppmlhdfe lTotalStock ldistcap lgdpcap_o lgdpcap_d rta comrelig entry_cost_d, absorb(countpair year) cluster(countpair)
    it omits distance and common religion, drops a lot of observations, and most of the coefficients are not significant.

    But if I run:
    ppmlhdfe lTotalStock ldistcap lgdpcap_o lgdpcap_d rta comrelig entry_cost_d, absorb(iso3_o year)
    I get the most coherent results, but both GDP variables are still not significant at the 95% level.

    [Image: 2022-04-21 (2).png]

    And if I add the cluster option, then destination GDP is significant but origin GDP is not:

    [Image: 2022-04-21 (3).png]

    Also, since the thesis asks whether pull and push factors differ when people migrate to a developing rather than a developed country, I estimate the regressions separately by destination (Qatar or United States). For that purpose I tried the following, but the results do not look good:

    For Qatar:
    ppmlhdfe lTotalStock ldistcap lgdpcap_o lgdpcap_d rta comrelig entry_cost_d if iso3num_d==634, absorb(iso3_o year) cluster(iso3_o)

    [Image: 2022-04-21 (6).png]

    It uses only 52 observations and most of the variables are omitted (it should use about half the sample, approximately 229 observations). I really do not know how to solve this.



    For the United States:
    ppmlhdfe lTotalStock ldistcap lgdpcap_o lgdpcap_d rta comrelig entry_cost_d if iso3num_d==840, absorb(iso3_o year) cluster(iso3_o)

    [Image: 2022-04-21 (7).png]

    For the United States, it uses all the observations, but again many of the variables are omitted.


    If you could take a look and help me, it would be great! Thank you for your time and attention!

  • #2
    Dear Amanda Ezzat,

    The first thing to note is that the dependent variable should not be in logs; if the values are large, just measure it in thousands or millions.
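
    A minimal sketch of that rescaling, assuming the raw (unlogged) stock is stored in a variable called TotalStock. That name is hypothetical; the thread only shows the logged lTotalStock. The regressors and the absorb() and cluster() choices are copied from the commands in the first post and are not a recommendation. Note also that the log of a zero stock is missing in Stata, so a logged dependent variable removes any zero flows from the sample.

```stata
* Sketch only: TotalStock is an assumed name for the raw migration stock.
* Count the zero stocks that taking logs would turn into missing values.
count if TotalStock == 0

* Rescale instead of logging: stock measured in thousands of migrants.
generate double TotalStock_k = TotalStock / 1000
label variable TotalStock_k "Migrant stock (thousands)"

* PPML takes the dependent variable in levels. Continuous regressors such as
* GDP and distance can stay in logs, so their coefficients read as elasticities.
ppmlhdfe TotalStock_k ldistcap lgdpcap_o lgdpcap_d rta comrelig entry_cost_d, ///
    absorb(iso3_o year) cluster(iso3_o)
```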

    The other problems you have are a consequence of your small dataset (especially having only 2 destinations) and of collinearity between the regressors and the fixed effects. For example, if you include pair fixed effects, you obviously cannot estimate the coefficient on the distance because that is collinear with the fixed effects. You need to think carefully about what variables you are going to include and what fixed effects you need.
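
    One way to see this collinearity before estimating is to check whether a regressor ever varies inside the groups that the fixed effects absorb. The sketch below uses the variable names from the first post; the sd_ variables are temporary helpers introduced only for this illustration. The second part applies the same idea to the destination-specific regressions, under the assumption that destination variables such as lgdpcap_d take a single value per destination and year, in which case they cannot be separated from absorb(year).

```stata
* Sketch only: flag regressors that never vary within a country pair.
* Such variables are collinear with absorb(countpair) and will be omitted.
foreach v of varlist ldistcap comrelig rta {
    bysort countpair: egen double sd_`v' = sd(`v')
    quietly count if sd_`v' > 0 & !missing(sd_`v')
    display "`v': " r(N) " observations in pairs where it varies over time"
    drop sd_`v'
}

* Same idea for a single-destination sample with year fixed effects:
* if lgdpcap_d has no variation within destination-year, it is collinear
* with absorb(year) once the sample is restricted to one destination.
bysort iso3num_d year: egen double sd_gdp_d = sd(lgdpcap_d)
summarize sd_gdp_d    // a maximum of 0 means no within destination-year variation
drop sd_gdp_d
```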

    Finally, the observations that are dropped are perfectly predicted by the model and therefore do not contribute to the estimation of the parameters. Again, this happens because there is little variation in the dependent variable and many fixed effects.
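
    A rough diagnostic for which observations are at risk of being dropped, again assuming a hypothetical variable TotalStock that holds the raw migration stock. It looks at two simple cases for the destination-specific regressions with origin fixed effects: origins whose stock is zero in every year, and origins observed only once. It is an illustration of the idea, not a reproduction of the checks ppmlhdfe performs internally.

```stata
* Sketch only: TotalStock is an assumed name for the raw migration stock.
* Origins whose stock is zero in every year are perfectly predicted by their
* origin fixed effect, so they carry no information for the slopes.
bysort iso3_o iso3num_d: egen double max_stock = max(TotalStock)
tabulate iso3num_d if max_stock == 0

* Groups observed only once (singletons) are also uninformative with absorb().
bysort iso3_o iso3num_d: generate n_obs = _N
tabulate iso3num_d if n_obs == 1

drop max_stock n_obs
```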

    Best wishes,

    Joao



    • #3
      Dear Joao Santos Silva,

      First of all, thank you so much for taking the time to answer my questions.
      I will definitely take your advice about rescaling the variables without taking logs. Next, with regard to the fixed effects: since I want to test whether pull and push factors differ depending on whether people migrate to a developed country (United States) or a developing one (Qatar), I thought the best option was to include fixed effects only for the origin countries. That removes the unobserved heterogeneity specific to each origin, so that it does not contaminate the estimates and we can observe the real effect. If I add fixed effects for the destination countries, I would absorb the destination-specific heterogeneity, which is exactly what I am interested in observing: whether and why the pull and push factors differ between the two destinations.

      As for the variables, distance, origin GDP, destination GDP, and common religion are the ones I will definitely include in my thesis, and I want their coefficients to be estimated. What would be the best approach to do that? Right now I am trying to add to the database unemployment, migration policy, some indicator for education (I am not sure which one, since I do not observe individuals; if anyone has a suggestion for what would be best to use, I would be really thankful), and some variables related to climate change (since climate migration has grown in recent years).

      Lastly, I know that having only two destinations is a source of trouble, but does it mean that the model could be poor or not feasible?

      Again, thank you so much for your time and attention.

      Best wishes,
      Amanda




      • #4
        Dear Amanda Ezzat,

        Since you are an undergraduate student, I suggest you talk to your advisor and discuss this with them. Your model will not be very reliable and you will certainly have to make some compromises, but to ensure a good grade your advisor must agree with your choices.

        Best wishes,

        Joao
