I have a panel data set covering 31 Chinese provinces over six 5-year periods (1990-1995, 1995-2000, 2000-2005, 2005-2010, 2010-2015, 2015-2020). Each province has one record per period, containing temperature, precipitation, and the outmigration rate.
Question 1: I would like to regress the outmigration rate on temperature and precipitation to find out whether outmigration is associated with these climate factors. In particular, I would like to see whether the effects differ across provinces of different income levels. I divided the provinces into 3 categories based on their GDP: poor (dummy g1), middle-income (dummy g2), and rich (dummy g3). Outmigration rate, temperature, and precipitation are all in natural logs.
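Written out, the two specifications described here would look roughly as follows. This is a sketch under assumptions: the symbols ($\alpha_i$ for province effects, $\gamma_t$ for period effects, $\beta$, $\delta_k$, $\theta_k$ for slopes) are notation introduced for illustration, rich provinces (g3) are taken as the reference group, and the income dummies are written as time-invariant ($g_{k,i}$), which may not match the data.

```latex
\begin{align}
\ln \mathit{omr}_{it} &= \beta_1 \ln \mathit{temp}_{it} + \beta_2 \ln \mathit{precip}_{it}
  + \alpha_i + \gamma_t + \varepsilon_{it} \\
\ln \mathit{omr}_{it} &= \beta_1 \ln \mathit{temp}_{it} + \beta_2 \ln \mathit{precip}_{it}
  + \sum_{k=1}^{2} g_{k,i}\left(\delta_k \ln \mathit{temp}_{it} + \theta_k \ln \mathit{precip}_{it}\right)
  + \alpha_i + \gamma_t + \varepsilon_{it}
\end{align}
```

In the second equation, $\beta_1$ is the temperature elasticity for rich provinces and $\beta_1 + \delta_k$ the elasticity for income group $k$.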
I ran two regressions, one without and one with the income-level dummies (the two reghdfe commands given in the code below). Both R^2 values came out at around 0.85 to 0.9. Are they too high? What could have gone wrong, and what method should I use to fix it? (I suspect it may be because of common time trends in temperature, precipitation, and the outmigration rate, but I don't know how to test for that.)
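One thing that may be relevant here: the reghdfe output in this post reports R-squared = 0.8951 next to Within R-sq. = 0.1293, which suggests that most of the overall fit comes from the absorbed province and period effects rather than from the climate variables. The period effects in a(province year) also already absorb any shock common to all provinces in a given period. A minimal sketch for printing the two R^2 values side by side for the baseline regression, assuming reghdfe is installed and the variables are named as in this post:

```stata
* Baseline from Question 1; province and period effects are absorbed
reghdfe lnomr lntemp lnprecip, a(province year) cl(province)

* Overall R2 includes variation explained by the absorbed fixed effects;
* within R2 is the share explained by lntemp and lnprecip after absorbing them
display "Overall R2 = " %6.4f e(r2)
display "Within R2  = " %6.4f e(r2_within)
```

If the within R^2 is small, a high overall R^2 is not in itself a sign that something went wrong.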
Question 2:
Furthermore, I am testing whether a province that depends more on agriculture is more likely to be affected by temperature and precipitation. Thus, I run the regression given in the code below (the reghdfe command with the agri interactions).
Here, agri is a dummy that equals one if a province is defined as agriculture-dependent. I expect the coefficient on c.lntemp#i.g1 to give the temperature effect for poor provinces that are not agriculture-dependent, and the coefficient on c.lntemp#i.g1#i.agri to give the additional effect of temperature on outmigration for poor provinces that are also agriculture-dependent, and so on for the other income groups and for precipitation. However, I get the result shown in the output below, with many terms omitted because of collinearity.
Why is there collinearity here? I haven't a clue. What's more, how can I get Stata to show the result only for 1.g1#c.lntemp and not for 0.g1#c.lntemp when I export the results with esttab? Currently both are reported, which becomes annoying because there are many rows to delete when the regression contains lots of dummies.
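A possible reading of the collinearity notes, offered as a sketch rather than a diagnosis: each i.gK term expands into a 0 level and a 1 level, and for any dummy the 0 and 1 pieces interacted with lntemp add up to lntemp itself. Requesting c.lntemp#i.g1, c.lntemp#i.g2, and c.lntemp#i.g3 together therefore asks for six columns that span only three dimensions, so Stata has to drop the rest. The version below uses a single categorical variable instead. The name incgrp is hypothetical, and the code assumes g1, g2, and g3 are mutually exclusive 0/1 dummies and that reghdfe and esttab are installed.

```stata
* ASSUMPTION: g1, g2, g3 are mutually exclusive 0/1 dummies, so one
* categorical variable (hypothetical name: incgrp) can replace them
generate byte incgrp = 1*g1 + 2*g2 + 3*g3

* One temperature and one precipitation slope per income group, plus an
* extra slope for agriculture-dependent provinces within each group
reghdfe lnomr i.incgrp#c.lntemp   i.incgrp#1.agri#c.lntemp   ///
              i.incgrp#c.lnprecip i.incgrp#1.agri#c.lnprecip, ///
        a(province year) cl(province)

* Ask esttab to hide rows for omitted terms and base levels
esttab, se noomitted nobaselevels
```

With this layout, 1.incgrp#c.lntemp would be the temperature effect for poor provinces that are not agriculture-dependent, and 1.incgrp#1.agri#c.lntemp the additional effect for poor, agriculture-dependent provinces. The output in this post marks the rich and agriculture-dependent cell as empty, so that term would still be flagged as empty.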
Could anyone help me with these two questions, please? I would really appreciate it!!
Code (Question 1):
reghdfe lnomr lntemp lnprecip, a(province year) cl(province)
reghdfe lnomr lntemp c.lntemp#i.g1 c.lntemp#i.g2 lnprecip c.lnprecip#i.g1 c.lnprecip#i.g2, a(province year) cl(province)
Code (Question 2):
reghdfe lnomr ///
    c.lntemp#i.g1 c.lntemp#i.g1#i.agri ///
    c.lntemp#i.g2 c.lntemp#i.g2#i.agri ///
    c.lntemp#i.g3 c.lntemp#i.g3#i.agri ///
    c.lnprecip#i.g1 c.lnprecip#i.g1#i.agri ///
    c.lnprecip#i.g2 c.lnprecip#i.g2#i.agri ///
    c.lnprecip#i.g3 c.lnprecip#g3#i.agri, ///
    a(province year) cl(province)
Output (Question 2):
note: 1.g2#c.lntemp omitted because of collinearity
note: 0b.g2#0b.agri#co.lntemp omitted because of collinearity
note: 1o.g2#0b.agri#co.lntemp omitted because of collinearity
note: 0b.g3#c.lntemp omitted because of collinearity
note: 1.g3#c.lntemp omitted because of collinearity
note: 0b.g3#0b.agri#co.lntemp omitted because of collinearity
note: 1.g2#c.lnprecip omitted because of collinearity
note: 0b.g2#0b.agri#co.lnprecip omitted because of collinearity
note: 1o.g2#0b.agri#co.lnprecip omitted because of collinearity
note: 0b.g3#c.lnprecip omitted because of collinearity
note: 1.g3#c.lnprecip omitted because of collinearity
note: 0b.g3#0b.agri#co.lnprecip omitted because of collinearity

HDFE Linear regression                            Number of obs   =        185
Absorbing 2 HDFE groups                           F(  10,     30) =       8.83
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.8951
                                                  Adj R-squared   =     0.8612
                                                  Within R-sq.    =     0.1293
Number of clusters (province) =         31        Root MSE        =     0.2505

                                    (Std. err. adjusted for 31 clusters in province)
------------------------------------------------------------------------------------
                   |               Robust
             lnomr | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
       g1#c.lntemp |
                0  |  -7.011179   6.152577    -1.14   0.263    -19.57642     5.55406
                1  |  -7.021618   8.863487    -0.79   0.434    -25.12327    11.08004
                   |
  g1#agri#c.lntemp |
              0 1  |  -.1939885   .3708059    -0.52   0.605    -.9512751    .5632981
              1 1  |  -2.915903   .4769751    -6.11   0.000    -3.890016    -1.94179
                   |
       g2#c.lntemp |
                0  |   8.655452   6.368943     1.36   0.184    -4.351666    21.66257
                1  |          0  (omitted)
                   |
  g2#agri#c.lntemp |
              0 1  |          0  (omitted)
              1 1  |          0  (omitted)
                   |
       g3#c.lntemp |
                0  |          0  (omitted)
                1  |          0  (omitted)
                   |
  g3#agri#c.lntemp |
              0 1  |          0  (omitted)
              1 1  |          0  (empty)
                   |
     g1#c.lnprecip |
                0  |    .066395   .5887866     0.11   0.911    -1.136068    1.268858
                1  |  -3.429277   1.359237    -2.52   0.017    -6.205209   -.6533449
                   |
g1#agri#c.lnprecip |
              0 1  |    .139488   .2353011     0.59   0.558    -.3410609    .6200369
              1 1  |   1.711133   .2695761     6.35   0.000     1.160586    2.261681
                   |
     g2#c.lnprecip |
                0  |   1.649785   .7958186     2.07   0.047     .0245069    3.275064
                1  |          0  (omitted)
                   |
g2#agri#c.lnprecip |
              0 1  |          0  (omitted)
              1 1  |          0  (omitted)
                   |
     g3#c.lnprecip |
                0  |          0  (omitted)
                1  |          0  (omitted)
                   |
g3#agri#c.lnprecip |
              0 1  |          0  (omitted)
              1 1  |          0  (empty)
                   |
             _cons |   7.405652    24.7511     0.30   0.767    -43.14283    57.95414
------------------------------------------------------------------------------------