
  • First-difference Estimator Panel data

    I have a panel data set for 31 Chinese provinces with six records per province, one for each 5-year period (1990-1995, 1995-2000, 2000-2005, 2005-2010, 2010-2015, 2015-2020). Each record includes the log of temperature, the log of precipitation, and the log of the outmigration rate.

    I would like to estimate a first-difference equation to see if the coefficients are consistent with what I got in the fixed effects model, which is:
    Code:
    eststo: reghdfe lnomr lntemp lnprecip, a(province year) cl(province)

    I tried the code:
    Code:
    reghdfe s5.(lnomr lntemp lnprecip), cl(province)
    but the coefficients do not appear to be the same. I think that is because I didn't include the differences of the time fixed effects.

    It might be a simple question, but it's bogging me down. Any ideas about how I can include them to fix my code?

    Thank you in advance.

  • #2
    reghdfe is from https://github.com/sergiocorreia/reghdfe (FAQ Advice #12). The difference operator is "D"; see:

    Code:
    help tsvarlist
    Time effects can be included as indicators in a linear model, but your specification is confusing. If the time variable represents 5-year ranges, why are you using the year as the time variable? You should first aggregate the data if you haven't done so. Indicators are entered in the FD regression as they are; there is no special treatment for them.



    • #3
      Thank you for replying! In my dataset, I write 1990 to stand for 1990-1995, 1995 to stand for 1995-2000, and so on. Therefore, the year variable in my dataset takes the values 1990, 1995, 2000, 2005, 2010 and 2015. Because I xtset province year, Stata treats year as the time variable, but with gaps: it thinks 1990 = n, 1991 = n+1, 1992 = n+2, and so on. Therefore, to get the difference between consecutive periods, I need to use S5.



      • #4
        Originally posted by Kehan Yan
        Thank you for replying! In my dataset, I write 1990 to stand for 1990-1995, 1995 to stand for 1995-2000. Therefore, the year variable in my dataset is 1990, 1995, 2000, 2005, 2010 and 2015.
        That would work with S5.var, although it is less complicated to create a period variable:

        Code:
        egen period= group(year)
        xtset province period
        or specify the units in xtset

        Code:
        xtset province year, delta(5)
        and then use "D.".
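
        The equivalence between S5. under a yearly xtset and "D." under delta(5) can be checked on a toy panel. This is only a sketch: the data are simulated noise, and the names s5_lnomr and d_lnomr are made up for the illustration.

        ```stata
        * Toy panel: 31 provinces, years 1990, 1995, ..., 2015 (simulated values)
        clear
        set seed 1
        set obs 31
        generate province = _n
        expand 6
        bysort province: generate year = 1990 + 5*(_n - 1)
        generate lnomr = rnormal()

        * Yearly time unit: consecutive records are 5 units apart, so use S5.
        xtset province year
        generate s5_lnomr = S5.lnomr

        * Declare the 5-year spacing: D. now differences consecutive records
        xtset province year, delta(5)
        generate d_lnomr = D.lnomr

        * Both versions are missing in the same rows and equal everywhere else
        assert missing(s5_lnomr) == missing(d_lnomr)
        assert s5_lnomr == d_lnomr if !missing(s5_lnomr, d_lnomr)
        ```

        If both assertions pass silently, the two ways of declaring the panel produce the same first differences.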
        Last edited by Andrew Musau; 22 Apr 2024, 17:22.



        • #5
          Thanks! I changed xtset using
          Code:
            
           xtset province year, delta(5)
          Here is the fixed effects model I ran:
          Code:
           eststo: reghdfe lnomr lntemp lnprecip, a(province year) cl(province)
          The result is:
          Code:
          HDFE Linear regression                            Number of obs   =        185
          Absorbing 2 HDFE groups                           F(   2,     30) =       0.38
          Statistics robust to heteroskedasticity           Prob > F        =     0.6854
                                                            R-squared       =     0.8801
                                                            Adj R-squared   =     0.8499
                                                            Within R-sq.    =     0.0042
          Number of clusters (province) =         31        Root MSE        =     0.2605
          
                                        (Std. err. adjusted for 31 clusters in province)
          ------------------------------------------------------------------------------
                       |               Robust
                 lnomr | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                lntemp |  -2.234302   5.415106    -0.41   0.683    -13.29342    8.824819
              lnprecip |   .2853823   .5567488     0.51   0.612    -.8516504    1.422415
                 _cons |   3.254813   23.32335     0.14   0.890    -44.37782    50.88745
          ------------------------------------------------------------------------------
          And here is the FD equation:
          Code:
           eststo: reghdfe D.(lnomr lntemp lnprecip), a( year) cl(province)
          Code:
          HDFE Linear regression                            Number of obs   =        154
          Absorbing 1 HDFE group                            F(   2,     30) =       0.79
          Statistics robust to heteroskedasticity           Prob > F        =     0.4644
                                                            R-squared       =     0.5648
                                                            Adj R-squared   =     0.5471
                                                            Within R-sq.    =     0.0144
          Number of clusters (province) =         31        Root MSE        =     0.2855
          
                                        (Std. err. adjusted for 31 clusters in province)
          ------------------------------------------------------------------------------
                       |               Robust
               D.lnomr | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                lntemp |
                   D1. |   4.950594   3.948599     1.25   0.220    -3.113519    13.01471
                       |
              lnprecip |
                   D1. |   .2227122   .4386444     0.51   0.615    -.6731192    1.118544
                       |
                 _cons |   .2335108   .0236466     9.88   0.000      .185218    .2818035
          ------------------------------------------------------------------------------
          The results are not the same. What might be wrong with my FD code?
          Last edited by Kehan Yan; 22 Apr 2024, 17:37.



          • #6
            Technically, you should absorb the differenced time indicators and not the indicators themselves, but I think this does not matter (the coefficients on the time-varying variables stay the same). For the former:

            Code:
            xi: reghdfe D.(lnomr lntemp lnprecip i.year), nocons noabsorb cl(province)
            Because your dataset is unbalanced, the FD estimator may be inefficient: differencing loses additional observations wherever a value is missing (154 observations in the FD regression versus 185 in the FE regression in #5). With T=2, FD OLS and FE yield the same coefficients. For T>2 the point estimates generally differ, but with a sufficiently large dataset the results should align qualitatively. Note that in neither the FE nor the FD regression in #5 do you reject the hypothesis that the slope coefficients are jointly zero (Prob > F of 0.6854 and 0.4644, respectively).
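
            Both claims above (differenced versus undifferenced time indicators, and FD equal to FE when T=2) can be checked on simulated data. The sketch below uses a made-up balanced panel; the variable names y, x, alpha, period and p1-p6 are hypothetical, and the built-in regress and xtreg commands are used instead of reghdfe so that nothing needs to be installed.

            ```stata
            * Simulated balanced panel: 31 units, 6 periods (all values are made up)
            clear
            set seed 2024
            set obs 31
            generate province = _n
            generate alpha = rnormal()              // unit fixed effect
            expand 6
            bysort province: generate period = _n
            generate x = alpha + rnormal()
            generate y = 0.5*x + alpha + 0.2*period + rnormal()
            xtset province period

            * (a) FD with differenced period indicators and no constant
            tabulate period, generate(p)            // creates p1-p6
            regress D.(y x p2 p3 p4 p5 p6), noconstant vce(cluster province)
            scalar b_dind = _b[D.x]

            * (b) FD with the period indicators entered as they are
            regress D.(y x) i.period, vce(cluster province)
            scalar b_ind = _b[D.x]
            assert reldif(b_dind, b_ind) < 1e-6     // same slope on D.x

            * (c) Keep two periods only: FE and FD slopes coincide
            keep if period <= 2
            xtreg y x i.period, fe
            scalar b_fe = _b[x]
            regress D.(y x)
            scalar b_fd = _b[D.x]
            assert reldif(b_fe, b_fd) < 1e-6
            ```

            In (a) and (b) the two sets of indicators span the same space on the differenced sample, which is why the slope on D.x does not change. In (c) the constant of the FD regression plays the role of the period-2 effect.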
            Last edited by Andrew Musau; 22 Apr 2024, 18:32.



            • #7
              Yes, this is exactly what I am looking for!

              Then I tried to write a similar equation for this FE model:
              Code:
              reghdfe lnomr c.(lntemp lnprecip)##ib3.rank1, a(province year) cl(province)
              I tried to write the FD version compactly:
              Code:
               reghdfe D.lnomr (D.lntemp D.lnprecip )##ib3.rank1 , nocons a(year) cl(province)
              or by generating the difference variables first:
              Code:
              reghdfe dlnomr dlntemp#ib3.rank1 dlnprecip#ib3.rank1 , nocons a(year) cl(province)
              In both equations, I absorb year because I don't know how to include the differenced time indicators in this case (although I would still prefer to include them). Both attempts return error messages such as:
              invalid interaction specification;
              the 'D' operator is not allowed with factor variables

              How can I fix the code?



              • #8
                You can create the differenced variables manually and use factor variable notation, or look at how to include interactions using the -xi:- prefix.

                Code:
                help xi
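
                The first option, manual differencing, might look like the sketch below. It assumes the dataset from this thread is in memory and that rank1 does not vary over time within a province (the thread does not say either way). The "c." prefix is what tells Stata that the differenced regressors are continuous inside the interaction; without it, variables in an interaction are treated as categorical.

                ```stata
                * Assumption: thread's data in memory; rank1 is constant within province
                xtset province year, delta(5)

                * Difference by hand so that no D. operator appears in the interaction
                generate dlnomr    = D.lnomr
                generate dlntemp   = D.lntemp
                generate dlnprecip = D.lnprecip

                * c. marks the differenced regressors as continuous
                reghdfe dlnomr c.(dlntemp dlnprecip)##ib3.rank1, absorb(year) vce(cluster province)
                ```

                Note that the ## expansion also adds the rank1 main effects, which in a differenced equation act as rank-specific drifts. Whether to keep them is a modelling choice that this thread does not settle.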



                • #9
                  Sure I will, thank you!
