  • Steps to run an unbalanced panel data regression in Stata?

    Hi everyone,
    I'm a newbie. Please share some tips for running a panel data regression. My data set is an unbalanced panel, N = 1,497, T = 5 (2017-2021). I usually proceed as follows:
    1/ Run the pooled model: reg y x1 x2...
    2/ Check multicollinearity for the pooled model: vif
    3/ Check heteroskedasticity for the pooled model: imtest, white
    4/ Check serial correlation for the pooled model: xtserial y x1 x2...
    If the pooled model shows heteroskedasticity and serial correlation, we have to choose between FEM and REM:
    5/ Choose FEM or REM:
    xtreg, fe --> est sto fe
    xtreg, re --> est sto re
    hausman fe re
    6/ Check heteroskedasticity for FEM with xttest3 or for REM with xttest0
    7/ Check serial correlation for FEM or REM with xtserial y x1 x2 ...
    My questions are:
    Q1/ Is my process correct?
    Q2/ If there is heteroskedasticity and serial correlation in FEM or REM, how can I correct for them in my unbalanced panel?
    Thanks so so much.
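
    For reference, the steps listed in this post can be written out in full Stata syntax roughly as follows. This is a minimal sketch, not a recommended workflow: y, x1 and x2 are the placeholder names from the post, while id and year are hypothetical names for the panel and time identifiers. estat vif and estat imtest, white are the current postestimation forms of the abbreviated vif and imtest commands.

    ```stata
    * Assumed names: y, x1, x2 come from the post; id and year are
    * hypothetical panel and time identifiers.
    xtset id year

    * 1/ Pooled OLS
    regress y x1 x2

    * 2/ Variance inflation factors (postestimation, after regress)
    estat vif

    * 3/ White's test for heteroskedasticity (postestimation, after regress)
    estat imtest, white

    * 4/ Wooldridge test for first-order serial correlation in panel data
    *    (community-contributed; locate it with: search xtserial)
    xtserial y x1 x2

    * 5/ Fixed vs. random effects with default standard errors
    xtreg y x1 x2, fe
    estimates store fe
    xtreg y x1 x2, re
    estimates store re
    hausman fe re
    ```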

  • #2
    Piano:
    welcome to this forum.
    Your pre-takeoff checklist should be tweaked a bit (whether the panel is balanced or not does not matter; forget this issue, as it is immaterial for Stata):
    1) Choose FEM or REM:
    a) xtreg, fe --> est sto fe
    b) xtreg, re --> est sto re
    2) Check heteroskedasticity for FEM with -xttest3- or for REM with -xttest0-;
    3) Check serial correlation for FEM or REM with -xtserial y x1 x2 ...-;
    4) if the tests are negative: go -hausman-;
    5) if the tests are positive for either nuisance: invoke the -robust- or -vce(cluster panelid)- option (they are equivalent under -xtreg-);
    6) re-run -fe- and -re-;
    7) check -xtreg,re- via the community-contributed module -xtoverid-.
    Kind regards,
    Carlo
    (Stata 19.0)
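
    The checklist in this reply can be sketched as a short do-file. It is only an illustration under stated assumptions: id and year are hypothetical names for the panel and time identifiers, and -xttest3- and -xtserial- are community-contributed commands that must be installed first. Note that official Stata documents -xttest0- as the Breusch-Pagan Lagrange multiplier test for random effects, so it is labeled that way in the comments.

    ```stata
    * Assumed names: id and year are hypothetical panel and time identifiers.
    xtset id year

    * 1a) Fixed effects, stored for later comparison
    xtreg y x1 x2, fe
    estimates store fe

    * 2) Modified Wald test for groupwise heteroskedasticity;
    *    run directly after -xtreg, fe- (install: ssc install xttest3)
    xttest3

    * 1b) Random effects, stored for later comparison
    xtreg y x1 x2, re
    estimates store re

    * 2) Breusch-Pagan LM test for random effects; run directly after -xtreg, re-
    xttest0

    * 3) Wooldridge test for first-order serial correlation
    *    (locate the package with: search xtserial)
    xtserial y x1 x2

    * 4) Only if the tests above do not flag a problem:
    *    Hausman test on the stored default-standard-error fits
    hausman fe re
    ```

    If any of the tests does flag a problem, steps 5) to 7) of the checklist apply instead of -hausman-.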


    • #3
      Hi Carlo, thanks so much for your response. I am so sorry if my questions are basic, but I get confused a lot.
      Q1/ Is there a reason why pooled OLS is unnecessary as a first step with panel data? So many instructors tell us to run OLS first, then FEM, then REM...
      Q2/ If I follow your guidance, can I use the result of step 5 (xtreg y x1 x2..., robust fe) in my report, or do I have to check or run something more? Please guide me: I have already checked heteroskedasticity and serial correlation in steps 2 and 3, so I don't understand what to do next.




      • #4
        Piano:
        1) My guess is that POLS is the tool that bridges the cross-sectional and the panel realms (and, under given conditions, POLS is still a valid estimator for panel data regression);
        2) if you detect heteroskedasticity and/or autocorrelation in the idiosyncratic residuals, just invoke the -robust- or -vce(cluster clusterid)- option, so that the standard errors take this/these nuisance(s) into account;
        3) non-default standard errors require you to switch from -hausman- to the community-contributed module -xtoverid- to choose between -fe- and -re-.
        Kind regards,
        Carlo
        (Stata 19.0)
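
        As a rough illustration of points 2) and 3), the re-run with cluster-robust standard errors and the follow-up -xtoverid- call might look like the sketch below. The identifier name id is hypothetical, and the installation lines are left as comments because -xtoverid- is distributed through SSC and may also need the -ivreg2- and -ranktest- packages.

        ```stata
        * Assumed name: id is a hypothetical panel identifier; data already -xtset-.
        * One-time installation (uncomment if needed):
        * ssc install xtoverid
        * ssc install ivreg2
        * ssc install ranktest

        * Pooled OLS with standard errors clustered on the panel identifier
        regress y x1 x2, vce(cluster id)

        * Fixed effects with cluster-robust standard errors
        xtreg y x1 x2, fe vce(cluster id)

        * Random effects with cluster-robust standard errors
        xtreg y x1 x2, re vce(cluster id)

        * Overidentification test of -re- against -fe-;
        * run directly after the clustered -xtreg, re- fit
        xtoverid
        ```

        A rejection of the null in -xtoverid- is usually read as evidence in favor of -fe-, in the same spirit as a significant -hausman- result.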



        • #5
          Thanks so much, Carlo; things are much clearer now. I will follow your guidance and read up further.
          Sincere thanks; your contributions to the community are splendid.

