
  • Assessing negative dummies in panel data re

    Hi members!

    We are asking for help reanalyzing our panel data to assess the hypothetical impact of a Sovereign Wealth Fund (SWF) on Sweden's financial balance sheet from 1993 to 2022.
    Our study aims to evaluate the effect of the SWF on the Swedish government's balance sheet, which is our dependent variable, with the SWF as an independent variable.

    Our dataset includes three variations of the government's balance sheet:
    1. The "actual" balance sheet.
    2. The "actual balance sheet" plus a hypothetical SWF, assuming no SWF volatility.
    3. The "actual balance sheet" plus a hypothetical SWF, allowing SWF volatility.

    However, we're encountering issues with our findings. The coefficients on the dummy variables are negative, while the SWF variable shows a positive effect.
    Additionally, both sigma_u and rho are zero, which typically indicates a problem with the analysis.
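
    For readers unfamiliar with these two quantities, here is a minimal sketch of how they relate after -xtreg, re-. It uses Stata's Grunfeld example data as a stand-in (an assumption; it is not the dataset discussed in this thread). rho is the share of the total variance attributed to the panel-level effect, so a sigma_u of zero forces rho to zero as well.

    ```stata
    * Illustrative only: the Grunfeld data stand in for the poster's dataset
    webuse grunfeld, clear
    xtset company year
    xtreg invest mvalue kstock, re
    * rho = sigma_u^2 / (sigma_u^2 + sigma_e^2), rebuilt from the stored results
    display e(sigma_u)^2 / (e(sigma_u)^2 + e(sigma_e)^2)
    * The value that -xtreg- itself reports as rho
    display e(rho)
    ```

    The two displayed numbers should agree, which shows that rho carries no information beyond sigma_u and sigma_e.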

    Here is our Stata output from the regression:

    [Image: re GLS regression.png]


    We also did a check for multicollinearity:

    [Image: Vif test.png]


    Given the strong suggestion of multicollinearity from the dummy variables, we tried to drop these and test again.

    [Image: Vif test drop dummies.png]
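
    As a rough sketch of this kind of check, the code below compares variance inflation factors with and without a set of indicator variables. It uses Stata's Grunfeld example data and company indicators named d1 to d10; both are assumptions for illustration, since the screenshots do not show how the dummies in this thread were constructed.

    ```stata
    * Illustrative only: the Grunfeld data stand in for the poster's dataset
    webuse grunfeld, clear
    * Create one indicator per company (d1 ... d10); d1 is left out as the base
    tabulate company, generate(d)
    regress invest mvalue kstock d2-d10
    estat vif
    * The same check without the indicators
    regress invest mvalue kstock
    estat vif
    ```

    Comparing the two -estat vif- tables shows how much of the inflation comes from the indicators rather than from the substantive regressors.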



    We're seeking your opinion on these results.
    Do you believe there are fundamental errors, or might we be overlooking important considerations?
    Specifically, what are your thoughts on the dummy variables? Are they contributing meaningful insights, or are they merely complicating the analysis?


  • #2
    Joakim:
    two comments about your query:
    1) you seem to have no group-wise effect (sigma_u=0);
    2) you're dealing with a T>N panel dataset. I would drop -xtreg- and consider -xtregar- or -xtgls- instead.
    Kind regards,
    Carlo
    (StataNow 18.5)
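
    A minimal sketch of what the two suggested commands might look like, using Stata's Grunfeld example data (10 companies, 20 years, so T>N) as a stand-in. The dataset, the variables, and the chosen options are assumptions for illustration, not a recommendation for the data in this thread.

    ```stata
    * Illustrative only: the Grunfeld data stand in for the poster's dataset
    webuse grunfeld, clear
    xtset company year
    * Random-effects model with an AR(1) disturbance
    xtregar invest mvalue kstock, re
    * FGLS with panel-level heteroskedasticity and a common AR(1) coefficient
    xtgls invest mvalue kstock, panels(heteroskedastic) corr(ar1)
    ```

    The two commands assume different error structures, so their standard errors are not expected to match.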



    • #3
      Hey Carlo! Thanks for responding so quickly.

      We ran your suggestions and got the following results. What do you think?

      [Image: xtregar.jpg]

      [Image: xtgls.jpg]



      Sincerely
      Joakim Eriksson



      • #4
        Joakim:
        1) your models are not comparable because they assume different structures for the correlation of the error term, which affects the standard errors;
        2) you have too few observations to perform a reliable panel data regression (in addition, your first regression does not show any evidence of a group-wise effect). If you cannot increase your sample size, I would stick with a pooled OLS.
        Kind regards,
        Carlo
        (StataNow 18.5)
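
        One way to examine the evidence for a group-wise effect, sketched here on Stata's Grunfeld example data (an assumption; it is not the dataset in this thread), is the Breusch-Pagan Lagrange multiplier test that -xttest0- runs after -xtreg, re-. The pooled OLS on the same specification follows for comparison.

        ```stata
        * Illustrative only: the Grunfeld data stand in for the poster's dataset
        webuse grunfeld, clear
        xtset company year
        xtreg invest mvalue kstock, re
        * Breusch-Pagan LM test: H0 is Var(u) = 0, i.e. no panel-level effect
        xttest0
        * Pooled OLS on the same specification, standard errors clustered on the panel
        regress invest mvalue kstock, vce(cluster company)
        ```

        Failing to reject the null in -xttest0- is consistent with the sigma_u=0 reported in the first regression.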



        • #5
          Hey Carlo, thanks for the answer.

          When we run a pooled OLS without the dummies, we get this result. It does not look that good... What do you think?

          [Image: OLS pooled no dummies.jpg]



          We also ran a pooled OLS regression with the dummies, but that gave the same coefficients as the previous picture (xtgls).



          • #6
            Joakim:
            1) the reported OLS is not pooled; it is simply a cross-sectional OLS (a false one, because you have panel data, but Stata does not know it) that accounts for heteroskedasticity but not autocorrelation (-regress- and -xtreg- differ in this respect; see the related entries in the Stata .pdf manual for more details);
            2) the equivalence of the coefficients estimated by -regress- and -xtgls- makes sense (the standard errors differ, though; please also note that clustering with 3 units does not make any sense):
            Code:
            . use "https://www.stata-press.com/data/r18/nlswork.dta"
            (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
            
            . regress ln_wage c.age##c.age if idcode<=3
            
                  Source |       SS           df       MS      Number of obs   =        39
            -------------+----------------------------------   F(2, 36)        =      7.26
                   Model |  1.48752894         2  .743764471   Prob > F        =    0.0022
                Residual |  3.68821002        36  .102450278   R-squared       =    0.2874
            -------------+----------------------------------   Adj R-squared   =    0.2478
                   Total |  5.17573896        38  .136203657   Root MSE        =    .32008
            
            ------------------------------------------------------------------------------
                 ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     age |   .2270112   .0721084     3.15   0.003     .0807687    .3732538
                         |
             c.age#c.age |  -.0035377   .0012214    -2.90   0.006    -.0060149   -.0010605
                         |
                   _cons |  -1.688438   1.024512    -1.65   0.108    -3.766245    .3893681
            ------------------------------------------------------------------------------
            
            . xtgls ln_wage c.age##c.age if idcode<=3
            
            Cross-sectional time-series FGLS regression
            
            Coefficients:  generalized least squares
            Panels:        homoskedastic
            Correlation:   no autocorrelation
            
            Estimated covariances      =         1          Number of obs     =         39
            Estimated autocorrelations =         0          Number of groups  =          3
            Estimated coefficients     =         3          Obs per group:
                                                                          min =         12
                                                                          avg =         13
                                                                          max =         15
                                                            Wald chi2(2)      =      15.73
            Log likelihood             = -9.349405          Prob > chi2       =     0.0004
            
            ------------------------------------------------------------------------------
                 ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     age |   .2270112   .0692795     3.28   0.001      .091226    .3627965
                         |
             c.age#c.age |  -.0035377   .0011735    -3.01   0.003    -.0058377   -.0012377
                         |
                   _cons |  -1.688438   .9843193    -1.72   0.086    -3.617669    .2407919
            ------------------------------------------------------------------------------
            3) again, your main problem is the sample size, which is too limited and basically makes any inferential procedure unfeasible.
            Kind regards,
            Carlo
            (StataNow 18.5)
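
            As a side note on why the standard errors differ in the output above: by default -xtgls- divides the residual sum of squares by N, while -regress- divides by N-k. The sketch below reruns the same example and uses the -nmk- option of -xtgls- to request the N-k normalization; the final line simply rescales the -regress- standard error of age by hand.

            ```stata
            use "https://www.stata-press.com/data/r18/nlswork.dta", clear
            * regress divides the residual sum of squares by N-k (here 39-3 = 36)
            regress ln_wage c.age##c.age if idcode<=3
            * nmk asks xtgls to normalize by N-k instead of its default N
            xtgls ln_wage c.age##c.age if idcode<=3, nmk
            * Rescale the regress standard error of age by sqrt((N-k)/N)
            display .0721084*sqrt(36/39)
            ```

            With -nmk-, the -xtgls- standard errors should line up with those from -regress-, and the hand calculation should come out close to the .0692795 reported by the default -xtgls- run. The p-values can still differ slightly because -xtgls- reports z statistics and -regress- reports t statistics.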



            • #7
              Hey Carlo, thank you so much for your help.

              Did we run the pooled OLS correctly? What do you think?

              [Image: Pooled OLS .jpg]



              Is it feasible, or should we accept that this will not be possible because of our data limitations and abandon our analysis? Perhaps do a difference-in-differences analysis instead?

              Thank you so much.
              Sincerely
              Joakim



              • #8
                Joakim:
                1) your pooled OLS code is correct, but if the number of your panels is <30 you cannot safely cluster your standard errors (and this makes the pooled OLS unfeasible);
                2) therefore, if you cannot increase your sample size, you are better off sticking with descriptive statistics and explaining in your research report why your dataset does not allow you to perform inferential statistics.
                Kind regards,
                Carlo
                (StataNow 18.5)
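
                A quick way to see how many clusters a clustered regression actually relies on, sketched on Stata's Grunfeld example data (an assumption; it is not the dataset in this thread), is to inspect the stored result e(N_clust) after estimation.

                ```stata
                * Illustrative only: the Grunfeld data stand in for the poster's dataset
                webuse grunfeld, clear
                regress invest mvalue kstock, vce(cluster company)
                * e(N_clust) holds the number of clusters behind the standard errors
                display e(N_clust)
                ```

                The example data contain 10 companies, so the count here is also well below the threshold of 30 mentioned above.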
