  • Multicollinearity with time-variable

    Hello, I am running an OLS regression with fixed effects once again. As I have heterogeneity, I am using clustered standard errors with vce(cluster countrycode). I find a marginally significant relationship (p = 0.097; see results below).



    xtreg deltaG vdem, fe vce(cluster countrycode)

    Fixed-effects (within) regression               Number of obs     =      1,297
    Group variable: countrycode                     Number of groups  =         59

    R-squared:                                      Obs per group:
         Within  = 0.0154                                         min =         21
         Between = 0.2512                                         avg =       22.0
         Overall = 0.1797                                         max =         22

                                                    F(1,58)           =       2.84
    corr(u_i, Xb) = -0.5960                         Prob > F          =     0.0972

                               (Std. err. adjusted for 59 clusters in countrycode)
    ------------------------------------------------------------------------------
                 |               Robust
          deltaG | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            vdem |  -.0347145   .0205893    -1.69   0.097    -.0759285    .0064994
           _cons |   .1028449    .015274     6.73   0.000     .0722706    .1334192
    -------------+----------------------------------------------------------------
         sigma_u |  .03547219
         sigma_e |   .0144267
             rho |  .85806819   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------



    It just occurred to me that I should correct for any possible time trends, and I am wondering exactly how to do this. My first thought was simply to include my time variable in the regression. Doing so, I get the following results:



    xtreg deltaG vdem year, fe vce(cluster countrycode)

    Fixed-effects (within) regression               Number of obs     =      1,297
    Group variable: countrycode                     Number of groups  =         59

    R-squared:                                      Obs per group:
         Within  = 0.1466                                         min =         21
         Between = 0.2513                                         avg =       22.0
         Overall = 0.0010                                         max =         22

                                                    F(2,58)           =      13.30
    corr(u_i, Xb) = -0.2123                         Prob > F          =     0.0000

                               (Std. err. adjusted for 59 clusters in countrycode)
    ------------------------------------------------------------------------------
                 |               Robust
          deltaG | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            vdem |  -.0133449   .0198895    -0.67   0.505    -.0531581    .0264683
            year |   .0008292   .0001652     5.02   0.000     .0004984    .0011599
           _cons |  -1.580013   .3340496    -4.73   0.000    -2.248685   -.9113397
    -------------+----------------------------------------------------------------
         sigma_u |  .03329454
         sigma_e |  .01343645
             rho |  .85994639   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------



    Thus, the significance of vdem does not hold once I correct for time trends. I am a bit puzzled, however, and wonder whether this might be due to multicollinearity.
    I ran the following commands to obtain the variance inflation factors:



    reg deltaG vdem year i.countrycode, vce(cl countrycode)
    vif


    I am then presented with these results:

        Variable |       VIF       1/VIF
    -------------+----------------------
            vdem |     12.88    0.077612
            year |      1.04    0.95729




    It thus seems I do have multicollinearity with my time variable. How should I deal with this problem? I am perplexed, as I am fairly certain the specification is right.

    Thank you very much in advance!
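
    For reference, the VIF that -vif- reports for a predictor equals 1/(1 - R-squared), where the R-squared comes from regressing that predictor on all other right-hand-side variables, here year and the country dummies. A minimal sketch of that check, assuming the same dataset and variable names as in the commands above:
    Code:
    * auxiliary regression of vdem on the other regressors
    regress vdem year i.countrycode
    * VIF = 1/(1 - R-squared); this should be close to the 12.88 shown above
    display 1/(1 - e(r2))
    * the reported 1/VIF of 0.077612 corresponds to an auxiliary R-squared of
    display 1 - 0.077612
    Because the country dummies are among the other regressors, a high VIF for vdem may largely reflect that vdem varies little within countries rather than a strong link with year, whose own VIF is only 1.04.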

  • #2
    Jannik:
    a) a panel data regression with one predictor only is not informative (as your model is, in all likelihood, misspecified);
    b) usually, in -xtreg,fe- the time variable is included as a categorical predictor in the right-hand side of the regression equation;
    c) if, for any reason, you want to check if a non-linear relationship exists between -year- and your regressand, you should include a quadratic term, too:
    Code:
    xtreg deltaG vdem c.year##c.year, fe vce(cluster countrycode)
    d) instead of switching to -regress- you can check the correlation between the coefficients of your -xtreg,fe- via -estat vce, corr-.
    e) multicollinearity: see Chapter 23 in https://www.hup.harvard.edu/catalog....40&content=toc.
    Last edited by Carlo Lazzaro; 08 May 2023, 10:59.
    Kind regards,
    Carlo
    (Stata 19.0)
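
    Points b) and d) above can be sketched as follows. This is only an illustration under the assumption that the variable names from #1 (deltaG, vdem, year, countrycode) are used unchanged, not necessarily the final specification:
    Code:
    * b) year entered as a categorical predictor (one indicator per year)
    xtreg deltaG vdem i.year, fe vce(cluster countrycode)
    * joint test of the year indicators
    testparm i.year
    * d) correlation matrix of the estimated coefficients
    estat vce, corr
    Entering year as i.year lets each year have its own intercept shift instead of imposing a linear trend, and -estat vce, corr- reports the correlations among the coefficient estimates from the model just fitted.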


    • #3
      Thank you very much Carlo!
      I do realize that my coefficient of determination screams for the inclusion of more predictors, but right now I am merely giving a descriptive, step-by-step account of the statistical models and tests used throughout the analysis. I will include several more predictors afterwards, though I do not expect the end results to differ much.
      When you state that my model is in all likelihood misspecified, what do you mean? The model selected (xtreg, fe vce(cl panel)) should by all means be the correct approach to my data, as we discussed in the other thread, right?
      If by misspecification you mean that my independent variable in all likelihood has no predictive value for the measure of redistribution, I believe you are quite right. Though it might seem counterintuitive, that is actually what I was expecting, and it is merely a step before I proceed with other measures in my analysis.


      • #4
        Jannik:
        to be informative, each regression model should give a fair and true view of the data generating process.
        A step-by-step procedure is, in my opinion, less satisfying, unless you're exploring something totally new.
        That said, I meant, as you surmise, that your model needs more predictors.
        Kind regards,
        Carlo
        (Stata 19.0)
