
  • #16
    Shelly:
    please note that Carlo is enough for me. Thanks.
    That said:
    1) have you already checked the collinearity of your categorical variables via -estat vce,corr- after -xtreg,re-?
    2) you can check your regression for functional form misspecification (which, under more general conditions, can be read as a test of model misspecification at large) following an approach similar to the one detailed in the -linktest- entry, Stata .pdf manual:
    Code:
    use "https://www.stata-press.com/data/r16/nlswork.dta"
    . xtreg ln_wage c.age##c.age, re
    
    Random-effects GLS regression                   Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-sq:                                           Obs per group:
         within  = 0.1087                                         min =          1
         between = 0.1015                                         avg =        6.1
         overall = 0.0870                                         max =         15
    
                                                    Wald chi2(2)      =    3388.51
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0590339   .0027172    21.73   0.000     .0537083    .0643596
                 |
     c.age#c.age |  -.0006758   .0000451   -15.00   0.000    -.0007641   -.0005876
                 |
           _cons |   .5479714   .0397476    13.79   0.000     .4700675    .6258752
    -------------+----------------------------------------------------------------
         sigma_u |   .3654049
         sigma_e |  .30245467
             rho |  .59342665   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . predict fitted, xb
    (24 missing values generated)
    
    . gen sq_fitted=fitted^2
    (24 missing values generated)
    
    *Augmented regression*
    
    . xtreg ln_wage c.age##c.age fitted sq_fitted , re
    note: c.age#c.age omitted because of collinearity
    
    Random-effects GLS regression                   Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-sq:                                           Obs per group:
         within  = 0.1105                                         min =          1
         between = 0.1039                                         avg =        6.1
         overall = 0.0888                                         max =         15
    
                                                    Wald chi2(3)      =    3459.51
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0166047   .0024441     6.79   0.000     .0118144     .021395
                 |
     c.age#c.age |          0  (omitted)
                 |
          fitted |   6.745315   .7234634     9.32   0.000     5.327352    8.163277
       sq_fitted |  -2.009945   .2520254    -7.98   0.000    -2.503906   -1.515985
           _cons |  -4.445486   .5624869    -7.90   0.000     -5.54794   -3.343032
    -------------+----------------------------------------------------------------
         sigma_u |  .36492262
         sigma_e |  .30215307
             rho |  .59327076   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    *Ancillary regression*
    
    . xtreg ln_wage fitted sq_fitted , re
    
    Random-effects GLS regression                   Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-sq:                                           Obs per group:
         within  = 0.1088                                         min =          1
         between = 0.1045                                         avg =        6.1
         overall = 0.0887                                         max =         15
    
                                                    Wald chi2(2)      =    3407.81
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          fitted |   2.805959   .4327827     6.48   0.000      1.95772    3.654197
       sq_fitted |  -.5516341   .1320951    -4.18   0.000    -.8105358   -.2927324
           _cons |  -1.468083   .3527217    -4.16   0.000    -2.159405   -.7767613
    -------------+----------------------------------------------------------------
         sigma_u |  .36481589
         sigma_e |  .30242516
             rho |  .59269507   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    .
    As the coefficient on sq_fitted is statistically significant under both the augmented and the ancillary regression, the model is misspecified (and deliberately so).
    Kind regards,
    Carlo
    (StataNow 18.5)
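
The ancillary regression from #16 can be condensed into a short do-file. This is only a sketch of that -linktest--style check, not a built-in command for -xtreg-: the -double- storage type, the -if e(sample)- restriction and the closing Wald -test- are additions for illustration and are not part of the original log.

```stata
* Sketch: linktest-style check after -xtreg, re-, using the dataset from #16
use "https://www.stata-press.com/data/r16/nlswork.dta", clear

* Step 1: fit the model of interest
xtreg ln_wage c.age##c.age, re

* Step 2: linear prediction and its square, on the estimation sample only
predict double fitted if e(sample), xb
generate double sq_fitted = fitted^2

* Step 3: ancillary regression on the prediction and its square
xtreg ln_wage fitted sq_fitted, re

* Step 4: Wald test of the squared term (assumed addition, equivalent to
* reading the z statistic on sq_fitted in the output table)
test sq_fitted
```

A statistically significant sq_fitted is read, as in #16, as evidence of misspecification; a non-significant one does not prove that the model is correct.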



    • #17
      Carlo Lazzaro, it looks like there is misspecification in the model. I ran the commands suggested above, and sq_fitted came out significant in my model. I am sharing the final result here:
      [Attached image: reg.PNG]



      • #18
        Shelly:
        the result you reported is usually due to omitted predictors and/or omitted interactions among predictors that are needed to give a fair and true view of the data generating process you're investigating.
        In addition, have you already checked the collinearity of your categorical variables via -estat vce,corr- after -xtreg,re-? Collinearity could be a concomitant cause of the same issue.
        Kind regards,
        Carlo
        (StataNow 18.5)
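
A minimal sketch of the -estat vce, corr- check mentioned above, run on the nlswork dataset from #16. The categorical predictors i.race and i.msp are stand-ins chosen for illustration only; they are not the variables in Shelly's model.

```stata
* Sketch: correlation matrix of the coefficient estimates after -xtreg, re-
use "https://www.stata-press.com/data/r16/nlswork.dta", clear

* i.race and i.msp are illustrative categorical predictors (assumption)
xtreg ln_wage c.age##c.age i.race i.msp, re

* Show the variance-covariance matrix of the estimators as correlations
estat vce, corr
```

Note that -estat vce, corr- reports correlations among the estimated coefficients, not among the variables themselves; entries close to 1 in absolute value are the ones worth a closer look.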



        • #19
          Carlo Lazzaro Yes, I have already checked that.

          [Attached image: reg.PNG]



          • #20
            Should I try the PPML method? I ask because there will be endogeneity issues in this model.



            • #21
              Shelly:
              some correlations among the -intra- and -extra- prefixed predictors look high.
              I would investigate them a bit further to decide whether all of them should be included on the right-hand side of your regression equation.
              Kind regards,
              Carlo
              (StataNow 18.5)
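
One possible way to look further into highly correlated predictors is sketched below. Since the -intra- and -extra- prefixed variables are not available in this thread, age, ttl_exp and tenure from the nlswork dataset are used as stand-ins; the choice of which predictor to drop is likewise purely illustrative.

```stata
* Sketch: inspect pairwise correlations, then compare specifications
use "https://www.stata-press.com/data/r16/nlswork.dta", clear

* Pairwise correlations with number of observations and significance levels
pwcorr age ttl_exp tenure, obs sig

* Full specification with all (stand-in) predictors
xtreg ln_wage age ttl_exp tenure, re
estimates store full

* Reduced specification on the same observations, dropping one predictor
xtreg ln_wage age ttl_exp if !missing(tenure), re
estimates store reduced

* Side-by-side coefficients and standard errors
estimates table full reduced, b(%9.4f) se(%9.4f)
```

Large swings in coefficients or standard errors between the two columns suggest that the dropped predictor overlaps heavily with the retained ones; whether to keep it remains a substantive decision.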

