  • Testing linear relationship in panel model (after RESET)

    Hello everyone,

    I am performing a panel regression analysis and would like to know whether my model is correctly specified. As a first step, I manually replicated the Ramsey RESET test, which gave Prob > chi2 = 0.0000, indicating that the model is misspecified. Here is my question: once we have that result, is there a way to test, for all variables simultaneously, whether a variable should also enter as a quadratic term? I searched previous posts, which suggested using plots to get clues. In my case, however, the plots are not very informative. I attach the code below in case it is useful.

    Thanks in advance to anyone who is willing to help.
    Best regards


    Code:
    . xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated_ph population
    >  population_density median_age aged_65_older cardiovasc_death_rate diabetes_prevalence hosp
    > ital_beds_per_thousand life_expectancy human_development_index gdp_per_capita health_exp_pe
    > rcap urbanization_share internet_users air_passengers smokers_share, re vce(cluster country
    > )
    
    Random-effects GLS regression                   Number of obs     =      1,208
    Group variable: n_country                       Number of groups  =        102
    
    R-sq:                                           Obs per group:
         within  = 0.1866                                         min =          1
         between = 0.4488                                         avg =       11.8
         overall = 0.2671                                         max =         15
    
                                                    Wald chi2(31)     =     493.89
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
                                                (Std. Err. adjusted for 102 clusters in country)
    --------------------------------------------------------------------------------------------
                               |               Robust
         new_cases_per_million |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------------------+----------------------------------------------------------------
                           dt2 |  -75.12366   18.47727    -4.07   0.000    -111.3384   -38.90888
                           dt3 |  -60.00754   14.43205    -4.16   0.000    -88.29383   -31.72125
                           dt4 |   -71.3764   16.46268    -4.34   0.000    -103.6427   -39.11014
                           dt5 |  -66.31478   18.02942    -3.68   0.000    -101.6518   -30.97777
                           dt6 |  -42.08454   21.47367    -1.96   0.050    -84.17216    .0030893
                           dt7 |  -38.02067   20.46581    -1.86   0.063    -78.13291    2.091574
                           dt8 |  -27.06406   22.18253    -1.22   0.222    -70.54102    16.41291
                           dt9 |   59.42157   46.18094     1.29   0.198    -31.09141    149.9346
                          dt10 |   97.52294   44.70662     2.18   0.029     9.899574    185.1463
                          dt11 |   62.48241    30.8272     2.03   0.043     2.062212    122.9026
                          dt12 |    53.4497   31.68452     1.69   0.092    -8.650812    115.5502
                          dt13 |  -26.26253   20.36673    -1.29   0.197    -66.18057    13.65552
                          dt14 |  -47.72762   23.85059    -2.00   0.045    -94.47393   -.9813179
                          dt15 |   -36.0582   40.89168    -0.88   0.378    -116.2044    44.08802
        new_tests_per_thousand |   8.928846   2.758016     3.24   0.001     3.523234    14.33446
          people_vaccinated_ph |   4.495295    1.17263     3.83   0.000     2.196983    6.793607
                    population |  -3.01e-08   3.08e-08    -0.98   0.329    -9.05e-08    3.04e-08
            population_density |  -.0182255   .0055512    -3.28   0.001    -.0291057   -.0073453
                    median_age |   3.165761   3.813629     0.83   0.406    -4.308814    10.64034
                 aged_65_older |  -.8097854   4.401816    -0.18   0.854    -9.437185    7.817615
         cardiovasc_death_rate |   .0048464    .085041     0.06   0.955    -.1618308    .1715237
           diabetes_prevalence |   2.435293   3.213194     0.76   0.449     -3.86245    8.733037
    hospital_beds_per_thousand |  -4.038767   5.010392    -0.81   0.420    -13.85896    5.781422
               life_expectancy |  -.6207185   2.217349    -0.28   0.780    -4.966643    3.725206
       human_development_index |    1.88439   216.4545     0.01   0.993    -422.3585    426.1273
                gdp_per_capita |   .0001658   .0006376     0.26   0.795     -.001084    .0014155
             health_exp_percap |   .0030412    .007868     0.39   0.699    -.0123798    .0184623
            urbanization_share |   1.009087   .5140009     1.96   0.050     .0016637     2.01651
                internet_users |   -.260454   .8586334    -0.30   0.762    -1.943344    1.422437
                air_passengers |  -3.227438   2.171105    -1.49   0.137    -7.482726     1.02785
                 smokers_share |   2.625966   1.762373     1.49   0.136    -.8282209    6.080154
                         _cons |  -70.41636   125.5306    -0.56   0.575    -316.4519    175.6192
    ---------------------------+----------------------------------------------------------------
                       sigma_u |  62.525459
                       sigma_e |   127.3509
                           rho |  .19423163   (fraction of variance due to u_i)
    --------------------------------------------------------------------------------------------
    
    .
    . quietly predict y_hat,xbu
    
    .
    . quietly gen y_h_2=y_hat*y_hat
    
    . quietly gen y_h_3=y_h_2*y_hat
    
    . quietly gen y_h_4=y_h_3*y_hat
    
    .
    . quietly xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated_ph po
    > pulation population_density median_age aged_65_older cardiovasc_death_rate diabetes_prevale
    > nce hospital_beds_per_thousand life_expectancy human_development_index gdp_per_capita healt
    > h_exp_percap urbanization_share internet_users air_passengers smokers_share y_h_2 y_h_3 y_h
    > _4, re vce(cluster country)
    
    .
    . test y_h_2 y_h_3 y_h_4
    
     ( 1)  y_h_2 = 0
     ( 2)  y_h_3 = 0
     ( 3)  y_h_4 = 0
    
               chi2(  3) =  527.79
             Prob > chi2 =    0.0000
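
    One possible way to test several quadratic terms at once is a joint Wald test after adding the squares to the model. The following is only a minimal sketch, not the poster's method: the local macro `cont` holds a hypothetical, shortened list of continuous regressors, and the `_sq` suffix is an assumed naming convention.

    ```stata
    * Sketch only: `cont' is an assumed, shortened list of continuous regressors
    local cont new_tests_per_thousand people_vaccinated_ph median_age gdp_per_capita

    * Build each squared term and collect the new variable names
    local squares
    foreach v of local cont {
        generate double `v'_sq = `v'^2
        local squares `squares' `v'_sq
    }

    * Refit the random-effects model with the squared terms added
    xtreg new_cases_per_million dt2-dt15 `cont' `squares', re vce(cluster country)

    * Joint Wald test: H0 is that all squared-term coefficients are zero
    testparm `squares'
    ```

    A rejection here says only that at least one squared term matters; it does not say which one, so individual follow-up tests would still be needed.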

  • #2
    Alessio:
    The -sigma_u- and -sigma_e- values cast some doubt on the panel-wise effect.
    What is the outcome of -xttest0- after -xtreg,re-?
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,
      thank you very much for your help. -xttest0- yielded Prob > chibar2 = 0.0000, while the null hypotheses of both the Mundlak test and the Hansen-Sargan test were not rejected. Therefore, I opted for a random-effects model. I hope this helps.


      Code:
      . quietly xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated_ph po
      > pulation population_density median_age aged_65_older extreme_poverty cardiovasc_death_rate
      > diabetes_prevalence hospital_beds_per_thousand life_expectancy human_development_index gdp_
      > per_capita health_exp_percap urbanization_share internet_users air_passengers smokers_share
      > , re vce(cluster country)
      
      .  
      . xttest0 // the test is highly significant and we reject the H0 that variances are not diffe
      > rent, meaning we have to proceed with a panel data analysis
      
      Breusch and Pagan Lagrangian multiplier test for random effects
      
              new_cases_per_million[n_country,t] = Xb + u[n_country] + e[n_country,t]
      
              Estimated results:
                               |       Var     sd = sqrt(Var)
                      ---------+-----------------------------
                     new_cas~n |   22794.68       150.9791
                             e |   15599.25        124.897
                             u |   1725.955       41.54462
      
              Test:   Var(u) = 0
                                   chibar2(01) =    35.77
                                Prob > chibar2 =   0.0000
      
      .
      .
      . quietly asdoc xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated
      > _ph population population_density median_age aged_65_older cardiovasc_death_rate diabetes_p
      > revalence hospital_beds_per_thousand life_expectancy human_development_index gdp_per_capita
      >  health_exp_percap urbanization_share internet_users air_passengers  smokers_share mean_new
      > _tests mean_people_vaccinated , re vce(cluster country) replace
      
      .
      . quietly estimates store mundlak
      
      .
      . test mean_new_tests mean_people_vaccinated // We do not reject the null hypothesis. This su
      > ggests that time-invariant unobservables are not related to our regressors and that we can
      > proceed with a RE model.
      
       ( 1)  mean_new_tests = 0
       ( 2)  mean_people_vaccinated = 0
      
                 chi2(  2) =    0.24
               Prob > chi2 =    0.8870
      Thank you again
      Last edited by alessio lombini; 25 May 2021, 12:29.



      • #4
        Alessio:
        thanks for clarifying.
        I would check whether the -re- model supports:
        - squared age;
        - squared gdp_per_capita.
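
        As a rough sketch of this check, the squared terms could be added with factor-variable notation and tested one at a time. Two assumptions are made here: "age" is taken to mean median_age, and the regressor list is shortened for readability.

        ```stata
        * Sketch only: shortened regressor list; "age" assumed to be median_age
        * (the /// line continuation works in do-files, not at the command line)
        xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated_ph ///
            c.median_age##c.median_age c.gdp_per_capita##c.gdp_per_capita, ///
            re vce(cluster country)

        * Wald test of each squared term separately
        testparm c.median_age#c.median_age
        testparm c.gdp_per_capita#c.gdp_per_capita
        ```

        Writing the square as c.x##c.x rather than as a generated variable lets -margins- treat the linear and squared terms as a single variable afterwards.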

        In addition, I would consider a more parsimonious model.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Dear Carlo,
          thanks again for your useful comments. I followed both of your suggestions. Concerning the squared terms, I included both gdp_per_capita_squared and people_vaccinated_squared in the same model. After running the test for each, I obtained the following results:

          Code:
          . utest gdp_per_capita gdp_pc_squared
          
          Specification: f(x)=x^2
          Extreme point:  72106.52
          
          Test:
               H1: U shape
           vs. H0: Monotone or Inverse U shape
          
          -------------------------------------------------
                           |   Lower bound      Upper bound
          -----------------+-------------------------------
          Interval         |         729           113369
          Slope            |   -.0009175         .0005304
          t-value          |   -.4334604         .4751281
          P>|t|            |    .3323789         .3173908
          -------------------------------------------------
          
          Overall test of presence of a U shape:
               t-value =      0.43
               P>|t|   =      .332
          
          . utest people_vaccinated_ph people_vaccinated_squared
          (588 missing values generated)
          (1,899 missing values generated)
          
          Specification: f(x)=x^2
          Extreme point:  29.41885
          
          Test:
               H1: Inverse U shape
           vs. H0: Monotone or U shape
          
          -------------------------------------------------
                           |   Lower bound      Upper bound
          -----------------+-------------------------------
          Interval         |           0             62.4
          Slope            |    13.49048        -15.12403
          t-value          |    5.841598        -3.618413
          P>|t|            |    3.32e-09         .0001543
          -------------------------------------------------
          
          Overall test of presence of a Inverse U shape:
               t-value =      3.62
               P>|t|   =   .000154
          
          .
          end of do-file
          If I understood correctly what is stated in the article "With or without U?", rejecting H0 supports including a squared term in the regression (as for people_vaccinated_squared), while failing to reject H0 does not. Could you confirm this interpretation?

          Also, regarding a more parsimonious model, I performed a factor analysis on 5 variables and replaced them in the model with the resulting factor (the KMO measure of sampling adequacy was 0.82).

          I will conclude with one last query. Do you think it would be feasible to run utest for all continuous control variables in the same model (e.g., adding a squared term for every continuous control variable in a single model and then checking each one in turn with utest)?

          Thank you again for your precious time
          Best regards



          • #6
            Alessio:
            your interpretation of the inclusion of squared terms is correct.
            The only thing I would check is whether the extreme point lies within the range of the variable of interest (people_vaccinated_squared).
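
            A minimal sketch of that range check, assuming it is run right after the model containing both terms and using the variable names from the utest call in #5:

            ```stata
            * Sketch only: run after the model that includes both terms

            * Observed range of the regressor in the estimation sample
            summarize people_vaccinated_ph if e(sample)

            * Turning point of b1*x + b2*x^2 is -b1/(2*b2);
            * nlcom reports it with a delta-method confidence interval
            nlcom (turning_point: -_b[people_vaccinated_ph] / (2 * _b[people_vaccinated_squared]))
            ```

            If the estimated turning point, and ideally its whole confidence interval, falls between the minimum and maximum reported by -summarize-, the extreme point is within the data range.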
            The KMO measure is high enough to allow a low-dimensional representation of your data.
            Finally, you would do well to skim the literature in your research field for other possible non-linearities in the data-generating process you are interested in.
            Kind regards,
            Carlo
            (Stata 19.0)

