Testing linear relationship in panel model (after RESET)

alessio lombini

Join Date: Dec 2020
Posts: 98


25 May 2021, 10:20

Hello everyone,

I am running a panel regression and would like to know whether my model is correctly specified. As a first step, I manually replicated the Ramsey RESET test, which gave Prob > chi2 = 0.0000, suggesting that the model is misspecified. My question is: given this result, is there a way to test, for all variables simultaneously, whether any of them should also enter as a quadratic term? Previous posts suggest using plots to look for clues, but in my case the plots are not very informative. I attach my code below in case it is useful.

Thanks in advance to anyone who is willing to help.
Best regards

Code:

. xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated_ph population
>  population_density median_age aged_65_older cardiovasc_death_rate diabetes_prevalence hosp
> ital_beds_per_thousand life_expectancy human_development_index gdp_per_capita health_exp_pe
> rcap urbanization_share internet_users air_passengers smokers_share, re vce(cluster country
> )

Random-effects GLS regression                   Number of obs     =      1,208
Group variable: n_country                       Number of groups  =        102

R-sq:                                           Obs per group:
     within  = 0.1866                                         min =          1
     between = 0.4488                                         avg =       11.8
     overall = 0.2671                                         max =         15

                                                Wald chi2(31)     =     493.89
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                            (Std. Err. adjusted for 102 clusters in country)
--------------------------------------------------------------------------------------------
                           |               Robust
     new_cases_per_million |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------------+----------------------------------------------------------------
                       dt2 |  -75.12366   18.47727    -4.07   0.000    -111.3384   -38.90888
                       dt3 |  -60.00754   14.43205    -4.16   0.000    -88.29383   -31.72125
                       dt4 |   -71.3764   16.46268    -4.34   0.000    -103.6427   -39.11014
                       dt5 |  -66.31478   18.02942    -3.68   0.000    -101.6518   -30.97777
                       dt6 |  -42.08454   21.47367    -1.96   0.050    -84.17216    .0030893
                       dt7 |  -38.02067   20.46581    -1.86   0.063    -78.13291    2.091574
                       dt8 |  -27.06406   22.18253    -1.22   0.222    -70.54102    16.41291
                       dt9 |   59.42157   46.18094     1.29   0.198    -31.09141    149.9346
                      dt10 |   97.52294   44.70662     2.18   0.029     9.899574    185.1463
                      dt11 |   62.48241    30.8272     2.03   0.043     2.062212    122.9026
                      dt12 |    53.4497   31.68452     1.69   0.092    -8.650812    115.5502
                      dt13 |  -26.26253   20.36673    -1.29   0.197    -66.18057    13.65552
                      dt14 |  -47.72762   23.85059    -2.00   0.045    -94.47393   -.9813179
                      dt15 |   -36.0582   40.89168    -0.88   0.378    -116.2044    44.08802
    new_tests_per_thousand |   8.928846   2.758016     3.24   0.001     3.523234    14.33446
      people_vaccinated_ph |   4.495295    1.17263     3.83   0.000     2.196983    6.793607
                population |  -3.01e-08   3.08e-08    -0.98   0.329    -9.05e-08    3.04e-08
        population_density |  -.0182255   .0055512    -3.28   0.001    -.0291057   -.0073453
                median_age |   3.165761   3.813629     0.83   0.406    -4.308814    10.64034
             aged_65_older |  -.8097854   4.401816    -0.18   0.854    -9.437185    7.817615
     cardiovasc_death_rate |   .0048464    .085041     0.06   0.955    -.1618308    .1715237
       diabetes_prevalence |   2.435293   3.213194     0.76   0.449     -3.86245    8.733037
hospital_beds_per_thousand |  -4.038767   5.010392    -0.81   0.420    -13.85896    5.781422
           life_expectancy |  -.6207185   2.217349    -0.28   0.780    -4.966643    3.725206
   human_development_index |    1.88439   216.4545     0.01   0.993    -422.3585    426.1273
            gdp_per_capita |   .0001658   .0006376     0.26   0.795     -.001084    .0014155
         health_exp_percap |   .0030412    .007868     0.39   0.699    -.0123798    .0184623
        urbanization_share |   1.009087   .5140009     1.96   0.050     .0016637     2.01651
            internet_users |   -.260454   .8586334    -0.30   0.762    -1.943344    1.422437
            air_passengers |  -3.227438   2.171105    -1.49   0.137    -7.482726     1.02785
             smokers_share |   2.625966   1.762373     1.49   0.136    -.8282209    6.080154
                     _cons |  -70.41636   125.5306    -0.56   0.575    -316.4519    175.6192
---------------------------+----------------------------------------------------------------
                   sigma_u |  62.525459
                   sigma_e |   127.3509
                       rho |  .19423163   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------------

.
. quietly predict y_hat,xbu

.
. quietly gen y_h_2=y_hat*y_hat

. quietly gen y_h_3=y_h_2*y_hat

. quietly gen y_h_4=y_h_3*y_hat

.
. quietly xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated_ph po
> pulation population_density median_age aged_65_older cardiovasc_death_rate diabetes_prevale
> nce hospital_beds_per_thousand life_expectancy human_development_index gdp_per_capita healt
> h_exp_percap urbanization_share internet_users air_passengers smokers_share y_h_2 y_h_3 y_h
> _4, re vce(cluster country)

.
. test y_h_2 y_h_3 y_h_4

 ( 1)  y_h_2 = 0
 ( 2)  y_h_3 = 0
 ( 3)  y_h_4 = 0

           chi2(  3) =  527.79
         Prob > chi2 =    0.0000


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

25 May 2021, 11:22

Alessio:
the -sigma_u- and -sigma_e- values cast some doubt on the presence of a panel-wise effect.
What is the outcome of -xttest0- after -xtreg,re-?

Kind regards,
Carlo
(Stata 19.0)

alessio lombini

Join Date: Dec 2020
Posts: 98

25 May 2021, 12:24

Dear Carlo,
thank you very much for your help. -xttest0- yielded Prob > chibar2 = 0.0000, while neither the Mundlak test nor the Hansen-Sargan test rejected its null hypothesis. I therefore opted for a random-effects model. I hope this helps.

Code:

. quietly xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated_ph po
> pulation population_density median_age aged_65_older extreme_poverty cardiovasc_death_rate
> diabetes_prevalence hospital_beds_per_thousand life_expectancy human_development_index gdp_
> per_capita health_exp_percap urbanization_share internet_users air_passengers smokers_share
> , re vce(cluster country)

.  
. xttest0 // the test is highly significant, so we reject H0 (Var(u) = 0) and proceed with a
> panel-data analysis

Breusch and Pagan Lagrangian multiplier test for random effects

        new_cases_per_million[n_country,t] = Xb + u[n_country] + e[n_country,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
               new_cas~n |   22794.68       150.9791
                       e |   15599.25        124.897
                       u |   1725.955       41.54462

        Test:   Var(u) = 0
                             chibar2(01) =    35.77
                          Prob > chibar2 =   0.0000

.
.
. quietly asdoc xtreg new_cases_per_million dt2-dt15 new_tests_per_thousand people_vaccinated
> _ph population population_density median_age aged_65_older cardiovasc_death_rate diabetes_p
> revalence hospital_beds_per_thousand life_expectancy human_development_index gdp_per_capita
>  health_exp_percap urbanization_share internet_users air_passengers  smokers_share mean_new
> _tests mean_people_vaccinated , re vce(cluster country) replace

.
. quietly estimates store mundlak

.
. test mean_new_tests mean_people_vaccinated // We do not reject the null hypothesis. This su
> ggests that time-invariant unobservables are not related to our regressors and that we can
> proceed with a RE model.

 ( 1)  mean_new_tests = 0
 ( 2)  mean_people_vaccinated = 0

           chi2(  2) =    0.24
         Prob > chi2 =    0.8870
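
The log above does not show how mean_new_tests and mean_people_vaccinated were built. A minimal sketch, assuming they are the country-level means of the two time-varying regressors (the usual Mundlak construction) and that n_country is the panel identifier reported in the output:

```stata
* Assumption: the Mundlak terms are within-country means of the time-varying regressors.
* Variable names are taken from the log; the construction itself is not shown there.
bysort n_country: egen mean_new_tests         = mean(new_tests_per_thousand)
bysort n_country: egen mean_people_vaccinated = mean(people_vaccinated_ph)
```

These means then enter the -xtreg, re- call alongside the original regressors, and -test- on them gives the Wald statistic reported in the log. Computing the means over the estimation sample only would be a reasonable refinement.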

Thank you again

Last edited by alessio lombini; 25 May 2021, 12:29.


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#4

25 May 2021, 23:52

Alessio:
thanks for clarifying.
I would check whether the -re- model supports:
- a squared term for age;
- a squared term for gdp_per_capita.

In addition, I would consider a more parsimonious model.
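
One way to run this check is with factor-variable notation followed by a joint Wald test. This is only a sketch: it assumes the data are in memory and already -xtset-, and it reads "age" as median_age, which the thread does not state explicitly.

```stata
* Sketch only. Assumes the data are in memory and -xtset-; "age" is taken to mean median_age.
local controls dt2-dt15 new_tests_per_thousand people_vaccinated_ph population ///
    population_density aged_65_older cardiovasc_death_rate diabetes_prevalence ///
    hospital_beds_per_thousand life_expectancy human_development_index ///
    health_exp_percap urbanization_share internet_users air_passengers smokers_share

* c.x##c.x expands to x and x^2, so no squared variable has to be generated by hand
xtreg new_cases_per_million `controls' ///
    c.median_age##c.median_age c.gdp_per_capita##c.gdp_per_capita, ///
    re vce(cluster country)

* Joint Wald test that both squared terms are zero
testparm c.median_age#c.median_age c.gdp_per_capita#c.gdp_per_capita
```

Each squared term also gets its own z test in the coefficient table, so the joint and individual results can be compared.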

Kind regards,
Carlo
(Stata 19.0)

alessio lombini

Join Date: Dec 2020
Posts: 98

26 May 2021, 07:40

Dear Carlo,
thanks again for your useful comments. I followed both of your suggestions. For the squared terms, I included the squares of gdp_per_capita and people_vaccinated_ph (gdp_pc_squared and people_vaccinated_squared) in the same model. Running -utest- on each gave the following results:

Code:

. utest gdp_per_capita gdp_pc_squared

Specification: f(x)=x^2
Extreme point:  72106.52

Test:
     H1: U shape
 vs. H0: Monotone or Inverse U shape

-------------------------------------------------
                 |   Lower bound      Upper bound
-----------------+-------------------------------
Interval         |         729           113369
Slope            |   -.0009175         .0005304
t-value          |   -.4334604         .4751281
P>|t|            |    .3323789         .3173908
-------------------------------------------------

Overall test of presence of a U shape:
     t-value =      0.43
     P>|t|   =      .332

. utest people_vaccinated_ph people_vaccinated_squared
(588 missing values generated)
(1,899 missing values generated)

Specification: f(x)=x^2
Extreme point:  29.41885

Test:
     H1: Inverse U shape
 vs. H0: Monotone or U shape

-------------------------------------------------
                 |   Lower bound      Upper bound
-----------------+-------------------------------
Interval         |           0             62.4
Slope            |    13.49048        -15.12403
t-value          |    5.841598        -3.618413
P>|t|            |    3.32e-09         .0001543
-------------------------------------------------

Overall test of presence of a Inverse U shape:
     t-value =      3.62
     P>|t|   =   .000154

.
end of do-file

If I have understood the article "With or without U?" correctly, rejecting H0 supports including the squared term in the regression (as for people_vaccinated_squared), whereas failing to reject H0 does not (as for gdp_pc_squared). Could you confirm this interpretation?

Also, to obtain a more parsimonious model, I ran a factor analysis on 5 variables and replaced them in the model with the resulting factor (the KMO measure of sampling adequacy was 0.82).

I would like to conclude with one last question. Do you think it would be feasible to run -utest- for all continuous control variables in the same model (e.g. adding a squared term for every continuous control variable to a single model and then checking each one in turn with -utest-)?

Thank you again for your precious time
Best regards
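
The procedure described in the last question could be sketched roughly as follows. This is an illustration, not a recommendation: the list of continuous variables is shortened for brevity, the remaining controls are omitted, the *_sq names are hypothetical, and -utest- is assumed to be installed (ssc install utest). Running many such tests in one model also raises the usual multiple-testing concerns.

```stata
* Sketch only: assumes the data are in memory and -xtset-.
* The *_sq variable names are hypothetical; other controls are left out for brevity.
local cont new_tests_per_thousand people_vaccinated_ph median_age gdp_per_capita
local sq
foreach v of local cont {
    generate double `v'_sq = `v'^2
    local sq `sq' `v'_sq
}

xtreg new_cases_per_million dt2-dt15 `cont' `sq', re vce(cluster country)

* Joint Wald test that all squared terms are zero
testparm `sq'

* U-shape test for each variable in turn, using the same fitted model
foreach v of local cont {
    utest `v' `v'_sq
}
```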


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#6

26 May 2021, 11:50

Alessio:
your interpretation of the inclusion of squared terms is correct.
The only thing I would check is whether the extreme point lies within the range of the variable of interest (people_vaccinated_ph).
The KMO measure is high enough to allow a low-dimensional representation of your data.
Finally, you would do well to skim the literature in your research field for other possible non-linearities in the data-generating process you are interested in.
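
For the range check, the -utest- output in the previous post already reports an extreme point of 29.41885 and an interval of 0 to 62.4, so the turning point falls inside the observed range. A minimal sketch for reproducing that check by hand, assuming the model with both people_vaccinated_ph and people_vaccinated_squared is the active estimation result (the label turning_point is just an illustrative name):

```stata
* Assumes the -xtreg, re- model containing people_vaccinated_ph and
* people_vaccinated_squared is the active estimation result.
* Turning point of b1*x + b2*x^2 is -b1/(2*b2); -nlcom- adds a delta-method CI.
nlcom (turning_point: -_b[people_vaccinated_ph] / (2 * _b[people_vaccinated_squared]))

* Observed range of the regressor in the estimation sample
summarize people_vaccinated_ph if e(sample)
```

If the confidence interval from -nlcom- sits well inside the minimum and maximum reported by -summarize-, the inverse U shape is supported over the observed data rather than by extrapolation.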

Kind regards,
Carlo
(Stata 19.0)