  • Different results for time-lagged variables

    Hello guys!

    I'm new to Stata (and self-taught) and I wanted to fit an OLS regression model using time-lagged variables. Without the time lags, the p-value for the independent variable I'm interested in is quite high (P>|t| = 0.223); with the lags included it is P>|t| = 0.039 (you can see both models further down). I don't know how to interpret this change in significance; could you please help me?

    Many thanks in advance!

    Without lagged variables

          Source |       SS           df       MS      Number of obs   =       150
    -------------+----------------------------------   F(4, 145)       =     14.88
           Model |  7803.40219         4  1950.85055   Prob > F        =    0.0000
        Residual |  19016.6585       145  131.149369   R-squared       =    0.2910
    -------------+----------------------------------   Adj R-squared   =    0.2714
           Total |  26820.0607       149  180.000407   Root MSE        =    11.452

    ------------------------------------------------------------------------------
        cab_PExp | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        cab_LeRi |  -.7008049   .5730767    -1.22   0.223    -1.833468     .431858
      cab_MltPct |   6.796478   1.517835     4.48   0.000     3.796539    9.796418
    cab_unemplRt |   .5339775   .2528204     2.11   0.036     .0342881    1.033667
      cab_grwGDP |   -.308078   .5443009    -0.57   0.572    -1.383867    .7677108
           _cons |   49.63023   4.257338    11.66   0.000     41.21577    58.04468
    ------------------------------------------------------------------------------

    With lagged variables

             Source |       SS           df       MS      Number of obs   =       137
    ----------------+----------------------------------   F(4, 132)       =      7.08
              Model |  4293.10686         4  1073.27672   Prob > F        =    0.0000
           Residual |  19997.2491       132  151.494312   R-squared       =    0.1767
    ----------------+----------------------------------   Adj R-squared   =    0.1518
              Total |   24290.356       136  178.605559   Root MSE        =    12.308

    ---------------------------------------------------------------------------------
           cab_PExp | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    ----------------+----------------------------------------------------------------
           cab_LeRi |  -1.324766    .636785    -2.08   0.039     -2.58439   -.0651427
         cab_MltPct |   3.108642   1.349349     2.30   0.023      .439496    5.777787
    cab_unemplRt_L1 |   .5619421   .2718256     2.07   0.041     .0242441     1.09964
      cab_grwGDP_L1 |   .0634581   .5927974     0.11   0.915    -1.109154     1.23607
              _cons |   56.26583   4.518824    12.45   0.000     47.32715    65.20451
    ---------------------------------------------------------------------------------
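One point worth noticing in the two tables: the models are estimated on different samples (150 vs. 137 observations), because observations with no prior period drop out once a lag is introduced, so the coefficients and p-values are not directly comparable. A minimal sketch of that mechanism, using simulated data rather than the poster's dataset (variable names here are illustrative only):

```python
import numpy as np

# Simulated series, NOT the poster's data: y depends on x plus noise.
rng = np.random.default_rng(0)
n = 150
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

def ols(X, y):
    """OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Contemporaneous regression: all 150 observations are usable.
b_now = ols(x, y)

# One-period lag: the first observation has no lagged value, so the
# estimation sample shrinks -- analogous to the poster's N falling
# from 150 to 137 across several panel units.
x_lag, y_cur = x[:-1], y[1:]   # pair x_{t-1} with y_t
b_lag = ols(x_lag, y_cur)

print(len(x), len(x_lag))      # 150 149
```

Because the lagged model is fit on a strictly smaller sample (and a different regressor), some shift in estimates and significance is expected even before any substantive interpretation.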




  • #2
    Vincent:
    welcome to this forum.
    Two different specifications give back results that (in part) differ: no wonder about that.
    That said, what I would consider more carefully is the Adj R-squared, which declines in your second OLS (in short, the first regression is better than the second one).
    In addition, both regressions seem to be in the residual realm: hence, you should add more predictors and/or interactions to the right-hand side of your regression equations to give a fair(er) and true(r) view of the data-generating process you're interested in.
    Kind regards,
    Carlo
    (StataNow 18.5)
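Carlo's point can be checked directly from the tables: adjusted R-squared applies a degrees-of-freedom penalty to R-squared, Adj R² = 1 − (1 − R²)(n − 1)/(n − k − 1). A quick sketch reproducing the two reported values from the R², n, and k = 4 regressors shown above:

```python
def adj_r2(r2, n, k):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Figures taken from the two regression tables in this thread.
print(round(adj_r2(0.2910, 150, 4), 4))  # 0.2714, first model
print(round(adj_r2(0.1767, 137, 4), 4))  # 0.1518, second model
```

The drop from 0.2714 to 0.1518 reflects a genuinely lower R², not just the smaller sample: the penalty factor (n − 1)/(n − k − 1) is nearly identical in the two models.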



    • #3
      Thank you very much for your answer, Carlo! So what you're saying is that the change in the p-value matters less than the Adj R-squared? I'm using data from large samples, so I didn't expect a high R-squared anyway.

      Best regards,
      Vincent



      • #4
        Vincent:
        yes, that's what I mean.
        It's not a matter of high/low R-sq (which can be explained by different, research-field-specific reasons), but of Adj R-sq, which, other things being equal, decreases when (to make it short) the right-hand side of your second regression equation is less informative/efficient than that of your first one.
        Kind regards,
        Carlo
        (StataNow 18.5)
