Dear All,
I hope you are doing well. I have an unbalanced panel dataset with a large N (>1600) and small T (15 years). My main outcome is fractional, ranging between 0 and 1. Observations at the endpoints are rare, though 1 occurs more often than 0.
Following the recent paper by Bates, Papke, and Wooldridge (2024), I estimated several models:
1. OLS:
Code:
reg y x1 x2 i.year, cluster(id)
2. FE:
Code:
reg y x1 x2 $avg_controls i.year, cluster(id)
3. GLM (Probit):
Code:
glm y x1 x2 i.year, family(bino) link(probit) cluster(id)
4. CRE (Probit):
Code:
glm y x1 x2 $avg_controls i.year, family(bino) link(probit) cluster(id)
5. CRE + Unbalancedness Adjustment:
Code:
glm y x1 x2 $avg_controls i.year i.number_years, family(bino) link(probit) cluster(id)
My question concerns the interpretation of the time-averaged covariates included to relax the exogeneity assumption in the CRE model. In my preferred specification (Model 5), two variables show signs contrary to the literature and expected associations, whereas their time-averaged counterparts have the expected signs.
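For reference, the commands above rely on $avg_controls and number_years, which are not defined in the post. A minimal sketch of how such variables might be constructed is given here, assuming the averages are taken over each unit's complete-case observations. The names complete, x1_bar and x2_bar are hypothetical and only illustrate the idea.

```stata
* Hypothetical construction; complete, x1_bar and x2_bar are assumed names
* Flag observations with no missing values in the model variables
gen byte complete = !missing(y, x1, x2)

* Unit-specific time averages, computed over the complete cases only
bysort id: egen x1_bar = mean(cond(complete, x1, .))
bysort id: egen x2_bar = mean(cond(complete, x2, .))

* Number of usable time periods per unit (enters Model 5 as i.number_years)
bysort id: egen number_years = total(complete)

* Collect the time averages in the global used by the models
global avg_controls x1_bar x2_bar
```

Because probit coefficients are only identified up to scale, the signs and magnitudes are usually easier to compare across models through average partial effects, for example by running margins, dydx(x1 x2) after the glm command.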
I would like to clarify:
Is there a direct relationship between a variable and its time-averaged version in terms of interpretation?
Do time-averaged variables have a meaningful interpretation beyond their role in adjusting for unobserved heterogeneity?
From my understanding, time-averaged covariates capture long-run (between-unit) effects, while their non-averaged counterparts represent short-run (within-unit) effects. Could the difference in signs therefore imply that the short-run and long-run associations differ?
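One way to make this question concrete is the linear analogue of the CRE specification. The sketch below assumes a balanced panel and omits the year effects. The symbols (\alpha, \beta, \xi, u_{it}) are notation introduced here for illustration and are not taken from the paper.

```latex
% Linear analogue of the CRE (Mundlak) specification
y_{it} = \alpha + x_{it}\beta + \bar{x}_i \xi + u_{it},
\qquad \bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}

% Averaging over t within each unit gives the between-unit relationship
\bar{y}_i = \alpha + \bar{x}_i(\beta + \xi) + \bar{u}_i

% Hence: within slope = \beta, between slope = \beta + \xi,
% so \xi = (between slope) - (within slope)
```

Under these assumptions, the coefficient on the time average would be the difference between the between-unit and within-unit slopes rather than the between-unit slope itself, so opposite signs on x and its average need not mean that the two associations have opposite signs. In the probit CRE models the same logic applies only approximately, since the coefficients are scaled and the partial effects depend on the index.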
I would appreciate your insights on whether this interpretation is correct or if there is something I may be overlooking.
Thank you very much for your guidance.
Kind regards,