Hi Statalist,
Happy new year!
I have questions about using a two-part model with panel data. I want to analyse health expenditures and chronic conditions: specifically, I want to estimate total health expenditure conditional on a set of covariates (e.g. age, gender, ...) and calculate the marginal effects of the chronic condition variables. I have two waves of panel data with repeated measures. The sample size is around 40,000 per wave. The health expenditure outcome has a large share of zeros (~30%).
There is a literature comparing different models (e.g. OLS on the natural scale, OLS on log-transformed expenditure, GLM, Poisson, two-part models, selection models) in cross-sectional settings. I would like to apply a two-part model, with a logit model as the first part and a GLM with log link and gamma distribution as the second part, but in a longitudinal setting.
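To fix notation for the model I have in mind (a sketch of the standard two-part setup as I understand it; the symbols p, m, alpha and beta are my own labels, not output from any command), with a logit first part and a log-link second part the overall mean would be:

```latex
\begin{aligned}
p(x) &= \Pr(y > 0 \mid x) = \frac{\exp(x'\alpha)}{1 + \exp(x'\alpha)}, \\
m(x) &= E[y \mid y > 0, x] = \exp(x'\beta), \\
E[y \mid x] &= p(x)\, m(x).
\end{aligned}
```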
First question. When using
Code:
twopm yvar $xvar, firstpart(logit) secondpart(glm, family(gamma) link(log))
margins, dydx(*)
I get unconditional (combined) marginal effects from both parts of the two-part model.
If using
Code:
twopm yvar $xvar, firstpart(logit) secondpart(glm, family(gamma) link(log))
margins if yvar > 0, dydx(*)
I get unconditional marginal effects based on the sample of positives.
If using
Code:
glm yvar $xvar if yvar > 0, family(gamma) link(log)
margins, dydx(*)
I get conditional marginal effects for the sample of positives.
Is my understanding correct here?
Second question: is it feasible to run a two-part model with panel data? If I run the two parts separately, how do I get combined marginal effects from both models?
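For concreteness, the combined effect I am after can be written with the product rule. This is only a sketch for a continuous covariate x_k, assuming the first part gives p(x) = Pr(y > 0 | x) and the second part gives m(x) = E[y | y > 0, x]; p and m are my own labels:

```latex
\frac{\partial E[y \mid x]}{\partial x_k}
  = \frac{\partial p(x)}{\partial x_k}\, m(x)
  + p(x)\, \frac{\partial m(x)}{\partial x_k}
```

For a binary chronic condition indicator, the analogous quantity would be the difference in p(x) m(x) with the indicator set to 1 versus 0. What I am unsure of is how to obtain this, with correct standard errors, when the two parts are estimated separately on panel data.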
Many thanks in advance.
Tian Xin