Dear all,
I am trying to run a Cragg hurdle regression with the -churdle- command in Stata. The general motivation is as follows: I have a continuous dependent variable with a lot of zeros (149 zeros out of 290 observations). These are genuine, observed zeros; in other words, they are not missing or unobserved values. I would also like to distinguish between what affects the probability of the outcome being non-zero and what affects the truncated part of the regression. I have 19 continuous independent variables and 1 binary independent variable.
I use the same set of independent variables (-$indvars-) for both the probit and the truncated regression parts of the Cragg hurdle regression, with the addition of -indvar2- in the probit part, because I would like to know whether these independent variables affect each part differently. (I have not attached the dataset I am working on, as I don't think it is necessary for my questions.)
My code looks more or less like this:
Code:
import delimited "SE_data"
global indvars indvar1 indvar3 indvar4 indvar5 indvar6 indvar7 indvar8 indvar9 indvar10 indvar11 i.indvar12_cat indvar13 indvar14 indvar15 indvar16 indvar17 indvar18 indvar19 indvar20
churdle lin depvar $indvars, select($indvars indvar2) ll(0)
Question 1
To my understanding, a Cragg hurdle regression basically combines a probit and a truncated regression (correct?). My -churdle- output shows a Prob > chi2 of 0.000, yet when I ran -probit- and -truncreg- separately for the exact same specification and dataset, the -probit- output shows a Prob > chi2 of 0.000 but the -truncreg- output shows a Prob > chi2 of 0.2462. I'm unsure what to take from this: does it mean that I cannot infer anything from the truncated part of my -churdle- model, but only from the probit part? Or should I worry about it for a different reason?
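For clarity, the separate runs I am comparing against look roughly like the sketch below. The indicator name -nonzero- is made up for this illustration, and I am assuming that these two commands are the right stand-alone counterparts of the two parts of -churdle-.
Code:
* Binary indicator for a non-zero outcome (nonzero is a hypothetical name)
generate byte nonzero = depvar > 0 if !missing(depvar)
* Selection part on its own, same covariates as in select()
probit nonzero $indvars indvar2
* Outcome part on its own, truncated from below at 0
truncreg depvar $indvars, ll(0)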
Question 2
According to the -churdle- manual from Stata, "The coefficient estimates are not directly interpretable. To obtain the effect of a covariate on the model, we need to use the margins command." For the binary/selection part, which is essentially a probit, I understand why the marginal effects have to be calculated before the coefficients can be interpreted, but what about the continuous/truncated regression part? I believed -truncreg- results could be interpreted directly, just like OLS.
Let me put it this way. Say I ran the following:
Code:
churdle lin depvar $indvars, select($indvars indvar2) ll(0)
margins, dydx(indvar1)
Which part of -churdle- do those marginal effect values come from?
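To make the question concrete, this is a sketch of what I think I could compare. I am assuming, from my reading of the -churdle- postestimation entry, that -pr(a,b)- and -e(a,b)- are accepted inside -predict()- and that they correspond to the selection part and the truncated part respectively; please correct me if that reading is wrong.
Code:
* Default prediction after churdle (what I ran above)
margins, dydx(indvar1)
* Assumed: effect on the probability that depvar lies above 0
margins, dydx(indvar1) predict(pr(0,.))
* Assumed: effect on the expected value of depvar given that it lies above 0
margins, dydx(indvar1) predict(e(0,.))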
Question 3
Lastly, other than the log-likelihood or adjusted R-squared, which other postestimation diagnostics would be appropriate or useful to report when using a Cragg hurdle model?
Stata is v17.0 Basic Edition (BE) working on Windows 10.
Thank you!