1.A) If the industry dummies are time-invariant, then only code 1.2 is appropriate. One of the industry dummies should normally be omitted due to the "dummy trap", i.e. the 8 industry dummies taken together are perfectly collinear with the intercept. If more dummies are omitted (either from the regressor list or the instrument list), this indicates that there might be other multicollinearity problems as well, which I cannot diagnose from the available information.
If the industry dummies vary over time, then 1.1 and 1.3 would also be appropriate codes. A similar qualification as before applies: At least one dummy will be omitted due to perfect collinearity.
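To make the "dummy trap" in 1.A concrete, here is a minimal sketch in Python (not Stata, and not how xtdpdgmm handles collinearity internally). The sample of 40 observations in 8 industries and both function names are assumptions made up for illustration. It only checks that the full set of dummies adds up to the constant column, which is exactly the perfect collinearity with the intercept.

```python
def industry_dummies(industry_codes, n_industries):
    """One row of 0/1 indicators per observation, one column per industry."""
    return [[1 if code == j else 0 for j in range(n_industries)]
            for code in industry_codes]


def dummies_sum_to_intercept(dummy_rows):
    """True if the dummy columns add up to the constant column of ones in every row."""
    return all(sum(row) == 1 for row in dummy_rows)


# Hypothetical sample: 8 industries, 5 observations each.
codes = [i % 8 for i in range(40)]

full = industry_dummies(codes, 8)      # all 8 industry dummies
reduced = [row[1:] for row in full]    # industry 0 dropped as the reference category

print(dummies_sum_to_intercept(full))     # True: collinear with the intercept
print(dummies_sum_to_intercept(reduced))  # False: the exact linear dependence is gone
```

Dropping any single dummy breaks the dependence; which one is dropped only changes the reference industry.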
1.B) As I said in 1.A, at least one dummy will be omitted. Stata will automatically omit one dummy, and you should not rely on which one it picks. If you want to omit a specific dummy, which shall serve as the reference industry, then you need to omit it manually.
1.C) You can include all industry dummies as instruments. Stata will automatically omit at least one due to perfect collinearity. It does not matter which dummy is omitted in the list of instruments.
1.D) Option nolevel is recommended if you do not have any instruments for the level model and you want a conventional difference/FOD estimator.
2.1) xtdpdgmmfe automatically selects the appropriate instruments / moment conditions (and therefore the relevant estimator) corresponding to the chosen assumptions.
2.2) xtdpdgmm can estimate a model with the Chudik-Pesaran nonlinear moment conditions even if some regressors are treated as endogenous. However, the resulting estimator would be inconsistent. xtdpdgmm does not check whether you have chosen the options in a consistent way. Therefore, xtdpdgmmfe is less prone to such errors.
3) Both commands can be specified accordingly; see post #450 for an example.
4) Option curtail() of xtdpdgmmfe can be used to set a maximum lag depth of 3 for all sets of instruments. For an endogenous variable, this would use lags 2 and 3. For a predetermined variable, this would use lags 1 to 3. The command is less flexible regarding individual lag orders for different variables. You also cannot easily specify lags 2 to 4 for endogenous but lags 1 to 3 for predetermined variables. This is intentional to reduce the temptation for researchers to search for the "nicest" model. Keeping the maximum lag order constant is the least arbitrary approach. It will give predetermined variables one more instrument than endogenous variables. Again, this is intentional as it utilizes the additional overidentifying restriction from making the stronger predeterminedness assumption.
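The lag counting in 4) can be sketched as follows. This is a Python illustration only, not Stata syntax and not the implementation of xtdpdgmmfe. The function name and the dictionary of first admissible lags are assumptions that simply encode the rule stated above: endogenous variables start at lag 2, predetermined variables at lag 1, and both share the same maximum lag.

```python
def instrument_lags(assumption, max_lag):
    """Lags used as instruments for a variable, given a common maximum lag depth."""
    # First admissible lag under each assumption, as described in the text above.
    first_lag = {"endogenous": 2, "predetermined": 1}[assumption]
    return list(range(first_lag, max_lag + 1))


print(instrument_lags("endogenous", 3))     # [2, 3]
print(instrument_lags("predetermined", 3))  # [1, 2, 3]
```

With a common maximum of 3, the predetermined variable gets exactly one more lag than the endogenous one, which is the additional overidentifying restriction mentioned above.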
5) The sequential model selection process is not required. It is merely a suggestion to reduce the arbitrariness of the modeling choice.
6) The doubly-corrected robust standard errors are generally recommended.
7) No, lag() is an abbreviation of lagrange().
8) These commands do not support nonlinear models for limited dependent variables, only the linear probability model.