Hello!
I want to compare which of two variables better explains my dependent variable, country-level cigarette consumption. I'll call the two independent variables Most Interesting Variable 1 (MIV1) and Most Interesting Variable 2 (MIV2).
Because MIV1 and MIV2 are highly correlated, I don't want to include them in the same regression. I have therefore split the model of country-level cigarette consumption into two models that differ only in whether they include MIV1 or MIV2: the model in Eq1 and the model in Eq2, where Xjt is a vector of country-level macroeconomic factors. Both are (initially) estimated by OLS with country and year fixed effects. Standard errors are clustered at the country level, and the panel of 56 countries over 5 years is balanced in both cases.
Yjt = β0 + β1 MIV1jt + β2 Xjt + δt + αj + ujt (Eq1)
Yjt = β0 + β1 MIV2jt + β2 Xjt + δt + αj + ujt (Eq2)
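To make the setup concrete, the two specifications can be sketched on simulated data as below. This is only an illustration under stated assumptions, not the actual estimation: the data are randomly generated, the helper names `two_way_demean` and `fe_ols_cluster` are hypothetical, X is reduced to a single control, and the cluster-robust variance uses a simple G/(G-1) correction (statistical packages may apply a different small-sample adjustment). It reports β1, its country-clustered standard error, and the within R-squared for each model.

```python
import numpy as np

def two_way_demean(a, n_units, n_periods):
    """Remove unit and period means from a balanced panel.

    Rows must be sorted by unit, then by period.
    """
    a = a.reshape(n_units, n_periods, -1)
    a = (a
         - a.mean(axis=1, keepdims=True)        # unit (country) means
         - a.mean(axis=0, keepdims=True)        # period (year) means
         + a.mean(axis=(0, 1), keepdims=True))  # add back the grand mean
    return a.reshape(n_units * n_periods, -1)

def fe_ols_cluster(y, X, n_units, n_periods):
    """Two-way fixed-effects OLS with standard errors clustered by unit.

    Returns (coefficients, clustered SEs, within R-squared).
    """
    yd = two_way_demean(y, n_units, n_periods).ravel()
    Xd = two_way_demean(X, n_units, n_periods)
    XtX_inv = np.linalg.inv(Xd.T @ Xd)
    beta = XtX_inv @ Xd.T @ yd
    resid = yd - Xd @ beta
    # Sandwich "meat": sum of per-unit score vectors X_j' u_j
    scores = (Xd * resid[:, None]).reshape(n_units, n_periods, -1).sum(axis=1)
    meat = scores.T @ scores
    # Assumed small-sample correction: G / (G - 1)
    vcov = XtX_inv @ meat @ XtX_inv * n_units / (n_units - 1)
    se = np.sqrt(np.diag(vcov))
    r2_within = 1.0 - (resid @ resid) / (yd @ yd)
    return beta, se, r2_within

# Simulated balanced panel: 56 countries, 5 years (values are made up)
rng = np.random.default_rng(0)
J, T = 56, 5
N = J * T
miv1 = rng.normal(size=N)
miv2 = 0.8 * miv1 + 0.6 * rng.normal(size=N)   # correlated with MIV1
x = rng.normal(size=N)                         # stand-in for X_jt
alpha = np.repeat(rng.normal(size=J), T)       # country effects
delta = np.tile(rng.normal(size=T), J)         # year effects
y = 1.0 * miv1 + 0.5 * x + alpha + delta + 0.5 * rng.normal(size=N)

# Eq1 uses MIV1, Eq2 uses MIV2; everything else is identical
b1, se1, r2_1 = fe_ols_cluster(y, np.column_stack([miv1, x]), J, T)
b2, se2, r2_2 = fe_ols_cluster(y, np.column_stack([miv2, x]), J, T)

print("Eq1: beta1 = %.3f (SE %.3f), within R2 = %.3f" % (b1[0], se1[0], r2_1))
print("Eq2: beta1 = %.3f (SE %.3f), within R2 = %.3f" % (b2[0], se2[0], r2_2))
```

Because the simulated y is generated from MIV1, the sketch only shows how the quantities in question are computed side by side; it does not by itself say which of them is the right basis for the comparison.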
Questions:
1. Should one use the size and statistical significance of β1 in each model, or the R-squared of each model, as the basis for saying whether MIV1 or MIV2 better explains the dependent variable, Y?
2. If it is best practice to compare the R-squared values of models that differ by only one independent variable, as in Eq1 and Eq2, how should one conduct such a comparison for a dynamic panel model that includes the first lag of the dependent variable and is estimated with something like system GMM, which doesn't give you an R-squared?
Thank you!
Sam