Alternative Estimation Strategy for Low R-Squared

Tash Steve

Join Date: May 2018

Posts: 2
#1

02 May 2018, 06:15

Hello everyone,

I am new to this forum and to working with statistical data, so let me know if anything needs further explanation.
I am writing a paper about the effect of openness to trade on migration policies over time. I measure openness to trade as the ratio (imports + exports)/GDP, and for migration policy I have an index that ranges from 0 (open) to 1 (restrictive). The data are available for 1980 to 2010. However, I am having trouble finding the right estimation strategy. I thought a fixed-effects model would be a good choice, because it lets me control for time and country effects, but I first ran an ordinary regression. The t-value was 0.009, so I thought this might be a good idea, but the R-squared was 0.0131. Next, I ran a fixed-effects (fe) model, which gave an F-value of 0.0048, a t-value of 0.005, and a rho of 0.04, which seems very low for a correlation. Do you have any idea what the problem is, or whether there is another way to estimate this? An additional problem is that my control variables do not cover the whole period 1980-2010, but only parts of it, such as 1995-2005. Do you think this could be problematic too?
Thanks in advance!
Roman Mostazir

Join Date: Apr 2014

Posts: 870
#2

02 May 2018, 07:06

Explore the relationship between openness and the outcome variable with a scatterplot, a correlation matrix, etc. If the correlation is low, openness will have low predictive power (i.e. a flat slope), and I am afraid you can't do much about it. If x has no association with y, the only solution is to conclude that they are not associated, rather than digging hard to find an association.
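
A minimal sketch of that exploration, assuming the migration policy index is stored in a variable called policy and the trade ratio in openness (both names are hypothetical, so substitute your own):

```stata
* Hypothetical variable names: policy (0-1 index), openness ((imports + exports)/GDP)

* Scatterplot with a fitted line to see how flat the slope is
twoway (scatter policy openness) (lfit policy openness)

* Pairwise correlation with its significance level and number of observations
pwcorr policy openness, sig obs
```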

Roman
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#3

03 May 2018, 10:27

I think you're probably better off not worrying about r-square. R-square is really about how much of the total variation in the dv your variable explains. But, no one would claim that openness as you define it is the major driver of migration.

By t-value, are you referring to the t statistic or to the p-value associated with it? If those numbers really are t statistics, then your data provide no support whatsoever for the proposition that openness influences migration. If that is the case, I would check that you don't have any outliers. For example, the range of openness should be quite limited. Also, running this with only one explanatory variable may hurt your results: it leaves a pile of unexplained variation, which inflates the standard errors and so reduces statistical significance (and few will believe a regression without any control variables).
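
One way to run the outlier check, sketched under the assumption that the variables are named country, year and openness (hypothetical names), and using the 99th percentile as an arbitrary cutoff for "unusually high":

```stata
* Hypothetical variable names: country, year, openness

* Full distribution of openness, including percentiles and extreme values
summarize openness, detail

* List the observations above the 99th percentile (r(p99) is left behind by summarize, detail)
list country year openness if openness > r(p99) & !missing(openness)
```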

While there is controversy over using hypothesis tests, a t statistic of .009 means there is absolutely no reason to believe the variable matters. A t statistic with an associated p-value of .009 says the opposite: there is good reason to believe the variable matters.
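
To see how far apart the two readings are, Stata's ttail() and invttail() functions convert between a t statistic and its tail probability. The sketch below assumes 100 residual degrees of freedom purely for illustration; the actual number is in the regression output.

```stata
* Assumption: 100 residual degrees of freedom, chosen only for illustration

* Reading 1: 0.009 is the t statistic. The two-sided p-value is then close to 1.
display 2 * ttail(100, 0.009)

* Reading 2: 0.009 is the two-sided p-value. The implied |t| is then well above 2.
display invttail(100, 0.009 / 2)
```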

There is a distinction between statistical significance, which concerns how likely it is to find specific betas and standard errors if the true association is zero, and practical significance, which is how much the predicted dv changes with a change in the iv.

Assuming you mean p-values not t-values, you should use the margins command after the estimation to see how much a change in openness changes predicted migration.
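
A rough sketch of what that could look like, assuming a numeric country identifier country, a year variable year, the index policy and the trade ratio openness (all hypothetical names), and an illustrative grid of openness values that should be adapted to the range in your data:

```stata
* Hypothetical variable names: country (numeric id), year, policy, openness
xtset country year

* Fixed-effects model with year dummies and standard errors clustered by country
xtreg policy openness i.year, fe vce(cluster country)

* Linear prediction of the index at selected openness values (illustrative grid)
margins, at(openness = (0.2(0.2)1))

* Optional: plot those predictions with their confidence intervals
marginsplot
```

Note that after -xtreg, fe- the default prediction is the linear part of the model, without the country-specific effects, so what is informative is the change in the prediction across openness values rather than its level.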
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#4

03 May 2018, 10:46

Tash:
welcome to this forum.
As an aside to previous helpful comments:
- if you have (as it seems) panel data, -regress- rarely outperforms -xtreg-. That said, if you want to (or should) go with -regress- on panel data, you should use cluster-robust standard errors, as your observations are not independent (within the same panel);
- if you (as you in all likelihood should) switch to -xtreg-, please note that the -fe- estimator focuses on the within variation, i.e. the variation within the same panel across the time span you chose, whereas it tells you basically nothing about variation between different panels (which is the job of the -re- specification);
- no problem if you have gaps in your panels, as Stata can handle both balanced and unbalanced panel datasets.
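
The three points above can be sketched as follows, assuming a numeric country identifier country, a year variable year, the index policy and the trade ratio openness (all hypothetical names; a string country name would first need -encode-):

```stata
* Hypothetical variable names: country (numeric id), year, policy, openness
xtset country year

* Show the participation pattern of the panels, including gaps
xtdescribe

* Pooled OLS with standard errors clustered on the panel identifier
regress policy openness, vce(cluster country)

* Fixed effects: uses only the within-country variation over time
xtreg policy openness, fe vce(cluster country)

* Random effects: also draws on variation between countries
xtreg policy openness, re vce(cluster country)
```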

Kind regards,
Carlo
(Stata 19.0)
Tash Steve

Join Date: May 2018

Posts: 2
#5

08 May 2018, 09:35

Thank you very much for your helpful information!
As I mentioned above, the index values lie between 0 (open) and 1 (restrictive). Do you have any idea how I could reverse them (0 = restrictive, 1 = open)?
Thanks a lot!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#6

08 May 2018, 09:45

Tash:

Code:

recode index (0 = 1) (1 = 0)
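
If the index also takes values strictly between 0 and 1, the -recode- above only swaps the exact values 0 and 1 and leaves everything in between unchanged. A sketch that reverses the whole scale instead, assuming the variable is called index as above and using the hypothetical new name index_open:

```stata
* Reverse a 0-1 scale: 0 (open) becomes 1, 1 (restrictive) becomes 0,
* and intermediate values are mirrored around 0.5
generate index_open = 1 - index
label variable index_open "Migration policy index (0 = restrictive, 1 = open)"

* Sanity check: the reversed index should still lie in [0, 1]
assert inrange(index_open, 0, 1) if !missing(index_open)
```

Creating a new variable rather than overwriting index keeps the original coding available for checking.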

Kind regards,
Carlo
(Stata 19.0)