
  • Insignificant results: What can I do?

    Dear Statalistas,

    In my regression, growth per capita as a percentage of GDP in 2014 is the dependent variable. I have quite a few control variables (including log GDP of the previous year, fertility rate, tertiary education, life expectancy, urbanization rate, inflation, population aged <15, population aged +15, the ratio of foreign investment to GDP, and the ratio of government spending to GDP), but my results are not significant. What can I do? I do not have the option of including more observations.

    Best
    Leo

    Code:
    . reg Growth Log_GDP_pC_2013 Inflation Foreign_Investments_ofGDP GovernmentSpending_ofGDP LifeExpectancy Education_Tertiary Urbanizationrate Population65 FertilityRate

          Source |       SS           df       MS      Number of obs   =        41
    -------------+----------------------------------   F(9, 31)        =      3.06
           Model |  76.4974593         9  8.4997177   Prob > F        =    0.0097
        Residual |  86.1361336        31  2.77858496   R-squared       =    0.4704
    -------------+----------------------------------   Adj R-squared   =    0.3166
           Total |  162.633593        40  4.06583982   Root MSE        =    1.6669

    -------------------------------------------------------------------------------------------
                       Growth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------------------------+----------------------------------------------------------------
              Log_GDP_pC_2013 |   .0394968   1.673602     0.02   0.981    -3.373837    3.452831
                    Inflation |  -.1872666    .129307    -1.45   0.158    -.4509899    .0764567
    Foreign_Investments_ofGDP |   .0412949   .0406959     1.01   0.318     -.041705    .1242949
     GovernmentSpending_ofGDP |   -.122471   .0886844    -1.38   0.177     -.303344    .0584019
               LifeExpectancy |   .0001113   .0912691     0.00   0.999    -.1860332    .1862558
           Education_Tertiary |    -.02221      .0276    -0.80   0.427    -.0785007    .0340807
             Urbanizationrate |   -.048897   .0289234    -1.69   0.101    -.1078867    .0100926
                 Population65 |  -.0565843   .0557908    -1.01   0.318    -.1703704    .0572018
                FertilityRate |  -.1037417   .7621697    -0.14   0.893    -1.658197    1.450714
                        _cons |   9.167969    6.21429     1.48   0.150     -3.50616     21.8421
    -------------------------------------------------------------------------------------------



  • #2
    Much as I dislike focusing on statistical significance at all, you have overlooked the line in the output header that says Prob > F = 0.0097. So your overall model is "significant" and in fact accounts for 47% of the variance, which is quite hefty. You are disappointed, I imagine, because none of your individual predictors turns out to be "significant" in its own right. This kind of situation can arise because of high correlations among your predictor variables. So they are all competing with each other to explain outcome variance, and they are all sufficiently good competitors that the variance is being spread pretty evenly among them, and none stands out as "significant."

    So take a look at the correlation matrix for your predictors, or do it graphically with -graph matrix-. Then eliminate some of the redundant variables. I'm not an economist, so I'm going out on a limb here, but, for example, I would imagine that your government spending variable and the log GDP from 2013 are largely redundant with each other. I would also expect that tertiary education and urbanization are pretty strongly correlated, and that both of those are rather negatively correlated with fertility rate. But let the data guide you in picking out a model with fewer highly correlated predictors and you will likely find some "significant" results.
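
    For example, a sketch using the variable names from #1 (-estat vif-, run after the -regress- command, is another quick check, since it reports variance inflation factors):

    Code:
    . correlate Log_GDP_pC_2013 Inflation Foreign_Investments_ofGDP GovernmentSpending_ofGDP LifeExpectancy Education_Tertiary Urbanizationrate Population65 FertilityRate
    . graph matrix Log_GDP_pC_2013 Inflation Foreign_Investments_ofGDP GovernmentSpending_ofGDP LifeExpectancy Education_Tertiary Urbanizationrate Population65 FertilityRate
    . estat vif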

    That said, I hope you realize that trying models out until you finally find a "significant" result is not science: it's just mining Type I errors, and if done knowingly, some consider it scientific misconduct. You really should go into these analyses with a pre-specified hypothesis. And if it doesn't pan out, then it doesn't pan out.



    • #3
      Thank you for the answer.
      I do know that I cannot keep altering variables in order to find a significant result. However, all of these are meant as control variables, because what I actually want to measure is the effect of religion, which is not yet included. That is why I first need a general model that explains growth well, so that I can then address my actual research question.



      • #4
        Understood. Good luck. If the variables shown in #1 are just there to be adjusted for, then it really doesn't matter whether they are "significant" or not. The correlations among them in no way diminish the adjustment (control). If those variables are not the actual effects of interest in your research, then you should just ignore their p-values anyway.



        • #5
          Leo,

          Beyond the issue of multicollinearity, which, as Clyde advised, is not a problem for control variables, it appears you have a problem with statistical power.

          In your example, you have only 41 cases with 9 control variables plus the religion variable you intend to add as an independent variable.

          You can run a power analysis to determine the sample size you would need by using Stata's -power rsquared- command as shown below. This assumes that:
          • the effect of adding religion to the model will be to increase R-squared by at least 10% beyond the effects of the controls
          • you have 9 control variables plus your added religion independent variable
          • you want to have 80% power, and
          • your Type I error criterion is 5%.
          Code:
          . power rsquared 0.0, ncontrol(9) ntested(1) diff(.10)
          
          Performing iteration ...
          
          Estimated sample size for multiple linear regression
          F test for R2 testing subset of coefficients
          Ho: R2_F = R2_R  versus  Ha: R2_F != R2_R
          
          Study parameters:
          
                  alpha =    0.0500
                  power =    0.8000
                  delta =    0.1111
                   R2_R =    0.0000
                   R2_F =    0.1000
                R2_diff =    0.1000
               ncontrol =         9
                ntested =         1
          
          Estimated sample size:
          
                      N =        73

          If you cannot collect additional data, then you can calculate the power of the analysis with n = 41 as follows:
          Code:
          . power rsquared 0.0, n(41) ncontrol(9) ntested(1) diff(.10)
          
          Estimated power for multiple linear regression
          F test for R2 testing subset of coefficients
          Ho: R2_F = R2_R  versus  Ha: R2_F != R2_R
          
          Study parameters:
          
                  alpha =    0.0500
                      N =        41
                  delta =    0.1111
                   R2_R =    0.0000
                   R2_F =    0.1000
                R2_diff =    0.1000
               ncontrol =         9
                ntested =         1
          
          Estimated power:
          
                  power =    0.5422

          If you expect the effect of religion to be a change in R-squared of 5% or less, then the estimated power to find the effect at a statistically significant level would drop to less than 30%.
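
          For reference, that lower-bound scenario is the same command with diff(.05) (a sketch; output omitted here):

          Code:
          . power rsquared 0.0, n(41) ncontrol(9) ntested(1) diff(.05)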

          Hope that helps as you work on your model.

          Good luck.

          Red Owl




