How to use standard deviation to interpret results?

Chen Huang

Join Date: Jan 2016

Posts: 33
#1

05 Jun 2016, 12:39

Hello everyone,

Recently I noticed that many papers use standard deviations to interpret their results. For example, in one paper, a table uses firms' leverage as the dependent variable, and for the main explanatory variable, state corruption, the coefficient is 0.172 (significant at the 10% level), the standard error is 0.098, and the sample size is 110,094. The paper then says that "a one standard deviation increase in state corruption implies an increase in leverage equal to 12.29% of mean leverage."

I really don't understand how to use the SD to interpret the results like that. I think this may be a silly question, but I would appreciate it if anyone can help me.

Chen
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

05 Jun 2016, 13:10

This usually arises in a context where the explanatory variable is entered into a regression model after it is standardized to a mean of zero and a standard deviation of 1. In that case, a 1 standard deviation increase in the explanatory variable is the same thing as a unit increase in the standardized version used in regression, and the effect on the outcome variable being reported is just the marginal effect or elasticity of that standardized explanatory variable.

When the explanatory variable has no natural metric or scale this may be an appropriate way to present results. Unfortunately, it is sometimes also seen in conjunction with variables which have obvious natural metrics such as age, or even with dichotomous variables. In that situation the effect, if not the intent, is merely obfuscatory. After all, who knows how much of an increase in age corresponds to 1 standard deviation in the study sample, or what the standard deviation of a dichotomous variable in some data sample is?
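
To make the standardization concrete, here is a minimal sketch. It assumes Stata's shipped auto data as a stand-in for the paper's data, and the names z_length, b_raw, b_std and sd_len are hypothetical. It is meant only to show that a unit change in the standardized regressor is the same thing as a one-SD change in the original one.

```stata
* Assumption: the shipped auto data stand in for the paper's data
sysuse auto, clear

* Standardize length to mean 0 and SD 1 (z_length is a hypothetical name)
egen double z_length = std(length)

* Raw-scale regression: coefficient is the change in price per unit of length
quietly regress price length mpg i.foreign
scalar b_raw = _b[length]

* Same model with the standardized regressor: coefficient is the change
* in price per one-SD increase in length
quietly regress price z_length mpg i.foreign
scalar b_std = _b[z_length]

* The two coefficients differ only by the factor SD(length)
quietly summarize length if e(sample)
scalar sd_len = r(sd)
display "b_raw * SD(length) = " %10.3f b_raw * sd_len
display "b_std              = " %10.3f b_std
```

The two displayed numbers should agree, because standardizing a regressor only rescales its coefficient by the sample SD.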
Chen Huang

Join Date: Jan 2016

Posts: 33
#3

05 Jun 2016, 15:01

Thank you very much for your reply. But taking the example in my question, how did the author calculate the marginal effect or elasticity of that standardized explanatory variable, e.g. the 12.29%?
And how can we standardize a dependent variable and enter it into a regression? Many thanks.

Chen

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

05 Jun 2016, 15:30

Maybe you wish to take a look at the ado-file listcoef, from the user-written SPost13 package.

Below is an example of how to use it:

Code:

. use "C:\Program Files (x86)\Stata14\ado\base\a\auto.dta", clear
(1978 Automobile Data)

. tabstat mpg length foreign, statistics( sd )

   stats |       mpg    length   foreign
---------+------------------------------
      sd |  5.785503  22.26634  .4601885
----------------------------------------

. regress price c.length c.mpg i.foreign

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(3, 70)        =     12.14
       Model |   217367689         3  72455896.3   Prob > F        =    0.0000
    Residual |   417697707        70   5967110.1   R-squared       =    0.3423
-------------+----------------------------------   Adj R-squared   =    0.3141
       Total |   635065396        73  8699525.97   Root MSE        =    2442.8

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      length |   59.61193   23.90525     2.49   0.015     11.93442    107.2894
         mpg |  -139.0814   82.20966    -1.69   0.095    -303.0434    24.88062
             |
     foreign |
    Foreign  |   2644.771   761.8912     3.47   0.001     1125.227    4164.315
       _cons |  -2861.984     6026.6    -0.47   0.636    -14881.66     9157.69
------------------------------------------------------------------------------

. listcoef, help

regress (N=74): Unstandardized and standardized estimates 

  Observed SD:  2.9e+03
  SD of error:  2.4e+03

-------------------------------------------------------------------------------
             |         b        t    P>|t|    bStdX    bStdY   bStdXY     SDofX
-------------+-----------------------------------------------------------------
      length |   59.6119    2.494    0.015  1327.340    0.020    0.450    22.266
         mpg | -139.0814   -1.692    0.095  -804.656   -0.047   -0.273     5.786
             |
     foreign |
    Foreign  | 2644.7712    3.471    0.001  1217.093    0.897    0.413     0.460
    constant | -2.86e+03   -0.475    0.636        .        .        .         .
-------------------------------------------------------------------------------
       b = raw coefficient
       t = t-score for test of b=0
   P>|t| = p-value for t-test
   bStdX = x-standardized coefficient
   bStdY = y-standardized coefficient
  bStdXY = fully standardized coefficient
   SDofX = standard deviation of X
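
The standardized columns above can be reproduced by hand, which may help to see what a statement like "x% of mean leverage" is doing. The sketch below is only an illustration under assumptions: it reruns the same regression on the shipped auto data, and the scalar names sd_x, sd_y and mean_y are hypothetical. The last line expresses the one-SD effect as a percentage of the mean of the outcome, which is how the sentence quoted in #1 appears to be constructed.

```stata
sysuse auto, clear
quietly regress price c.length c.mpg i.foreign

* SD and mean of the outcome, and SD of the regressor, in the estimation sample
quietly summarize price if e(sample)
scalar sd_y   = r(sd)
scalar mean_y = r(mean)
quietly summarize length if e(sample)
scalar sd_x   = r(sd)

* Columns of the listcoef table for length
display "bStdX  = " %10.3f _b[length] * sd_x
display "bStdY  = " %10.3f _b[length] / sd_y
display "bStdXY = " %10.3f _b[length] * sd_x / sd_y

* One-SD effect as a percentage of the mean outcome
display "bStdX as % of mean(price) = " %6.2f 100 * _b[length] * sd_x / mean_y
```

The first three lines should match the length row of the listcoef output (1327.340, 0.020 and 0.450).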

Best,

Marcos


Chen Huang

Join Date: Jan 2016

Posts: 33
#5

05 Jun 2016, 15:37

Thank you very much for this. Actually, I noticed that finance papers on corruption all use the standard deviation to interpret their results. This is interesting; I think I need to look into the issue carefully.

Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#6

06 Jun 2016, 11:35

A slightly more primitive way to do this is to think of a one standard deviation change in x as simply a number. So you estimate the standard deviation of x in the estimation sample using the summarize command. Then you use margins to generate the predicted y for two values of x that are one standard deviation apart.

So, if the sd is 2, and everything is linear, you want margins to give you the predicted y for x=0 and x=2. The difference is the change in y for a one sd change in x.
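
A minimal sketch of these steps, assuming the shipped auto data and the regression from #4 as a stand-in. The local macro names are hypothetical, and the two values of x are taken as the sample mean and the mean plus one SD rather than 0 and the SD, simply to stay inside the range of the data.

```stata
sysuse auto, clear
quietly regress price c.length c.mpg i.foreign

* SD (and mean) of x in the estimation sample
quietly summarize length if e(sample)
local lo = r(mean)
local hi = r(mean) + r(sd)

* Predicted price at two values of length that are one SD apart;
* post stores the two predictions as 1._at and 2._at
margins, at(length=(`lo' `hi')) post

* Their difference is the change in price for a one-SD change in length
lincom _b[2._at] - _b[1._at]
```

Because the model is linear in length, the difference reported by lincom should equal the coefficient times the SD whatever starting value is used, and it comes with a standard error and confidence interval.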

Phil
Aaditya Dar

Join Date: Sep 2014

Posts: 114
#7

07 Jun 2016, 07:40

The "fully standardized coefficients" are also known as beta coefficients (in case you want to read more about this in an econometrics textbook).
Jan Gehling

Join Date: Apr 2018

Posts: 2
#8

30 May 2018, 10:58

Hello all,

I found this thread since I have the same problem as Chen Huang. I looked into the command listcoef, and it suits my needs perfectly, since I'm interested in bStdX.

My understanding of bStdX: these are the regression coefficients with the x-variables (the independent variables) measured in standard deviations and the y-variable (the dependent variable) in its original units.

However, I'm using a user-written regression command called xtfmb (Fama-MacBeth two-step panel regression), and it doesn't work with listcoef.

Do you have any idea how I could still get these results? Maybe you even have a code example.

Many thanks already,
Jan
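
One possible way to get bStdX without listcoef is to multiply each coefficient by the SD of its regressor in the estimation sample. The sketch below uses regress on the shipped auto data purely as a stand-in; it is an assumption that xtfmb posts its coefficients in e(b) and marks the estimation sample in e(sample), as most estimation commands do, so this is worth checking with ereturn list. The scalar names bstdx_* are hypothetical.

```stata
* Stand-in estimation: replace this with the xtfmb call
* (assumed to post e(b) and e(sample))
sysuse auto, clear
quietly regress price length mpg

* x-standardized coefficient for each regressor:
* b * SD(x), with SD(x) computed in the estimation sample
foreach v of varlist length mpg {
    quietly summarize `v' if e(sample)
    scalar bstdx_`v' = _b[`v'] * r(sd)
    display "`v'" _col(12) "b = " %10.4f _b[`v'] _col(36) "bStdX = " %10.3f bstdx_`v'
}
```

An alternative that avoids relying on stored results is to standardize the regressors beforehand (for example with egen and std()) and rerun the estimation on the standardized variables.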
Ayub UOM

Join Date: Feb 2018

Posts: 83
#9

21 Aug 2019, 03:37

Hello all,
I read in a paper that the coefficient estimate of the independent variable (IV) (coefficient −0.00821, t-statistic (9.04)***) is negative and significant at the 1% level in the regression on the dependent variable (DV). Specifically, a one-unit increase in the IV reduces the DV by 0.00821, which represents 29% of the average DV.
In this case the standard deviation (SD) of the IV is 0.325 and the SD of the DV is 0.033.

Q1: How did they calculate this 29%? Manually or through Stata?

Normally, for economic significance, we use this formula: (coefficient of the independent variable * standard deviation of the independent variable) / standard deviation of the dependent variable,
as in the example below from Li et al. (2017), Trust and Stock Price Crash Risk: Evidence from China:

This negative relationship between crash risk and social trust is both statistically and economically significant. For example, the coefficient of TRUST1t (column 1) is -0.0193, which means that a one-standard-deviation increase in the social trust of a firm location is associated with a decrease of 1.94% (=0.0193*0.6866/0.6843) of a standard deviation in future crash risk as measured by NCSKEW, ceteris paribus. (The standard deviation of TRUST1t is 0.6866 and the standard deviation of NCSKEW is 0.6843.)

Q2: I want to calculate the economic significance or predictive margin for my study.
Many Thanks
Ayub
Ayub UOM

Join Date: Feb 2018

Posts: 83
#10

19 Jun 2020, 09:17

Hello Statalist members,
I would like to ask my previous question (#9) again, but this time I can add some more values. The mean value of the dependent variable is 0.028 and the coefficient of the IV is -0.00821, and they get 29% of the average DV. So can I divide the coefficient of the IV by the mean value of the DV to get 29% (-0.00821/0.028 ≈ -0.29), which represents 29% of the average DV?
Q4. Another paper calculates it in a different way, so could you please suggest some relevant links on this formula: (coefficient estimate on CEO power * one standard deviation change in CEO power) / (average board diversity for the sample) = (-0.0436*0.586)/13.1 = 1.95%. (A one standard deviation increase in CEO power (SD = 0.586) is associated with a decrease in board diversity of 1.95%. A decrease of 1.95% in board heterogeneity is equivalent to replacing one domestic director with a director of foreign nationality.) But when I looked for the value 0.586, I could not find it in the descriptive statistics table, so maybe there is some other code to calculate it.
Looking forward to your kind suggestions.

Last edited by Ayub UOM; 19 Jun 2020, 09:23.
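
The three calculations mentioned in #9 and #10 are plain arithmetic, so they can be checked with display. The sketch below uses only the numbers as typed in those posts (with the CEO power coefficient taken as -0.0436); the scalar names are hypothetical.

```stata
* #9/#10: coefficient for a one-unit change, relative to the mean of the DV
scalar r_unit = -0.00821 / 0.028
display "coef / mean(DV)         = " %8.5f r_unit

* Li et al. (2017): b * SD(x) / SD(y), the effect in SDs of the DV
scalar r_li = -0.0193 * 0.6866 / 0.6843
display "b * SD(x) / SD(y)       = " %8.5f r_li

* #10, Q4: b * SD(x) / mean(y), the one-SD effect relative to the mean of the DV
scalar r_ceo = -0.0436 * 0.586 / 13.1
display "b * SD(x) / mean(y)     = " %8.5f r_ceo
```

The first ratio is about -0.293, which matches the reported 29% and suggests it is the coefficient divided by the mean DV, with no SD involved. The second is about -0.0194, matching the 1.94%. The third comes out at about -0.00195, that is 0.195% rather than 1.95%, so one of the values quoted in #10 may be scaled differently in the paper or mistyped here; that is a guess and would need to be checked against the paper itself.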