
  • Scaling variable and its impact on regression coefficients

    Dear Stata forum
    I recently read that using deflators or scaling variables to create ratios can change regression coefficients. I will add more on this later; for now, here is a sample dataset.

    Code:
     *Example generated by -dataex-. For more info, type help dataex
    clear
    input int(totalassets liquidassets sales) byte risk
    15000 1050 1500  5
    14000 1200 1450  6
    15000  950 1400  8
    12050 1100 1300 12
    11000  550  855 18
     7515  688  900 20
     9650  793  950 18
     6500  434  800 16
     6981  569  750 16
     5956  493  850 13
     6699  870  950 11
    16000  956 1350  9
    11000 1215 1400  7
     8976 1910 3250  4
     7695  918 1250  2
    end
    Code:
    correl totalassets liquidassets sales risk
    (obs=15)
    
                 | totala~s liqui~ts    sales     risk
    -------------+------------------------------------
     totalassets |   1.0000
    liquidassets |   0.3712   1.0000
           sales |   0.2636   0.9329   1.0000
            risk |  -0.3961  -0.7016  -0.6417   1.0000
    Note that -risk- is the variable of interest. The correlations show that risk is negatively correlated with all three variables: strongly with liquidassets (-0.70) and sales (-0.64), and more weakly with totalassets (-0.40). They also show that liquidassets is more closely related to risk than totalassets is, which is unsurprising because totalassets includes assets other than liquidassets that are less responsive to risk.
    Next, I created two variables as follows:

    Code:
    gen liquid_total = liquidassets/totalassets   // scaling variable is totalassets, which is less responsive to risk than sales
    gen liquid_sales = liquidassets/sales         // scaling variable is sales, which is more responsive to risk than totalassets
    Then I ran the following simple regressions.

    Code:
    reg liquid_total risk
    
          Source |       SS           df       MS      Number of obs   =        15
    -------------+----------------------------------   F(1, 13)        =      3.23
           Model |  .004414562         1  .004414562   Prob > F        =    0.0954
        Residual |  .017753991        13  .001365692   R-squared       =    0.1991
    -------------+----------------------------------   Adj R-squared   =    0.1375
           Total |  .022168553        14  .001583468   Root MSE        =    .03696
    
    ------------------------------------------------------------------------------
    liquid_total | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            risk |  -.0031183   .0017344    -1.80   0.095    -.0068652    .0006287
           _cons |   .1274527   .0213314     5.97   0.000     .0813689    .1735364
    ------------------------------------------------------------------------------
    
    . reg liquid_sales risk
    
          Source |       SS           df       MS      Number of obs   =        15
    -------------+----------------------------------   F(1, 13)        =      0.01
           Model |  .000134106         1  .000134106   Prob > F        =    0.9228
        Residual |  .178746043        13  .013749696   R-squared       =    0.0007
    -------------+----------------------------------   Adj R-squared   =   -0.0761
           Total |  .178880149        14  .012777153   Root MSE        =    .11726
    
    ------------------------------------------------------------------------------
    liquid_sales | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            risk |  -.0005435   .0055032    -0.10   0.923    -.0124325    .0113455
           _cons |   .7386332   .0676847    10.91   0.000     .5924094     .884857
    ------------------------------------------------------------------------------
    
    I expected liquid_sales to be more responsive to risk than liquid_total, because in the former both the numerator and the denominator (sales) are highly correlated with risk. Why does this happen? Even though the denominator in the first regression is less correlated with risk, that regression shows more significance than the second, in which the denominator is highly correlated with risk. What is happening here?
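
    One way to look at this, as a rough sketch rather than a definitive answer, is to work in logs. Since ln(liquidassets/sales) = ln(liquidassets) - ln(sales), and OLS is linear in the dependent variable, the slope of the log ratio on risk equals the slope of the log numerator minus the slope of the log denominator. If numerator and denominator respond to risk in a similar way, the two slopes offset each other and the ratio looks unresponsive. The code assumes the dataex data above are in memory; the log specification and the names ln_liq, ln_ta, ln_sal, ln_liq_ta, ln_liq_sal, b_liq, b_ta and b_sal are illustrative choices, not part of the original analysis.

    Code:
    * logs of the components and of the two ratios (hypothetical variable names)
    gen double ln_liq     = ln(liquidassets)
    gen double ln_ta      = ln(totalassets)
    gen double ln_sal     = ln(sales)
    gen double ln_liq_ta  = ln(liquidassets/totalassets)
    gen double ln_liq_sal = ln(liquidassets/sales)

    * slope of each component on risk
    quietly regress ln_liq risk
    scalar b_liq = _b[risk]
    quietly regress ln_ta risk
    scalar b_ta = _b[risk]
    quietly regress ln_sal risk
    scalar b_sal = _b[risk]

    * slope of each log ratio minus (numerator slope - denominator slope)
    quietly regress ln_liq_ta risk
    display _b[risk] - (b_liq - b_ta)
    quietly regress ln_liq_sal risk
    display _b[risk] - (b_liq - b_sal)

    Both displayed differences should be zero up to rounding. A small slope for a ratio therefore does not mean that its components are unrelated to risk; it can mean that they move with risk at similar rates.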

  • #2
    Neelakanda:
    since in both regressions you cannot rule out that the coefficient on risk is 0, I think that your concern should be relaxed.
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      Thanks, Carlo Lazzaro, for the help. By 0, I take it you meant that the confidence intervals of both betas run from negative to positive and therefore include zero. Is that what you meant?

      I am sorry if I did not communicate my problem well. My question is more general, and I elaborate on it here.
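
      For reference, the interval in question can be rebuilt by hand from the stored results of -regress-. This is a minimal sketch assuming liquid_total from the first post is in memory; tcrit, ci_lo and ci_hi are hypothetical scalar names.

      Code:
      quietly regress liquid_total risk
      * two-sided 5% critical value of t with the residual degrees of freedom
      scalar tcrit = invttail(e(df_r), 0.025)
      scalar ci_lo = _b[risk] - tcrit*_se[risk]
      scalar ci_hi = _b[risk] + tcrit*_se[risk]
      display "lower = " ci_lo
      display "upper = " ci_hi

      The two values should match the 95% interval in the regression table of the first post. If the lower bound is negative and the upper bound is positive, a coefficient of zero cannot be ruled out at the 5% level.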

      Consider outcome variables that are proportions or ratios. For instance, let the dependent variable be the ratio Y/Z, where Y is income and Z is wealth (current income scaled by total wealth), and let the variable of interest be education E in years (or its log). Hence Y/Z is a function of E. The question is whether, when running a regression, it matters in practice that Z (wealth) is the scaling factor. It may matter because, in my example, education (E) can affect the deflating factor (wealth, Z): a good education can teach people where to invest and where to work, thereby increasing wealth (Z) over time. In that case, if we regress Y/Z on E, can we check whether the results are driven by the impact of E on Y through Z? In theory, Z is used as a deflator to address the concern that high-wealth households may have larger error terms and low-wealth households smaller ones (i.e., the errors are heteroscedastic). But here I suspect that E can affect Y through Z, and I wonder whether our results will be biased in that case.
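
      A possible check, sketched here with the sample data from the first post standing in for the example (liquidassets for Y, totalassets for Z, risk for E): in logs, regressing ln(Y/Z) on E is the same as regressing ln(Y) on E and ln(Z) with the coefficient on ln(Z) fixed at 1. Entering ln(Z) as a regressor frees that coefficient, so the restriction can be tested. The log form and the names log_liq and log_ta are assumptions made for illustration.

      Code:
      gen double log_liq = ln(liquidassets)
      gen double log_ta  = ln(totalassets)

      * unrestricted: the coefficient on log_ta is estimated
      regress log_liq risk log_ta
      test log_ta = 1

      * restricted: the coefficient on log_ta is fixed at 1, which is
      * equivalent to using ln(liquidassets/totalassets) as the dependent variable
      constraint 1 log_ta = 1
      cnsreg log_liq risk log_ta, constraints(1)

      If the test rejects a coefficient of 1, or the coefficient on risk differs noticeably between the two fits, the denominator is doing more than pure scaling. With only 15 observations any such test has little power, so this is a diagnostic rather than a verdict.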

      In finance, when regressing leverage (total debt divided by total assets) on risk to check whether leverage reacts to risk, Johnson (2018) argues that the relation simply reflects the denominator of leverage ratios (asset values) contracting with uncertainty, with the numerator adjusting slowly.
      Similarly, Bartlett and Partnoy (2020) argue that
      "We use the term “ratio problem” to describe two challenges – omitted variable and measurement error bias – that arise anytime a researcher uses linear regression with a ratio output. Intuitively, the central problem is a bias that arises when a right-side input of interest is correlated with either (i) the reciprocal of the scale factor, 1/n, or (ii) the scaled version of other right-side input variables. Such correlation can arise either because the input variable of interest is also scaled by n, or because it is not scaled, but is otherwise correlated with 1/n or the interaction of 1/n with other right-side input variables. Bias can alternatively arise when the denominator of the ratio is measured with error"
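
      The first condition in the quotation, correlation between the input of interest and 1/n, can be inspected directly in the sample data from the first post. This is only a rough diagnostic sketch, not the procedure of Bartlett and Partnoy; inv_total and inv_sales are hypothetical names, with totalassets and sales each playing the role of n.

      Code:
      * reciprocals of the two candidate scale factors
      gen double inv_total = 1/totalassets
      gen double inv_sales = 1/sales
      * correlation of the variable of interest with each reciprocal
      correl risk inv_total inv_sales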

      The above paper was discussed in a blog post (https://www.elsblog.org/the_empirica...holarship.html) that cited the Stata command -fracreg-, which helps deal with fractional variables. The article "Stata Tip 63: Modeling Proportions" is also somewhat related, I guess.
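
      As a minimal sketch of how -fracreg- could be applied to the two ratios from the first post (assuming liquid_total and liquid_sales are in memory, and Stata 14 or later): -fracreg- requires the dependent variable to lie between 0 and 1, which both ratios do in this sample. It models the conditional mean of a fraction and does not by itself settle whether the denominator drives the result.

      Code:
      * fractional logit; margins gives the average marginal effect of risk
      fracreg logit liquid_total risk
      margins, dydx(risk)

      fracreg logit liquid_sales risk
      margins, dydx(risk)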

      So, to wrap up: how can we know that the denominator of the dependent variable does only the job of scaling or deflating and does not contribute to the relation between the numerator and the independent variable of interest?

      References
      1. Johnson, Timothy C. (2018). Economic Uncertainty, Aggregate Debt, and the Real Effects of Corporate Finance.

      2. Bartlett, Robert, and Frank Partnoy (2020). The Ratio Problem.

      Last edited by Neelakanda Krishna; 29 Mar 2022, 08:26.



      • #4
        Neelakanda:
        sorry, but I cannot help any further with this issue.
        Kind regards,
        Carlo
        (StataNow 18.5)



        • #5
          Dear Carlo Lazzaro
          Thanks for the support, and my apologies for posing these questions, which may seem a little odd. However, if you have any thoughts on this, please let me know later.
