
  • In regression analysis, is the unit of the predictor rounded in any way?

    In regression analysis, I'm familiar with the saying that the coefficient reported is an effect on y of a one unit increase in x. But at what level is that unit?

    You will see below that I am regressing a person's self-report of "good" health (binary_health_y) on the percentage of people unemployed in their region (psum_unemployed_total_cont_y) in a linear probability model.

    The common way to describe these results seems to be:

    Here an additional unit of regional unemployment decreases the probability of reporting good health by 1 percentage point.

    However, regional unemployment is recorded to two decimal places (three to four digits), i.e.:

    Code:
    . sum psum_unemployed_total_cont_y
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
    psum_unemp~y |      3,198    10.72768    5.007287       5.41      26.15

    So my question is: is this "additional unit of regional unemployment" an increase of a full percentage point, say from 5% of people in the region being unemployed to 6%? Or is it an increase from, for example, 5.41% to 5.42%?

    I realise this may be a simple question, but I'm struggling with it so I appreciate any feedback,

    Kindest regards,

    John

    Code:
    . xtreg binary_health_y psum_unemployed_total_cont_y i.yrlycurrent_county_y1 i.year age_y i.maritalstatus_y if has_y0_questionna
    > ire==1 & has_y5_questionnaire==1 | has_y0_questionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire==1 & has_y5_quest
    > ionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==0 | has_y0_questionnair
    > e==1 & cbmi_y10 !=. & has_y10_questionnaire==0 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==0 & cbmi_y10 !=
    > . & has_y10_questionnaire==0 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 | has_y0_questionnaire==1 & cbm
    > i_y10 !=. & has_y10_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 & cbmi_y10 !=. & has_y1
    > 0_questionnaire==1, cluster (current_county_y1) fe robust 
    note: 6.yrlycurrent_county_y1 omitted because of collinearity
    note: 10.yrlycurrent_county_y1 omitted because of collinearity
    note: 34.yrlycurrent_county_y1 omitted because of collinearity
    note: 37.yrlycurrent_county_y1 omitted because of collinearity
    note: 44.yrlycurrent_county_y1 omitted because of collinearity
    note: 45.yrlycurrent_county_y1 omitted because of collinearity
    note: 48.yrlycurrent_county_y1 omitted because of collinearity
    note: 10.year omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs      =      1581
    Group variable: id                              Number of groups   =       641
    
    R-sq:  within  = 0.0502                         Obs per group: min =         1
           between = 0.0015                                        avg =       2.5
           overall = 0.0000                                        max =         3
    
                                                    F(18,28)           =         .
    corr(u_i, Xb)  = -0.6875                        Prob > F           =         .
    
                                         (Std. Err. adjusted for 29 clusters in current_county_y1)
    ----------------------------------------------------------------------------------------------
                                 |               Robust
                 binary_health_y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -----------------------------+----------------------------------------------------------------
    psum_unemployed_total_cont_y |  -.0102855   .0048773    -2.11   0.044    -.0202762   -.0002947
                                 |
           yrlycurrent_county_y1 |
                          Cavan  |   -.018267   .2279823    -0.08   0.937    -.4852675    .4487335
                          Clare  |  -.5252115   .2027821    -2.59   0.015    -.9405917   -.1098313
                           Cork  |   .3906244   .2787621     1.40   0.172    -.1803939    .9616427
                        Donegal  |          0  (omitted)
                         Dublin  |   .1718237   .2250034     0.76   0.451    -.2890749    .6327223
                      Dublin 10  |  -.3292192   .3856107    -0.85   0.400    -1.119107    .4606685
                      Dublin 11  |          0  (omitted)
                      Dublin 12  |  -.3186343   .4037992    -0.79   0.437    -1.145779    .5085107
                      Dublin 14  |  -.1434118   .2760698    -0.52   0.608    -.7089152    .4220916
                      Dublin 15  |  -.5067857   .4255742    -1.19   0.244    -1.378535    .3649634
                      Dublin 16  |   .6643994   .4025214     1.65   0.110    -.1601284    1.488927
                      Dublin 18  |  -.5712014   .3942685    -1.45   0.159    -1.378824     .236421
                      Dublin 22  |   .1140665   .2243428     0.51   0.615     -.345479    .5736119
                      Dublin 24  |   .0648586   .2443927     0.27   0.793    -.4357571    .5654742
                       Dublin 4  |   .7202541   .3574293     2.02   0.054    -.0119065    1.452415
                       Dublin 6  |  -.1533908   .2734907    -0.56   0.579    -.7136112    .4068296
                      Dublin 6W  |   .4818072   .1730873     2.78   0.010      .127254    .8363604
                       Dublin 7  |     .41868   .2419016     1.73   0.094    -.0768329     .914193
                       Dublin 8  |   .0001566   .2054587     0.00   0.999    -.4207065    .4210197
                    Dublin City  |  -.1254077   .2432794    -0.52   0.610    -.6237429    .3729275
              Dún Laoghaire-Rathdown  |  -.0052935   .2089167    -0.03   0.980    -.4332399    .4226529
                         Fingal  |  -.6266997   .3963005    -1.58   0.125    -1.438484     .185085
                         Galway  |  -.1784478   .2249192    -0.79   0.434     -.639174    .2822783
                    Galway City  |  -.0723957   .2240071    -0.32   0.749    -.5312535    .3864621
                          Kerry  |  -.5674608   .3000446    -1.89   0.069    -1.182074    .0471527
                        Kildare  |   .5304143   .1807288     2.93   0.007     .1602081    .9006205
                       Kilkenny  |          0  (omitted)
                          Laois  |  -.0744377   .0605301    -1.23   0.229    -.1984279    .0495525
                        Leitrim  |   .5543939   .1868227     2.97   0.006      .171705    .9370828
                       Limerick  |          0  (omitted)
                       Longford  |   .1264136   .2223292     0.57   0.574     -.329007    .5818343
                          Louth  |   .2030122   .2291028     0.89   0.383    -.2662835     .672308
                           Mayo  |   .7999382    .224004     3.57   0.001     .3410868     1.25879
                          Meath  |  -.1253544   .3558635    -0.35   0.727    -.8543077    .6035989
                       Monaghan  |  -.2798706   .4098558    -0.68   0.500    -1.119422     .559681
                         Offaly  |   .0461998   .2170761     0.21   0.833    -.3984605      .49086
                      Roscommon  |          0  (omitted)
                          Sligo  |          0  (omitted)
                   South Dublin  |   .7029383   .3213452     2.19   0.037     .0446926    1.361184
                      Tipperary  |  -.0259743   .0451327    -0.58   0.570    -.1184245    .0664759
                Tipperary North  |          0  (omitted)
                      Waterford  |   -.639648    .222072    -2.88   0.008    -1.094542   -.1847542
                      Westmeath  |   .0048053   .2286901     0.02   0.983    -.4636452    .4732558
                        Wexford  |   .1496822   .2144573     0.70   0.491    -.2896137     .588978
                        Wicklow  |   .3035418   .2539436     1.20   0.242    -.2166381    .8237218
                                 |
                            year |
                              5  |  -.0829668    .032689    -2.54   0.017    -.1499272   -.0160065
                             10  |          0  (omitted)
                                 |
                           age_y |   .0071394   .0045045     1.58   0.124    -.0020876    .0163664
                                 |
                 maritalstatus_y |
                     Cohabiting  |    .048518   .0302435     1.60   0.120     -.013433     .110469
                      Separated  |  -.0904769   .2361229    -0.38   0.704    -.5741527     .393199
                       Divorced  |  -.1477058   .0902324    -1.64   0.113    -.3325386    .0371269
                        Widowed  |   .0203674   .3064814     0.07   0.947    -.6074313    .6481661
           Single/Never married  |   .0082783   .0634848     0.13   0.897    -.1217644     .138321
                                 |
                           _cons |   .5915398    .181058     3.27   0.003     .2206592    .9624204
    -----------------------------+----------------------------------------------------------------
                         sigma_u |  .48800391
                         sigma_e |  .35656453
                             rho |  .65194878   (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------------------

  • #2
    The answer is your first guess: the "additional unit of regional unemployment" is an increase of a full percentage point, say from 5% of people in the region being unemployed to 6%. But you need to be careful about how the percentage data enter the model. Some people enter 12% as 12, others as the proportion 0.12. If it is entered as 0.12, then a 1 percentage point change is a change of 0.01 in the variable, so you multiply the coefficient estimate by 0.01, not 1, to get the expected change in the dependent variable. If it is entered as 12, you multiply the coefficient by 1.
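
    A minimal sketch of this rescaling, using simulated data rather than the poster's dataset. The names unemp_pct, unemp_prop and y, the seed, and all numeric values are made up for illustration:

    ```stata
    * Illustrative simulation only: variable names and values are hypothetical
    clear
    set seed 12345
    set obs 1000
    * Unemployment coded in percentage points (12 means 12%)
    generate unemp_pct = 5 + 21 * runiform()
    * The same information coded as a proportion (0.12 means 12%)
    generate unemp_prop = unemp_pct / 100
    * Outcome built so that one percentage point lowers y by 0.01
    generate y = 0.6 - 0.01 * unemp_pct + rnormal(0, 0.1)

    * Slope here is the change in y per 1 percentage point
    regress y unemp_pct
    * Slope here is 100 times larger: the change in y per 100 percentage points
    regress y unemp_prop
    * Multiply by 0.01 to recover the change per 1 percentage point
    display _b[unemp_prop] * 0.01
    ```

    The final display line should reproduce the slope from the first regression, since the two codings differ only by a factor of 100.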
    Last edited by Oscar Ozfidan; 09 Feb 2021, 07:45.



    • #3
      Hi Oscar,

      Thank you very much for your response.

      If I am understanding you correctly, Stata would consider an X variable that increases from 5.41% to 6.41%, but ignore an increase from 5.41% to 5.90% or indeed from 5.41% to 6%? Could you give me a sense of why it takes this approach? I assume there may be some people who wish to examine smaller increases in x on y?

      To your point, with the percentage unemployment variable being recorded as "5.41 up to 26.15" should I then multiply the coefficient that is reported (-.0102855) by 100?

      Thank you for your feedback.

      Kindest regards,

      John



      • #4
        No, no! Stata does not round anything. It takes every change into account, however small. If your unemployment rate is recorded as 5.41, representing 5.41%, and your coefficient is -.0102855, then a one percentage point increase in unemployment is estimated to change your dependent variable by 1*-.0102855, i.e. a reduction of 0.0102855. A one unit change is usually quoted for ease of interpretation, but you can scale it if you want to. So, if you want to know what the change in the dependent variable would be if unemployment goes up by 15 percentage points (from 5.41% to 20.41%), the estimate is 15*-.0102855.
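
        One way to get such scaled effects together with standard errors and confidence intervals is lincom after estimation. This is a sketch on simulated data; unemp and y are hypothetical stand-ins for the poster's variables, and the multipliers are only examples:

        ```stata
        * Illustrative simulation only: variable names and values are hypothetical
        clear
        set seed 2021
        set obs 1000
        * Unemployment in percentage points, roughly spanning 5 to 26
        generate unemp = 5 + 21 * runiform()
        generate y = 0.6 - 0.01 * unemp + rnormal(0, 0.1)
        regress y unemp

        * Estimated change in y for a 15 percentage point rise in unemp
        lincom 15 * unemp
        * Estimated change in y for a half-point rise, e.g. from 5.41 to 5.91
        lincom 0.5 * unemp
        ```

        After the poster's own xtreg command, the same idea would apply with psum_unemployed_total_cont_y in place of unemp.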



        • #5
          Ah OK! So if x (unemployment) was coded as a percentage point, and when I studied it longitudinally between two waves it went from 5.41 to 5.42 then Stata would report the coefficient of this effect on y?

          I got confused and thought you were saying that Stata needed a change that was of a full unit or the data just got dropped, i.e. that it needed a change in x of 1.00 and that a change of 0.01 would be too small for the regression to run.



          • #6
            Originally posted by John Adler
            Ah OK! So if x (unemployment) was coded as a percentage point, and when I studied it longitudinally between two waves it went from 5.41 to 5.42 then Stata would report the coefficient of this effect on y?

            I got confused and thought you were saying that Stata needed a change that was of a full unit or the data just got dropped, i.e. that it needed a change in x of 1.00 and that a change of 0.01 would be too small for the regression to run.
            The coefficients that Stata (or any software) reports are on the raw scales of your variables: units of the dependent variable per one unit of the predictor. It looks like your unemployment is inherently coded in percentage points. You seem to have a mean of 10.73 percentage points unemployment, a smallest value of 5.41 percentage points, and a standard deviation of 5.01 percentage points. The beta on unemployment means that a full percentage point change in unemployment is associated with a change of -0.010 units of whatever the dependent variable is.

            If the mean unemployment were the same and it was coded in the same units, but the standard deviation of unemployment were 0.05 percentage points, the beta would still be on the same scale as above.
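
            A small simulation can illustrate this point. It is only a sketch: x_wide, x_narrow, the seed, and the noise levels are assumptions chosen for illustration, not values from the thread's data.

            ```stata
            * Illustrative simulation only: variable names and values are hypothetical
            clear
            set seed 42
            set obs 5000
            * Two predictors with the same mean but very different spreads
            generate x_wide   = rnormal(10.7, 5)
            generate x_narrow = rnormal(10.7, 0.05)
            * Both outcomes are built with the same slope of -0.01 per unit of x
            generate y_wide   = 0.6 - 0.01 * x_wide   + rnormal(0, 0.001)
            generate y_narrow = 0.6 - 0.01 * x_narrow + rnormal(0, 0.001)

            * Each slope is still expressed per one full unit of x
            regress y_wide x_wide
            regress y_narrow x_narrow
            ```

            Both regressions should report a slope near -0.01, even though x_narrow hardly ever moves by a full unit. The narrow spread affects how precisely the slope is estimated, not the units in which it is expressed.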
            Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

            When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



            • #7
              Originally posted by Weiwen Ng
              The beta on unemployment means that a full percentage point change in unemployment is associated with a change of -0.010 units of whatever the dependent variable is.

              Thank you so much for your input. Can I clarify: does the coefficient state that if unemployment rose by 0.01 percentage points the dependent variable changes by -0.010 units, because unemployment is coded to several digits, e.g. 5.41?


              Kindest regards,

              John



              • #8
                Originally posted by John Adler


                Thank you so much for your input. Can I clarify: does the coefficient state that if unemployment rose by 0.01 percentage points the dependent variable changes by -0.010 units, because unemployment is coded to several digits, e.g. 5.41?


                Kindest regards,

                John
                A one unit change in unemployment is 1 percentage point the way it's coded right now. To get the change in the DV associated with a 0.01 percentage point change in unemployment, you need to multiply the beta by 0.01, so that's a very small change in the DV (but a 0.01 percentage point change in unemployment is also very small).
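
                The arithmetic can be checked directly in Stata. This is a sketch that simply copies the coefficient from the output in post #1 into a scalar; the scalar name b is arbitrary:

                ```stata
                * Coefficient on unemployment, copied from the xtreg output in post #1
                scalar b = -.0102855
                * Change in the DV for a 0.01 percentage point rise in unemployment
                display b * 0.01
                * Change in the DV for a full 1 percentage point rise
                display b * 1
                * Scaling the 0.01-point effect back up by 100 returns the original coefficient
                display (b * 0.01) * 100
                ```

                The per-unit coefficient and the per-0.01 effect carry the same information; they differ only by the factor of 100.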


                • #9
                  Thank you Weiwen, but I'm confused, what if my x variable never increases across the waves of the data by a full percentage point? What if it only ever increases by 0.01, then how is the effect of x on y a percentage point increase? Shouldn't the coefficient reported then be a basis point (0.01) increase of x on y? And if I wanted to scale up to a full percentage point increase I would multiply this coefficient by 100?

                  Thanks again for your time,

                  John
