
  • Interpretation of margins - comparing continuous and categorical version of the same variable

    I am struggling with the interpretation of margins when comparing a model that includes a covariate in continuous form with the same model in which that covariate has been recoded as a categorical variable. An example is below.

    Code:
    sysuse auto, clear
    
    
    //Create a categorical price variable 
    
    egen price_categorical=cut(price), at(3000(1500)16000) label
    
    //Analysis models
    
    *First, estimate a model with price as a continuous covariate, and calculate margins at representative values of price
    
    glm mpg price turn weight,link(log) family(gamma)
    
    margins, at(price=(3000(1500)16000))
    
    marginsplot
    
    *Second, estimate a model with price as a categorical variable, and calculate margins at these values of price
    
    glm mpg i.price_c turn weight ,link(log) family(gamma)
    
    margins price_c
    
    marginsplot
    The margins output of each command is:

    Code:
    First model
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   21.88968   .6042113    36.23   0.000     20.70545    23.07392
              2  |   21.56811   .4389286    49.14   0.000     20.70783     22.4284
              3  |   21.25127   .3765895    56.43   0.000     20.51317    21.98937
              4  |   20.93908    .449832    46.55   0.000     20.05742    21.82073
              5  |   20.63147   .6038433    34.17   0.000     19.44796    21.81498
              6  |   20.32838   .7871422    25.83   0.000     18.78561    21.87115
              7  |   20.02975   .9791415    20.46   0.000     18.11067    21.94883
              8  |    19.7355   1.172169    16.84   0.000     17.43809    22.03291
              9  |   19.44558   1.363024    14.27   0.000      16.7741    22.11705
    ------------------------------------------------------------------------------
    
    Second model
    
    
    Predictive margins                              Number of obs     =         73
    Model VCE    : OIM
    
    Expression   : Predicted mean mpg, predict()
    
    -----------------------------------------------------------------------------------
                      |            Delta-method
                      |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
    price_categorical |
               3000-  |   22.08453   .6492414    34.02   0.000     20.81204    23.35702
               4500-  |   21.53456   .5972071    36.06   0.000     20.36405    22.70506
               6000-  |    20.0408    .954368    21.00   0.000     18.17027    21.91133
               7500-  |   22.60106   1.844822    12.25   0.000     18.98528    26.21685
               9000-  |   19.95937   1.415486    14.10   0.000     17.18507    22.73367
              10500-  |   19.34198   1.655145    11.69   0.000     16.09796    22.58601
              12000-  |   16.25751   1.644919     9.88   0.000     13.03353    19.48149
              13500-  |   18.72548   1.919951     9.75   0.000     14.96245    22.48852
    -----------------------------------------------------------------------------------
    I understand that -margins price_categorical- compares hypothetical populations in which every observation is assigned each value of this variable in turn, while keeping its observed values on all other covariates, and that the predicted means reflect this difference. How does this differ from the -margins, at(price=(3000(1500)16000))- formulation? A naive view might expect Model 1 (continuous price, with margins calculated at representative values) to produce the same predicted means as Model 2 (margins over the categorical version of price), but no doubt there is an error in that interpretation...
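The "hypothetical population" reading of -margins, at()- described above can be checked by hand. The following is a minimal sketch, assuming the auto dataset shipped with Stata is loaded with -sysuse-. The value 6000 and the names m, mu_hat and by_hand are chosen only for illustration.

```stata
sysuse auto, clear
glm mpg price turn weight, link(log) family(gamma)

* Predictive margin with price set to 6000 for every car
margins, at(price=6000)
matrix m = r(b)

* The same quantity by hand: overwrite price, predict the mean, average it
preserve
replace price = 6000
predict double mu_hat, mu
summarize mu_hat
scalar by_hand = r(mean)
restore

display "margins: " m[1,1] "   by hand: " by_hand
```

Both numbers are averages of the model's predicted mean over the estimation sample, with only price changed. The categorical model applies the same averaging but uses its own coefficients, one per price band.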



  • #2
    If you look at the glm estimates, you should see that they are not the same, i.e. treating price as a continuous variable will produce different coefficients than when you treat it as categorical. Therefore the marginal effects will not be the same either.
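One way to put the two sets of estimates side by side is sketched below. It assumes the variables created in the first post; the stored-estimate names continuous and categorical are illustrative only.

```stata
sysuse auto, clear
egen price_categorical = cut(price), at(3000(1500)16000) label

* Model 1: a single slope for price on the log scale
glm mpg price turn weight, link(log) family(gamma)
estimates store continuous

* Model 2: a separate coefficient for each price band
glm mpg i.price_categorical turn weight, link(log) family(gamma)
estimates store categorical

* Coefficients, standard errors and sample sizes in one table
estimates table continuous categorical, se stats(N)

* Observations left out of the categorical model
count if missing(price_categorical)
```

The sample sizes also differ. -egen, cut()- sets the new variable to missing for values at or above the last breakpoint (15000 here), which is why the second margins output above reports 73 observations.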
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Originally posted by Richard Williams
      If you look at the glm estimates, you should see that they are not the same, i.e. treating price as a continuous variable will produce different coefficients than when you treat it as categorical. Therefore the marginal effects will not be the same either.
      Hi Richard, thank you, yes of course, these are different models.

      I am still struggling with the interpretation of the -at()- option for representative values. Say price was the main covariate of interest in my model, e.g. I wanted to see the marginal effect of price on my outcome, controlling for other variables. Does it ever make sense to use -at()- for the main covariate of interest itself, rather than, for example, measuring the marginal effect of price at different values of other covariates, such as a binary covariate like foreign (which I realise is not in my model but serves to illustrate the example)?
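The uses of -at()- contrasted here can be written out as follows. This is only a sketch: foreign is added to the model purely for illustration (it is not in the models above), and the price grid 3000(3000)15000 is an arbitrary choice.

```stata
sysuse auto, clear

* Assumption: foreign is included only to illustrate the question
glm mpg c.price i.foreign turn weight, link(log) family(gamma)

* (a) Predicted mean mpg at representative values of price itself
margins, at(price=(3000(3000)15000))

* (b) Average marginal effect of price at each value of another covariate
margins, dydx(price) at(foreign=(0 1))

* (c) Marginal effect of price evaluated at representative values of price
margins, dydx(price) at(price=(3000(3000)15000))
```

Form (a) reports predicted means, while (b) and (c) report slopes of the predicted mean with respect to price. Because of the log link, the slopes in (c) vary with the value of price at which they are evaluated.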

