Interpretation of margins - comparing continuous and categorical version of the same variable

Patrick Dickson

Join Date: Jun 2017
Posts: 12

25 Jun 2018, 07:35

I am struggling with the interpretation of margins when comparing a model that uses a continuous covariate with the same model after that covariate has been turned into a categorical variable. An example is below.

Code:

sysuse auto, clear

// Create a categorical price variable
// (the numlist 3000(1500)16000 ends at 15000, so prices of 15000 or more become missing)
egen price_categorical = cut(price), at(3000(1500)16000) label

// Analysis models

// First, estimate a model with price as a continuous covariate, and calculate margins at representative values of price
glm mpg price turn weight, link(log) family(gamma)
margins, at(price=(3000(1500)16000))
marginsplot

// Second, estimate a model with price as a categorical variable, and calculate margins at each category of price
glm mpg i.price_categorical turn weight, link(log) family(gamma)
margins price_categorical
marginsplot

The margins output of each command is

Code:

First model

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   21.88968   .6042113    36.23   0.000     20.70545    23.07392
          2  |   21.56811   .4389286    49.14   0.000     20.70783     22.4284
          3  |   21.25127   .3765895    56.43   0.000     20.51317    21.98937
          4  |   20.93908    .449832    46.55   0.000     20.05742    21.82073
          5  |   20.63147   .6038433    34.17   0.000     19.44796    21.81498
          6  |   20.32838   .7871422    25.83   0.000     18.78561    21.87115
          7  |   20.02975   .9791415    20.46   0.000     18.11067    21.94883
          8  |    19.7355   1.172169    16.84   0.000     17.43809    22.03291
          9  |   19.44558   1.363024    14.27   0.000      16.7741    22.11705
------------------------------------------------------------------------------

Second model


Predictive margins                              Number of obs     =         73
Model VCE    : OIM

Expression   : Predicted mean mpg, predict()

-----------------------------------------------------------------------------------
                  |            Delta-method
                  |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
price_categorical |
           3000-  |   22.08453   .6492414    34.02   0.000     20.81204    23.35702
           4500-  |   21.53456   .5972071    36.06   0.000     20.36405    22.70506
           6000-  |    20.0408    .954368    21.00   0.000     18.17027    21.91133
           7500-  |   22.60106   1.844822    12.25   0.000     18.98528    26.21685
           9000-  |   19.95937   1.415486    14.10   0.000     17.18507    22.73367
          10500-  |   19.34198   1.655145    11.69   0.000     16.09796    22.58601
          12000-  |   16.25751   1.644919     9.88   0.000     13.03353    19.48149
          13500-  |   18.72548   1.919951     9.75   0.000     14.96245    22.48852
-----------------------------------------------------------------------------------

I understand that -margins price_categorical- compares hypothetical populations in which every individual is assigned each value of this variable in turn, rather than their actual value, while keeping their observed values on all other covariates, and that the predicted mean reflects this difference. How does this differ from the -margins, at(price=(3000(1500)16000))- formulation? A naive view might expect Model 1 (continuous covariate, with margins calculated at representative values) to produce the same predicted means as the margins of the categorical variable in Model 2, but no doubt there is an error in that interpretation...
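
For reference, here is a minimal sketch of what -margins, at()- computes in the continuous model: set price to the chosen value for every car, predict the mean, and average over the estimation sample. The variable name mu_at_3000 is made up for this illustration.

```stata
sysuse auto, clear
glm mpg price turn weight, link(log) family(gamma)

* Predictive margin reported by margins at price = 3000
margins, at(price=3000)

* The same quantity by hand: set price to 3000 for every car,
* predict the mean, and average over the estimation sample
preserve
replace price = 3000
predict double mu_at_3000, mu
summarize mu_at_3000 if e(sample)
restore
```

The mean reported by -summarize- should agree with the margin at price = 3000. -margins price_categorical- applies the same logic to the second model, assigning every car to each price band in turn, so the two commands differ in the fitted model they average over, not in the averaging itself.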


Richard Williams

Join Date: Apr 2014

Posts: 4942
#2

25 Jun 2018, 07:43

If you look at the glm estimates, you should see that they are not the same, i.e. treating price as a continuous variable will produce different coefficients than when you treat it as categorical. Therefore the marginal effects will not be the same either.
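
A minimal sketch of that comparison, assuming the variables from post #1; the stored-estimate names continuous and categorical are chosen only for this illustration.

```stata
sysuse auto, clear
egen price_categorical = cut(price), at(3000(1500)16000) label

* Model 1: one slope for price on the log scale
glm mpg price turn weight, link(log) family(gamma)
estimates store continuous

* Model 2: a separate coefficient for each price band
glm mpg i.price_categorical turn weight, link(log) family(gamma)
estimates store categorical

* Coefficients and standard errors side by side
estimates table continuous categorical, b(%9.4f) se
```

With a log link, the single price coefficient in the first model forces the predicted mean to change by the same proportion for every 1500 step in price, which is why the margins in the first table fall smoothly (each is about 0.985 times the one before). The second model estimates each band freely, so its margins can move up as well as down, as in the 7500- band.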

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Patrick Dickson

Join Date: Jun 2017

Posts: 12
#3

25 Jun 2018, 07:54

Originally posted by Richard Williams

If you look at the glm estimates, you should see that they are not the same, i.e. treating price as a continuous variable will produce different coefficients than when you treat it as categorical. Therefore the marginal effects will not be the same either.

Hi Richard, thank you, yes of course, these are different models.

I am still struggling with the interpretation of the -at()- option for representative values. Say price was the main covariate of interest in my model, e.g. I wanted to see the marginal effect of price on my outcome, controlling for other variables. Does it ever make sense to use -at()- for the main covariate of interest, rather than, for example, measuring the marginal effect of price at different values of other covariates, e.g. at each value of a binary covariate such as foreign (which I realise is not in my model but serves to illustrate the example)?
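
To make the two alternatives concrete, here is a rough sketch. It assumes, purely for illustration, that foreign is added to the model as a factor variable, which is not the model fitted in post #1.

```stata
sysuse auto, clear

* Assumption for illustration: foreign is added as a factor variable
glm mpg price turn weight i.foreign, link(log) family(gamma)

* Predicted mean mpg at representative values of price
margins, at(price=(3000(1500)15000))

* Average marginal effect of price (derivative of the predicted mean)
margins, dydx(price)

* Marginal effect of price with every car treated as domestic, then as foreign
margins foreign, dydx(price)
```

The first -margins- call reports predicted means (adjusted predictions) at each price, while the -dydx(price)- calls report the slope of the predicted mean with respect to price, so the two forms answer different questions about the same fitted model.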