factor variables may not contain non integer values

Shiping Tang

Join Date: Nov 2015

Posts: 8
#1

factor variables may not contain non integer values

17 Nov 2015, 01:12

I know somebody has asked this question before, but I found the answers unhelpful. It seems that Stata automatically treats some variables as factor variables, but I do not want those variables to be treated as "factor variables". How can I change the status of such a variable so that it is just a normal numeric variable? Thanks

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17707

17 Nov 2015, 01:21

Shiping:
you may try something along the lines of the following example:

Code:

. use auto.dta
(1978 Automobile Data)

. regress price foreign


. regress price i.rep78

      Source |       SS       df       MS              Number of obs =      69
-------------+------------------------------           F(  4,    64) =    0.24
       Model |  8360542.63     4  2090135.66           Prob > F      =  0.9174
    Residual |   568436416    64     8881819           R-squared     =  0.0145
-------------+------------------------------           Adj R-squared = -0.0471
       Total |   576796959    68  8482308.22           Root MSE      =  2980.2

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       rep78 |
          2  |   1403.125   2356.085     0.60   0.554    -3303.696    6109.946
          3  |   1864.733   2176.458     0.86   0.395    -2483.242    6212.708
          4  |       1507   2221.338     0.68   0.500    -2930.633    5944.633
          5  |     1348.5   2290.927     0.59   0.558    -3228.153    5925.153
             |
       _cons |     4564.5   2107.347     2.17   0.034     354.5913    8774.409
------------------------------------------------------------------------------

. regress price c.rep78

      Source |       SS       df       MS              Number of obs =      69
-------------+------------------------------           F(  1,    67) =    0.00
       Model |  24770.7652     1  24770.7652           Prob > F      =  0.9574
    Residual |   576772188    67  8608540.12           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0149
       Total |   576796959    68  8482308.22           Root MSE      =    2934

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       rep78 |   19.28012   359.4221     0.05   0.957    -698.1295    736.6897
       _cons |   6080.379    1274.06     4.77   0.000     3537.345    8623.413
------------------------------------------------------------------------------

However, how meaningful the above is depends on which values your variable takes.

Kind regards,
Carlo
(Stata 19.0)


Shiping Tang

Join Date: Nov 2015

Posts: 8
#3

17 Nov 2015, 09:33

It worked! (I am using a different dataset, but I see the logic). Thanks a lot, Carlo. But is there a way to unmake a "factor variable" (into a regular numeric variable)?
Shiping Tang

Join Date: Nov 2015

Posts: 8
#4

17 Nov 2015, 09:43

Hi, Carlo. It worked for the regression with interaction terms, but it won't work with margins. Here is what I have:

. reg cpi c.elf##c.polright, r

Linear regression                               Number of obs     =         86
                                                F(3, 82)          =      37.95
                                                Prob > F          =     0.0000
                                                R-squared         =     0.4456
                                                Root MSE          =      1.808

----------------------------------------------------------------------------------
                 |               Robust
             cpi |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
             elf |   3.073329   1.328065     2.31   0.023     .4313854    5.715273
        polright |   1.006441   .1797779     5.60   0.000     .6488049    1.364076
                 |
c.elf#c.polright |   -.590187   .2873289    -2.05   0.043    -1.161776   -.0185983
                 |
           _cons |   2.105206   .5455919     3.86   0.000      1.01985    3.190562
----------------------------------------------------------------------------------

. margins c.elf#c.polright
only factor variables and their interactions are allowed
r(198);

. margins elf#polright
elf: factor variables may not contain noninteger values
r(452);

Did I still do something wrong?
daniel klein

Join Date: Mar 2014

Posts: 3850
#5

17 Nov 2015, 09:51

margins will not let you enter continuous predictors in the marginlist, as this would mean you want the predictive margin at each level of a continuous predictor, which hardly makes sense. Chances are you want something along the lines of

Code:

margins , dydx(elf polright)

or perhaps

Code:

summarize elf
local min = r(min)
local max = r(max)
local mean = r(mean)
margins polright , at(elf = (`min' `mean' `max'))
marginsplot

although in a linear model, you can see the marginal effect by just looking at the coefficients.

Also note there is no such thing as a marginal effect for the interaction term (i.e. the product term) itself; see this post by Vince Wiggins, but that might be another story.
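
For readers following along, here is a minimal sketch of the dydx() route on a shipped dataset. It assumes auto.dta, with weight and mpg standing in for polright and elf; those stand-ins are not from this thread. With c.x##c.z in the model, the marginal effect of x is its own coefficient plus the product-term coefficient times z. In the output posted above, that would be roughly 1.006 - 0.590*elf for polright.

```stata
* Assumption: auto.dta stands in for the poster's data;
* weight plays the role of polright and mpg the role of elf.
sysuse auto, clear

* Both predictors are declared continuous with c.
regress price c.weight##c.mpg, vce(robust)

* Store the range and mean of the moderator
summarize mpg
local min  = r(min)
local mean = r(mean)
local max  = r(max)

* Marginal effect of weight at low, average, and high mpg
margins, dydx(weight) at(mpg = (`min' `mean' `max'))
marginsplot
```

Here the continuous variable goes in dydx() and at() rather than in the marginlist, so neither variable has to be a factor variable.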

Best
Daniel
Shiping Tang

Join Date: Nov 2015

Posts: 8
#6

17 Nov 2015, 10:54

That works! Thanks Dan.
Richard Williams

Join Date: Apr 2014

Posts: 4987
#7

17 Nov 2015, 11:33

To be clear, if you do

reg y x1 x2

both x1 and x2 are treated as continuous variables. But if you do

reg y x1 x2 x1#x2

then in the interaction they will be treated as categorical variables. Hence, when specifying interactions with continuous variables, specify

reg y x1 x2 c.x1#c.x2

If I ruled the world the default would always be continuous and you would specify i. as needed. As it is, Stata is basically using different defaults for interaction terms (assume vars are categorical unless specified otherwise) and non-interactions (assume continuous unless specified otherwise).

If you want to be super-safe you can always use c. and i., even when you are replicating the default behavior. Doing so forces you to always be clear in your own mind whether a variable is categorical or continuous.
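
A small sketch of the two defaults described above, again assuming auto.dta (the variable choices are illustrative and not from this thread). The first regress leaves the interaction unprefixed, so Stata treats both variables as categorical; because gear_ratio has noninteger values, this should fail with an error of the kind in the thread title. capture noisily displays the error without stopping a do-file.

```stata
* Assumption: auto.dta and these variables are illustrative only.
sysuse auto, clear

* Unprefixed interaction: both variables are treated as categorical.
* gear_ratio is noninteger, so this is expected to fail.
capture noisily regress price weight gear_ratio weight#gear_ratio

* Explicit c. prefixes: a continuous-by-continuous interaction
regress price c.weight##c.gear_ratio

* Being explicit everywhere: c. for continuous, i. for categorical
regress price c.weight i.rep78
```

Nothing is stored on the variable itself in any of these commands. The c. or i. prefix only tells that one command how to treat the variable, so there is no factor status to undo.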

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Shiping Tang

Join Date: Nov 2015

Posts: 8
#8

17 Nov 2015, 15:47

Thanks so much, Richard. Yes, I am also reading the earlier post you guys had.