Best way to introduce categorical control variables

Ujjwal

Join Date: Jul 2014

Posts: 56
#1

02 Aug 2014, 09:23

Hi, I am trying to estimate the effect of financial distress on subjective well-being. I have already built an index of financial distress from eight ordinal variables using factor analysis, and I have rescaled both life satisfaction and the financial distress index to a 0-1 scale. Now I want to introduce some categorical control variables before running the fixed-effects panel data model. The control variables are, for example: sex (male, female); job status (emp, selfemp, unemp, retired, fulltimestudent, others); marital status (unmarried, married, livingascouple, widow, divorced); and education (higherdegree, A level, O level, others). My question is: what would be the best way to introduce these categorical variables in a fixed-effects panel data model? I have a 12-year unbalanced panel with 110,000 observations in total. Any suggestion is highly appreciated.
Richard Williams

Join Date: Apr 2014

Posts: 4945
#2

02 Aug 2014, 09:35

Note that gender and any other time-invariant variables will drop out in a fixed effects model (although you could still have interactions with gender). Other than that, I am not sure what else to tell you, other than to use factor variable notation, e.g.

Code:

xtreg y i.empstat i.marstat i.educ, fe
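To illustrate the point about time-invariant variables, here is a minimal sketch. It assumes Stata's shipped example panel nlswork (loaded with webuse, which needs internet access) as a stand-in for the poster's data: ln_wage plays the role of the outcome, race the role of a time-invariant variable such as gender, and tenure and msp are time-varying covariates. None of these names come from the original question.

```stata
* Assumption: nlswork is only a stand-in for the poster's panel
webuse nlswork, clear
xtset idcode year

* race is constant within person, so its main effect is omitted under fe
xtreg ln_wage i.race i.msp tenure, fe

* the interaction of race with time-varying tenure is still identified;
* the race main effects are again omitted, the race#tenure terms are kept
xtreg ln_wage i.race##c.tenure i.msp, fe
```

In the first model Stata should flag the race levels as omitted, while the second reports separate tenure slopes by race.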

If you are talking about sequencing of models, I suppose you could have one model with the control variables followed by the model with the explanatory variables, or vice versa. The main thing to be careful of is that missing data doesn't change the cases analyzed as you add more variables.
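One common way to keep the analyzed cases fixed across a sequence of models is to fit the fullest model first and reuse its estimation sample. The sketch below again assumes the nlswork example data as a stand-in, and insample is a hypothetical variable name; union is used because it has many missing values.

```stata
* Assumption: nlswork is only a stand-in for the poster's panel
webuse nlswork, clear
xtset idcode year

* fit the fullest model first; e(sample) marks the rows it actually used
xtreg ln_wage tenure i.msp i.union, fe
generate byte insample = e(sample)

* refit the smaller model on exactly the same rows
xtreg ln_wage tenure i.msp if insample, fe
```

Comparing the reported number of observations across the two fits is a quick check that the sample did not change.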

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Ujjwal

Join Date: Jul 2014

Posts: 56
#3

02 Aug 2014, 11:24

Thanks, Richard. Will i.empstat give an intercept dummy or an interaction dummy? I generated intercept dummies using tab, gen(e), which gave e1 e2 e3. Should I use interaction dummies?
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#4

02 Aug 2014, 12:47

i.empstat will give you an "intercept" dummy. In modern Stata there are many advantages to using this factor-variable approach to categorical variables, rather than using -tab, gen()-. So I would get rid of e1 e2 and e3, and use i.empstat for that.
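A small sketch of the comparison, assuming the nlswork example data as a stand-in (ind_code plays the role of empstat, and the stub indd is a hypothetical name): hand-made dummies and i. notation give the same fit, but the factor-variable form also lets you change the base category without creating new variables.

```stata
* Assumption: nlswork is only a stand-in for the poster's panel
webuse nlswork, clear
xtset idcode year

* hand-made "intercept" dummies, dropping the first as the base category
tabulate ind_code, generate(indd)
drop indd1
xtreg ln_wage indd* tenure, fe

* same model with factor-variable notation (lowest level is the base)
xtreg ln_wage i.ind_code tenure, fe

* switching the base category is a one-token change
xtreg ln_wage ib(last).ind_code tenure, fe
```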

As for whether you should use interaction dummies, that depends on your theoretical model of the data generating process. If you expect there to be interactions between empstat and other variables, then include interactions. If you think the effect of empstat will be the same, independent of the values of other variables, then don't include interactions. That's a question of the science of your field that requires expertise in that domain, not just statistical expertise, to answer.
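If theory does suggest an interaction, factor-variable notation expresses it directly. This is only a sketch on the nlswork example data (an assumed stand-in, with ind_code in place of empstat and tenure in place of a continuous regressor); whether such an interaction belongs in the model is the substantive question discussed above.

```stata
* Assumption: nlswork is only a stand-in for the poster's panel
webuse nlswork, clear
xtset idcode year

* ## includes both main effects and the category-by-tenure interaction
xtreg ln_wage i.ind_code##c.tenure i.msp, fe

* joint test that all interaction terms are zero
testparm i.ind_code#c.tenure
```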
Richard Williams

Join Date: Apr 2014

Posts: 4945
#5

02 Aug 2014, 13:00

I agree with Clyde: factor variables are almost always the way to go. For more on them, from within Stata type

Code:

help fvvarlist

Besides saving you the trouble of computing the dummy variables yourself, factor variables can be very useful with post-estimation commands. If interested, see

http://www3.nd.edu/~rwilliam/stats3/Margins01.pdf

Disclaimer: I link to my own handouts, not because they are so spectacular, but because I can find their URLs in 2 seconds. There are lots of other sources on the web and elsewhere that you may find more helpful.
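As a rough illustration of the post-estimation point, the sketch below runs margins after a fixed-effects model with factor variables. It assumes the nlswork example data as a stand-in, and the tenure values 0, 5 and 10 are arbitrary illustration points. After xtreg, fe the default margins prediction is the linear prediction, which does not include the individual fixed effect.

```stata
* Assumption: nlswork is only a stand-in for the poster's panel
webuse nlswork, clear
xtset idcode year

xtreg ln_wage i.union##c.tenure i.msp, fe

* adjusted predictions for each union category at selected tenure values
margins union, at(tenure=(0 5 10))

* union vs. non-union contrast at the same tenure values
margins, dydx(union) at(tenure=(0 5 10))
```

Because union entered the model as i.union, margins treats it as categorical and reports a discrete difference; with hand-made dummies it would have no way to know the variables belong together.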

