

  • Using a Numerical Variable and Dummies for Each of Its Categories Simultaneously


    My data are arranged as a panel. I have observations on the characteristics of each child ever born to each woman, and instead of time, the panel is indexed by the birth order of each child (that is, whether the child is the first born, the second born, and so on).

    I want to study how the birth of a male child affects the time interval until the following birth. However, birth intervals are affected differently by the number of children in general and by the number of sons previously born. In addition, the effect of birth order is non-linear. I therefore want to include controls for the number of children already born and for the number of sons previously born, along with binary variables that identify each birth order.

    However, the variable counting the number of children already born and the birth order carry the same information, and I wonder whether it would be wrong to include both simultaneously.

    Thank you in advance

  • #2
    However, the variable counting the number of children already born and the birth order carry the same information, and I wonder whether it would be wrong to include both simultaneously.
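    Well, you can include both variables if you like, but because they are the same (the number of children already born is just the birth order minus one), they are perfectly collinear, and Stata will omit one of them from any analysis you attempt with the data. I wouldn't waste the keystrokes if I were you.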

    Referring to the title of your post, which poses a different question, it is possible to include both a numeric variable and indicators ("dummies") for each of its categories (except two reference categories) in the same analysis. It is a rather odd model, and interpreting it is complicated, but there are rare situations where it is appropriate. However, what you describe does not sound to me like one of them. The main issue here is the non-linearity. Given that the number of sons or children previously born will be limited to a relatively small number of levels, the simplest and most direct way to deal with the non-linearity, in the absence of a theoretical basis for choosing some specific functional model such as quadratic or logarithmic or exponential, etc., is to use only the indicator variables, and not include the original variable as a term in the model.
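
    As a rough illustration of both points, here is a minimal sketch on simulated data. Everything in it is an assumption made for the example rather than something taken from the original post: the variable names (mother_id, birth_order, male, n_sons_prev, n_children_prev, interval), the simulated numbers, and the choice to cluster standard errors on the mother. The first regression shows Stata dropping one of the two collinear variables; the second uses only indicators, through factor-variable notation, with no linear birth-order term.

```stata
* Toy data: 100 hypothetical mothers with 5 births each (all names are made up)
clear
set seed 12345
set obs 500
generate mother_id = ceil(_n/5)
bysort mother_id: generate birth_order = _n
generate byte male = runiform() < 0.5

* Number of sons born before the current child (running sum minus the current child)
bysort mother_id (birth_order): generate n_sons_prev = sum(male) - male

* Number of children already born: just birth order shifted by one
generate n_children_prev = birth_order - 1

* Simulated interval to the next birth, in months (arbitrary numbers)
generate interval = 30 - 2*male + rnormal(0, 6)

* Both versions of the same variable: Stata keeps one and omits the other
regress interval birth_order n_children_prev

* Indicators only; clustering on the mother is an assumption of this sketch
regress interval i.male i.birth_order i.n_sons_prev, vce(cluster mother_id)

* Joint test of the birth-order indicators
testparm i.birth_order
```

    With real data you would replace the simulated variables with your own and drop observations that have no following birth, since the interval is undefined for them.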

