  • Running csdid with repeated cross-section data - Issue of control variables

    Hi everyone,

    I am looking at the impact of a policy on the income of individuals. The policy was implemented at the state level in a staggered way, and my data is cross-sectional survey data. So, every year, I observe different individuals in the state. I would like to run csdid but I am having trouble understanding how to deal with time-constant control variables. Controls like educational level are really important, but the help file says - "be careful of controlling for characteristics that are either time constant (e.g., sex or race), or for pretreatment characteristics." I am not sure whether I can add these or not then.



  • #2
    You can add them, but you need to be aware of the assumptions required and the implications if those fail.
    In repeated cross sections (RC), every variable is effectively time-varying. So the next assumption is that they are stationary.



    • #3
      Let me just add a couple of comments. I think it's useful to distinguish between variables that are dated prior to the intervention and those that may actually change, for an individual, due to the intervention. In a panel data set, these are relatively easy to separate. If I'm looking at adults well into their work history, and a variable is highest grade completed by age 25, then I can assume this is not affected by any intervention. But if I'm studying a job training program, I don't necessarily want to control for current marital status. In repeated cross sections, the issue is often compositional effects. For example, if a state expands Medicaid, it might get an influx of less healthy people from other areas. Then, controlling for pre-intervention health can create issues. But, it can also help address compositional effects. That's why it's tricky.

      Fernando is correct that, mechanically, all covariates in an RC are effectively time-varying. But it is useful to break them into truly time-varying characteristics and those that may change for an individual due to the intervention. We usually want to avoid the latter, whether it is panel data or RC. As Fernando implies, decisions on variables dated pre-intervention are generally tricky. If the population is stable -- stationary -- then it becomes easier to justify.
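One rough way to look at the stationarity point before adding a control such as education is to compare its distribution across survey years within each state. The sketch below is only a descriptive check under stated assumptions, not a formal test and not part of csdid: the variable names (`state`, `year`, `educ`), the toy data, and the helper functions are all hypothetical.

```python
from collections import Counter, defaultdict


def covariate_shares(rows, group_key, time_key, covariate):
    """Return {(group, time): {level: share}} for a categorical covariate."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[(row[group_key], row[time_key])][row[covariate]] += 1
    shares = {}
    for cell, counter in counts.items():
        total = sum(counter.values())
        shares[cell] = {level: n / total for level, n in counter.items()}
    return shares


def max_share_drift(shares):
    """For each group, the largest absolute change in any level's share
    between the earliest and the latest period observed for that group."""
    by_group = defaultdict(dict)
    for (group, time), dist in shares.items():
        by_group[group][time] = dist
    drift = {}
    for group, by_time in by_group.items():
        first = by_time[min(by_time)]
        last = by_time[max(by_time)]
        levels = set(first) | set(last)
        drift[group] = max(
            abs(last.get(level, 0.0) - first.get(level, 0.0)) for level in levels
        )
    return drift


# Hypothetical toy survey: different individuals are sampled in each year.
rows = (
    [{"state": "A", "year": 2010, "educ": "college"}] * 4
    + [{"state": "A", "year": 2010, "educ": "hs"}] * 6
    + [{"state": "A", "year": 2012, "educ": "college"}] * 4
    + [{"state": "A", "year": 2012, "educ": "hs"}] * 6
    + [{"state": "B", "year": 2010, "educ": "college"}] * 5
    + [{"state": "B", "year": 2010, "educ": "hs"}] * 5
    + [{"state": "B", "year": 2012, "educ": "college"}] * 2
    + [{"state": "B", "year": 2012, "educ": "hs"}] * 8
)

drift = max_share_drift(covariate_shares(rows, "state", "year", "educ"))
for state, value in sorted(drift.items()):
    print(state, round(value, 3))
```

In this toy data the education mix in state A does not move, while in state B it shifts noticeably between the two years. A shift like that in real data would be a reason to think harder about compositional change before treating the control as harmless.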



      • #4
        Thank you, Fernando and Professor Wooldridge. This is really helpful.

