  • Analyzing dataset where individuals are sampled at multiple random points in time

    I have a data frame with the variables Judge ID (uniquely identifies judges), Case ID (uniquely identifies court cases), Decision (records case outcome), and Comp Date (variable specifying case completion date). Below, I have provided a table to illustrate what this data might look like for a set of four judges between August 27 and August 30, 2009:
    Judge_ID Case_ID Decision Comp_Date
    XDF 1993 Conviction 27aug2009
    XDF 2047 Relief 27aug2009
    XDF 893 Conviction 30aug2009
    JCF 431 Conviction 27aug2009
    XYQ 4449 Conviction 28aug2009
    XYQ 8481 Conviction 28aug2009
    XYQ 2199 Relief 28aug2009
    TBX 7832 Relief 27aug2009

    Each observation in the dataset corresponds to a unique case. Some judges oversee more cases than others, and case completion dates are random across judges. Is this an unbalanced panel dataset? I have read that a panel is unbalanced when at least one panel unit (e.g., a judge) is not observed in every period. In this dataset, however, a judge may go many days without completing a case, and it is also common for a judge to complete more than one case on the same day. If this data frame is not unbalanced panel data, what type of data is it? Can I only analyze it as cross-sectional data?

  • #2
    It certainly is not panel data, given that the same judge can complete multiple cases on the same date. As for "only analyz[ing] it as cross-sectional data," let's not worry about the semantics* here and focus on substance. If you were planning an analysis that relies on lagged or forward observations of variables, then, no, you can't do that, because it is impossible to define lags and leads when judge and date combined do not uniquely identify observations in the data. Similarly, you can forget about an autoregressive error variance structure. But if you don't need to do any of those things, you can still -xtset Judge_ID- and use the usual panel-data estimators like -xtreg, fe- etc.

    *This is certainly not cross-sectional data either, because the same judges are observed repeatedly. Since the judges are observed repeatedly, but do not meet the full requirements of panel data, I would just refer to this as longitudinal data with repeated measures.
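
    To make the point about -xtset- concrete, here is a minimal sketch using the eight example rows from the original post. The variable names date, judge, relief, and n_day are my own and are assumptions for illustration only, as is the choice of a linear probability model with n_day (the number of cases the judge completed that day) as the regressor. It is not a recommended specification, just a demonstration that the commands run on data with this structure.

    ```stata
    clear
    input str3 Judge_ID long Case_ID str10 Decision str9 Comp_Date
    "XDF" 1993 "Conviction" "27aug2009"
    "XDF" 2047 "Relief"     "27aug2009"
    "XDF"  893 "Conviction" "30aug2009"
    "JCF"  431 "Conviction" "27aug2009"
    "XYQ" 4449 "Conviction" "28aug2009"
    "XYQ" 8481 "Conviction" "28aug2009"
    "XYQ" 2199 "Relief"     "28aug2009"
    "TBX" 7832 "Relief"     "27aug2009"
    end

    * Convert the string date into a Stata daily date (hypothetical name: date)
    gen date = date(Comp_Date, "DMY")
    format date %td

    * Judge and date together do not uniquely identify observations
    duplicates report Judge_ID date

    * -xtset- needs a numeric panel variable (hypothetical name: judge)
    encode Judge_ID, gen(judge)

    * Declaring a time variable fails: repeated time values within panel
    capture noisily xtset judge date

    * Declaring only the panel variable works
    xtset judge

    * Illustrative outcome and regressor (both are assumptions, not from the post)
    gen byte relief = (Decision == "Relief")
    bysort judge date: gen n_day = _N

    * Fixed-effects (within-judge) regression; no time variable is required
    xtreg relief n_day, fe
    ```

    Because no time variable is declared, time-series operators such as L. and F. remain unavailable after -xtset judge-, which is consistent with the restriction on lags and leads described above.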


    • #3
      This is certainly not cross-sectional data either, because the same judges are observed repeatedly. Since the judges are observed repeatedly, but do not meet the full requirements of panel data, I would just refer to this as longitudinal data with repeated measures.
      Thank you for the clarification, Clyde. I figured running fixed-effects regression would be valid given the structure of my dataset, but it is nice to have that certainty.
