  • Measuring Spells - two special cases in panel data

    Hi Statalist.

    I have code to identify and measure the length of a spell. The spell relates to couples being in a relationship (either married or de facto), and the 'length' of the spell counts each wave (year) in which the same couple report being together. However, my code* fails when it encounters the following two infrequent, but recurring, issues.

    1) The first issue relates to a special case of missing data. I am sorting on the respondent ID, which is a problem when the respondent's data are missing. I would like to address this by replacing the respondent's missing response with their partner's response (assuming that is not also missing and assuming the same partner ID).

    2) The second issue also relates to sorting on the respondent ID, which results in the spell length increasing in each wave that the respondent answers the question, regardless of whether the respondent has changed partners. That is, ID1 and ID2 may have had a seven-year 'spell', then ID1 has another seven-year 'spell' with ID4, but 'length' counts this as one 14-year spell. I recognise that issues (1) and (2) could largely be addressed by including the partner ID (p_id), but I am not sure exactly how that would work. I included 'p_id' in my code below, but I am not sure whether I have written this correctly, whether I should have sorted by both id and p_id in each line of code, or whether there is a better way to address this.

    3) I would also like to know how I should treat notable gaps in responses by couples. For example, a couple enters the survey at wave 4, then (the same couple) returns to the survey in wave 12 and stays until wave 18 (the latest wave). I think it is best to exclude the couple's response in wave 4 and only count from wave 12 on. If so, how would I code that?
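
For question (3), one possible approach, sketched here under assumptions rather than offered as the settled answer, is to start a new run whenever consecutive waves for the same couple are more than some threshold apart, and then keep only the latest run. The threshold `maxgap` and the variables `newrun`, `run` and `lastrun` are hypothetical names, and the toy data are made up for illustration.

```stata
* Toy data (hypothetical): one couple seen in wave 4, then in waves 12 to 14
clear
input float(id p_id) byte wave
1 2  4
1 2 12
1 2 13
1 2 14
end

* maxgap is an assumed threshold: a larger jump in wave starts a new run
local maxgap 2

* Flag the first record of each run within a respondent-partner pair
bysort id p_id (wave): gen byte newrun = (_n == 1) | (wave - wave[_n-1] > `maxgap')

* Number the runs, then mark the records belonging to the latest run
by id p_id: gen run = sum(newrun)
by id p_id: gen byte lastrun = run == run[_N]

* Keep only the most recent run of each pair (here, waves 12 to 14)
keep if lastrun
```

What counts as a "notable" gap is a substantive choice. Note that the sample data already contain one-wave gaps inside what look like continuous relationships.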

    Find below my code and sample data. I appreciate any help with addressing these issues.
    Code:
    tsset id wave 
    bys id (wave): gen byte begin = inlist(marstat, 1, 2) & (marstat != marstat[_n-1])
    bys id (wave): gen byte spell = inlist(marstat, 1, 2) 
    bys id spell (wave): gen byte end = _n == _N  
    bys id spell (wave): gen seq = _n 
    bys id (p_id): egen length = count(seq)
    *Note: The above code was developed based on my understanding of Cox (2002, 2007, 2015, ...) and an earlier thread https://www.statalist.org/forums/for...-relationships.
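
A minimal sketch of how issues (1) and (2) might be handled together, assuming that marstat values 1 and 2 mean married and de facto (as the `inlist()` calls above imply) and that p_marstat is the report of the partner named in p_id on the same row. The variables `inrel`, `newspell`, `spellno`, `seq2` and `length2` are hypothetical names, and the toy data are made up for illustration. Unlike the code above, this sketch does not start a new spell when a couple moves from de facto to married.

```stata
* Toy data (hypothetical), same layout as the -dataex- example
clear
input float(id p_id marstat p_marstat) byte wave
1 2 1 1 1
1 2 . 1 2
1 2 1 1 3
1 3 2 2 4
1 3 2 2 5
end

* Issue 1: borrow the partner's report when the respondent's own is missing
replace marstat = p_marstat if missing(marstat) & inlist(p_marstat, 1, 2)

* Issue 2: a spell starts when the respondent is partnered and this is the
* first record, or the previous record was not partnered, or p_id changed
gen byte inrel = inlist(marstat, 1, 2)
bysort id (wave): gen byte newspell = inrel & (_n == 1 | !inrel[_n-1] | p_id != p_id[_n-1])

* Number spells within respondent; unpartnered records get missing
by id: gen spellno = sum(newspell) if inrel

* Position within spell and total spell length, counted in waves observed
bysort id spellno (wave): gen seq2 = _n if inrel
by id spellno: gen length2 = _N if inrel
```

In the toy data, respondent 1 gets a three-wave spell with partner 2 and a separate two-wave spell with partner 3, rather than one five-wave spell.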

    Code:
     * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id p_id marstat p_marstat) byte(wave begin end spell) float(seq length)
    1001  1009 . 2  1 0 0 0  1 10
    1001  1009 . 2  3 0 1 0  2 10
    1001  1009 1 1  5 1 0 1  1 10
    1001  1009 1 1  6 0 0 1  2 10
    1001  1009 1 1  8 0 0 1  3 10
    1001  1009 1 1  9 0 0 1  4 10
    1001  1009 1 1 10 0 0 1  5 10
    1001  1009 2 1 11 1 0 1  6 10
    1001  1009 2 1 12 0 0 1  7 10
    1001  1406 2 . 14 0 1 1  8 10
    1188  1189 1 1  1 1 0 1  1 13
    1188  1189 1 1  2 0 0 1  2 13
    1188  1189 1 1  3 0 0 1  3 13
    1188  1189 1 1  4 0 0 1  4 13
    1188  1189 1 1  5 0 0 1  5 13
    1188  1189 1 1  6 0 0 1  6 13
    1188  1189 1 1  7 0 0 1  7 13
    1188  1189 1 1  8 0 0 1  8 13
    1188 11740 2 2 11 1 0 1  9 13
    1188 11740 2 2 12 0 0 1 10 13
    1188 11740 2 2 13 0 0 1 11 13
    1188 17316 2 2 17 0 0 1 12 13
    1188 17316 2 2 18 0 1 1 13 13
    1191  1192 1 1  1 1 0 1  1 12
    1191  1192 1 1  2 0 0 1  2 12
    1191  1192 1 1  3 0 0 1  3 12
    1191  1192 1 1  4 0 0 1  4 12
    1191  1192 1 1  5 0 0 1  5 12
    1191  1192 1 1  6 0 0 1  6 12
    1191  1192 1 1  7 0 0 1  7 12
    1191  1192 1 1  8 0 0 1  8 12
    1191  1192 . 1  9 0 0 0  1 12
    1191  1192 . 1 10 0 0 0  2 12
    1191  1192 . 1 11 0 0 0  3 12
    1191  1192 . 1 12 0 1 0  4 12
    1192  1191 1 1  1 1 0 1  1 12
    1192  1191 1 1  2 0 0 1  2 12
    1192  1191 1 1  3 0 0 1  3 12
    1192  1191 1 1  4 0 0 1  4 12
    1192  1191 1 1  5 0 0 1  5 12
    1192  1191 1 1  6 0 0 1  6 12
    1192  1191 1 1  7 0 0 1  7 12
    1192  1191 1 1  8 0 0 1  8 12
    1192  1191 1 .  9 0 0 1  9 12
    1192  1191 1 . 10 0 0 1 10 12
    1192  1191 1 . 11 0 0 1 11 12
    1192  1191 1 . 12 0 1 1 12 12
    1074  1075 1 1  1 1 0 1  1  7
    1074  1075 1 1  2 0 0 1  2  7
    1074  1075 1 1  3 0 0 1  3  7
    1074  1075 . 1  4 0 0 0  1  7
    1074  1075 . 1  5 0 0 0  2  7
    1074  1075 . 1  6 0 0 0  3  7
    1074  1075 . 3  7 0 1 0  4  7
    1075  1074 1 1  1 1 0 1  1 14
    1075  1074 1 1  2 0 0 1  2 14
    1075  1074 1 1  3 0 0 1  3 14
    1075  1074 1 .  4 0 0 1  4 14
    1075  1074 1 .  5 0 0 1  5 14
    1075  1074 1 .  6 0 0 1  6 14
    1075  1074 3 .  7 0 0 0  1 14
    1075 12427 2 . 12 1 0 1  7 14
    1075 12427 2 . 13 0 0 1  8 14
    1075 12427 2 . 14 0 0 1  9 14
    1075 12427 2 . 15 0 0 1 10 14
    1075 12427 2 . 16 0 0 1 11 14
    1075 12427 2 . 17 0 0 1 12 14
    1075 12427 2 . 18 0 1 1 13 14
    end
    * Should one remove duplicate entries when the two partners' records agree, as for 1191 and 1192, but not when they differ, as for 1074 and 1075? If so, potential code would be appreciated. (I could post this last question separately if preferred.)
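
On the duplicates question, one possible starting point (a sketch under assumptions, not a recommendation to drop anything) is to build a couple identifier that does not depend on which partner is the respondent, and then flag the second record of each couple in each wave. The variables `id_lo`, `id_hi`, `couple` and `dup` are hypothetical names, and the toy data reuse IDs 1191 and 1192 only for illustration.

```stata
* Toy data (hypothetical): both partners report in each of two waves
clear
input float(id p_id) byte wave
1191 1192 1
1192 1191 1
1191 1192 2
1192 1191 2
end

* Order-free couple key: smaller ID first, larger ID second
* (min() and max() ignore a missing p_id, so single records keep their own id)
gen long id_lo = min(id, p_id)
gen long id_hi = max(id, p_id)
egen long couple = group(id_lo id_hi)

* Flag, without dropping, the second record of each couple-wave pair
bysort couple wave (id): gen byte dup = _n > 1
```

Whether flagged records should then be dropped presumably depends on whether the two partners' reports agree, which could be checked before deciding.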