My vocabulary has failed me. I used to refer to this as "indexing" a variable across values of another variable, but I believe I have been misusing that term.
I think the best way to explain is with an example.
I have data on human subjects. Participants are seen three times. Records are uniquely identified by subject_id and visit_date. Participant gender is collected only at the first visit. I need to "X" gender so that it appears on each record for that participant. What is the correct word to describe what I mean by X?
Here's my data:
clear all
input subject_id str20 visit_date str2 sex
1 "19mar23" "F"
1 "30mar23" ""
1 "10apr23" ""
2 "20mar23" "M"
2 "31mar23" ""
2 "8apr23" ""
3 "22mar23" "F"
3 "1apr23" ""
end
And then I use this code to do X (what would you call X?)
list
bysort subject_id: replace sex = sex[_n-1] if sex[_n-1] !=""
bysort subject_id: replace sex = sex[_N] if sex[_N] !="" /* Just in case someone accidentally has "gender" entered at their second visit rather than their first */
list
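The two replace statements above rely on the records already being in visit order within each subject. The same operation can be sketched without that dependence. This is only a minimal sketch, assuming each subject has at most one distinct non-empty value of sex; obs_order is a hypothetical helper variable that is not part of the original data.

```stata
* Sketch only: assumes at most one distinct non-empty value of sex per subject.
* obs_order is a hypothetical helper variable used to restore the original row order.
gen long obs_order = _n

* Within each subject, sort on sex so empty strings come first and the
* non-empty value lands last, then copy that last value into the empty records.
bysort subject_id (sex): replace sex = sex[_N] if missing(sex)

* Put the records back in their original order and drop the helper.
sort obs_order
drop obs_order
list
```

If a subject has no sex recorded at any visit, sex[_N] is itself empty and that subject's records are left unchanged.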
I'm not sure how to even search for this concept, but it is a technique I use frequently (a baseline value needs to be applied to all records for the same participant), and I'm worried I'm not using the right term to discuss it.