Dear Statalist users,
Firstly, I am a new user of Statalist, so I apologize in advance if I am not using dataex properly to describe my data.
I am preparing my data for analysis and have run into a problem. I have respondent-level data in long format on up to 5 members of each respondent's network. The issue is with the variables that record whether the network members know each other. These are named know_*, where the suffix identifies the pair: know_1_2, for example, records whether person 1 knows person 2. A matching set of variables n* holds the initials of each pair, e.g. n1_2 is "EM BK". The goal is to create an edge list so that I can do social network analysis (information on the nodes is in a separate file in long format, by householdid, for each of the names mentioned in the network). I would like the final edge list dataset to look like this:
householdid | name pair | know |
10101 | EM BK | Yes |
10101 | EM PK | Yes |
10101 | BK PK | Yes |
A tranche of the current data is copied below:
householdid | names_ | names_repeat_count | n1_2 | n1_3 | n1_4 | n2_3 | n2_4 | n3_4 | count | person | know_1_1 | know_1_2 | know_1_3 | know_1_4 | know_1_5 | know_2_1 | know_2_2 | know_2_3 |
10101 | BK | 3 | EM BK | EM PK |  | BK PK |  |  | 1 | 2 |  | Yes | Yes |  |  |  |  | Yes |
10101 | EM | 3 | EM BK | EM PK |  | BK PK |  |  | 1 | 1 |  | Yes | Yes |  |  |  |  | Yes |
10101 | PK | 3 | EM BK | EM PK |  | BK PK |  |  | 1 | 3 |  | Yes | Yes |  |  |  |  | Yes |
* Example generated by -dataex-. To install: ssc install dataex
clear
input str5 householdid str2 names_ str5(n1_2 n1_3 n2_3) byte(know_1_2 know_1_3 know_2_3)
"10101" "BK" "EM BK" "EM PK" "BK PK" 1 1 1
"10101" "EM" "EM BK" "EM PK" "BK PK" 1 1 1
"10101" "PK" "EM BK" "EM PK" "BK PK" 1 1 1
end
label values know_1_2 know_1_2
label def know_1_2 1 "Yes", modify
label values know_1_3 know_1_3
label def know_1_3 1 "Yes", modify
label values know_2_3 know_2_3
label def know_2_3 1 "Yes", modify
I would appreciate any advice on how to do this. I have read some of the help files on reshaping but have not found a solution; perhaps this is something simple that I just don't know how to do yet.
Thank you!
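For reference, one way the edge list described above might be built is with reshape long and a string j(). This is only a sketch and rests on some assumptions: the pair variables (n1_2, know_1_2, etc.) are taken to be identical on every row within a householdid, as they are in the dataex example, and only the three pairs in that example are handled. The names pair, dyad and yesno are made up for illustration. With the full data, the rename list would need to cover the remaining n* variables.

* Sketch only: assumes the pair variables are constant within householdid,
* as in the dataex example.
clear
input str5 householdid str2 names_ str5(n1_2 n1_3 n2_3) byte(know_1_2 know_1_3 know_2_3)
"10101" "BK" "EM BK" "EM PK" "BK PK" 1 1 1
"10101" "EM" "EM BK" "EM PK" "BK PK" 1 1 1
"10101" "PK" "EM BK" "EM PK" "BK PK" 1 1 1
end

* Keep one row per household; the pair variables repeat on every member row.
bysort householdid: keep if _n == 1
drop names_

* Give the initials variables a distinctive stub (hypothetical name: pair)
* so that -reshape- does not pick up other variables beginning with "n".
rename (n1_2 n1_3 n2_3) (pair1_2 pair1_3 pair2_3)

* Reshape to one row per household and pair; j() is a string such as "1_2".
reshape long pair know_, i(householdid) j(dyad) string

* Drop pairs that do not exist in smaller networks (empty initials).
drop if missing(pair)

* Tidy the name and label the indicator (label name yesno is hypothetical).
rename know_ know
label define yesno 1 "Yes"
label values know yesno

list householdid pair know, noobs sepby(householdid)

The string option is what lets j() take suffixes such as 1_2, which are not plain numbers. Renaming the n* variables first avoids reshape treating unrelated variables that start with "n" as part of the same stub.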