Dear all,
I want to generate an id for states in my dataset using a loop but I am unsure how to go about it.
I have a dataset of 2240 observations and 70 variables, but 12 variables and 11 observations should be sufficient to show the problem. In my dataset, I have the locations and countries of 6 different firms that provide funds to the same observation. I have created identification codes for every state and country across the world, and I want to use those codes to create ids for these 6 locations and countries in my dataset. I have already created the ids for the countries of each of the 6 firms (invctny*_id). Now I want to do the same for the locations of the firms within their respective countries. If it were just one firm, this is some of the code I would have used:
generate invhq_id = .
replace invhq_id = 3387 if invhq == "Zaporizhzhya" & invctny1_id == 240
replace invhq_id = 3388 if invhq == "Zhytomyr" & invctny1_id == 240
replace invhq_id = 3389 if invhq == "Abu Dhabi" & invctny1_id == 241
replace invhq_id = 3390 if invhq == "Ajman" & invctny1_id == 241
replace invhq_id = 3391 if invhq == "Dubai" & invctny1_id == 241
replace invhq_id = 3392 if invhq == "Fujairah" & invctny1_id == 241
replace invhq_id = 3393 if invhq == "Ras Al-Khaimah" & invctny1_id == 241
replace invhq_id = 3394 if invhq == "Sharjah" & invctny1_id == 241
replace invhq_id = 3395 if invhq == "Umm al-Qaywayn" & invctny1_id == 241
replace invhq_id = 3396 if invhq == "England" & invctny1_id == 242
replace invhq_id = 3397 if invhq == "Northern Ireland" & invctny1_id == 242
replace invhq_id = 3398 if invhq == "Scotland" & invctny1_id == 242
replace invhq_id = 3399 if invhq == "Wales" & invctny1_id == 242
replace invhq_id = 3400 if invhq == "Baker" & invctny1_id == 244
However, I have 6 location variables (invhq1 to invhq6) and would rather use a loop to save time. An example of my dataset is given below:
input str26 invhq1 str20 invhq2 str80 invhq3 str19 invhq4 str18(invhq5 invhq6) float(invctny1_id invctny2_id invctny3_id invctny4_id invctny5_id invctny6_id)
"Berlin" "Stockholms" "" "" "" "" 86 222 . . . .
"Geneva" "London" "" "" "" "" 223 257 . . . .
"Lagos" "" "" "" "" "" 163 . . . . .
"Region Metropolitana" "" "" "" "" "" 48 . . . . .
"Accra" "" "" "" "" "" 87 . . . . .
"London" "" "" "" "" "" 257 . . . . .
"" "" "" "" "" "" . . . . . .
"London" "Massachusetts" "New York" "Ile-de-France" "Ile-de-France" "Ile-de-France" 257 243 243 79 79 79
"Washington" "" "" "" "" "" 243 . . . . .
"California" "Fribourg" "Lombardia" "London" "" "" 243 223 112 257 . .
"Colorado" "" "" "" "" "" 243 . . . . .
end
Please, how do I write the code?
Thanks in anticipation
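One possible approach, offered only as a minimal sketch: wrap the same replace statements in a forvalues loop, so that the firm number is substituted into each variable name. The new variable names invhq1_id to invhq6_id are hypothetical (they are not in the dataset above), and only the first few location/country pairs from the code above are shown.

```stata
* Sketch only: one pass per firm, i = 1, ..., 6.
* invhq`i'_id is a hypothetical name for the new location id variable.
forvalues i = 1/6 {
    generate invhq`i'_id = .
    replace invhq`i'_id = 3387 if invhq`i' == "Zaporizhzhya" & invctny`i'_id == 240
    replace invhq`i'_id = 3388 if invhq`i' == "Zhytomyr" & invctny`i'_id == 240
    replace invhq`i'_id = 3389 if invhq`i' == "Abu Dhabi" & invctny`i'_id == 241
    replace invhq`i'_id = 3390 if invhq`i' == "Ajman" & invctny`i'_id == 241
    * the remaining location/country pairs would follow the same pattern
}
```

Inside the loop, the local macro `i' expands to the firm number, so invhq`i' becomes invhq1, invhq2, and so on, and each location is always matched against the country id of the same firm.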