According to U.S. House election data, a single county in a given state can belong to multiple congressional districts. However, based on geographical area, I can match each county to a single district within that state.
In the following data, nhgisnam is the name of the county, district is the congressional district, cnty_area is the total geographical area of the county, and cnty_part_area is the portion of the county's area that falls in that particular congressional district. For example, in the first line, for Autauga county, cnty_part_area is 1564828723. That means 1564828723 of Autauga county's total area (cnty_area) of 1565529773 lies in district 2. Autauga county also belongs to districts 6 and 7 of state 1, but the major portion of its area belongs to district 2, which I can tell from the cnty_part_area variable.
Can anyone kindly guide me on how to code this so that, for each county in a particular state, I keep only the observation that assigns the county to a single district, based on the highest value of cnty_part_area for that county?
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str12 nhgisnam byte(cd_statefip district) float county double(cnty_area cd_area cnty_part_area)
"Autauga" 1 2 1 1565529773 27471139257 1564828723
"Autauga" 1 2 1 1565529773 27471139257 .000155932
"Autauga" 1 6 1 1565529773 12040322876 272866.6476
"Autauga" 1 6 1 1565529773 12040322876 .133551671
"Autauga" 1 7 1 1565529773 22740093504 428181.5623
"Baldwin" 1 1 3 4232265763 16566047005 4228366412
"Baldwin" 1 1 3 4232265763 16566047005 .556012851
"Baldwin" 1 7 3 4232265763 22740093504 149732.2219
"Barbour" 1 2 5 2342716428 27471139257 .836149027
"Barbour" 1 2 5 2342716428 27471139257 2341574170
"Barbour" 1 2 5 2342716428 27471139257 .361295549
"Barbour" 1 2 5 2342716428 27471139257 .707204586
"Barbour" 1 3 5 2342716428 20681592749 .553967503
"Barbour" 1 3 5 2342716428 20681592749 601774.0707
"Bibb" 1 6 7 1621774445 12040322876 .08641998
"Bibb" 1 6 7 1621774445 12040322876 .212694208
"Bibb" 1 6 7 1621774445 12040322876 1621293818
"Bibb" 1 7 7 1621774445 22740093504 480626.7196
"El Dorado" 6 3 17 4631169089 8860803999 .049026728
"El Dorado" 6 3 17 4631169089 8860803999 844075.6488
"El Dorado" 6 4 17 4631169089 44439160065 4630300584
"El Dorado" 6 4 17 4631169089 44439160065 1.154827007
"Fresno" 6 17 19 15585347209 12459983416 939434.4757
"Fresno" 6 18 19 15585347209 8030525729 104149507.3
"Fresno" 6 19 19 15585347209 17561371799 921125720.7
"Fresno" 6 20 19 15585347209 12921313091 6140610563
"Fresno" 6 21 19 15585347209 20952217690 8417589179
"Fresno" 6 25 19 15585347209 56000534108 932808.8571
"Glenn" 6 1 21 3437311730 28853910260 146543.9293
"Glenn" 6 2 21 3437311730 56920751720 .407955162
"Glenn" 6 2 21 3437311730 56920751720 .481264625
"Glenn" 6 2 21 3437311730 56920751720 3437165187
"Kern" 6 20 29 21138168964 12921313091 3175426923
"Kern" 6 21 29 21138168964 20952217690 507400.1421
"Kern" 6 22 29 21138168964 27074266282 1.937903539
end
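
One possible approach, given only as a minimal sketch: sort the observations within each state and county by cnty_part_area and keep the last one in each group. It assumes that cd_statefip and county together identify a county, and that cnty_part_area has no missing values (Stata sorts missing values last, so a missing value would otherwise be picked as the "largest").

```stata
* Assumption: no missing areas; stop here if there are any.
assert !missing(cnty_part_area)

* Within each state-county, sort by area and keep the row with the largest value.
bysort cd_statefip county (cnty_part_area): keep if _n == _N

* Each state-county should now appear exactly once.
isid cd_statefip county
```

On the example data above this would leave one row per county: Autauga in district 2, Baldwin in 1, Barbour in 2, Bibb in 6, El Dorado in 4, Fresno in 21, Glenn in 2, and Kern in 20. If two rows in a county tie for the largest area, which one survives is arbitrary. If a district's share should instead be the sum of its several rows within a county, the areas would need to be aggregated by county and district before this step.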