Hello everyone,
Below is a sample of the data I have for a specific year. It is individual-level data with the individual ID "indid", the person's four-digit occupation "four_digit", the employment status "usempstp", and the expansion weight "expan_indiv". I want to collapse the data by "four_digit" to get the weighted total number of individuals working in each occupation (note: all observations have a value of 1 for "usempstp"). I ran the collapse command shown after the data example below.
However, the collapsed data set gives me the unweighted number of individuals for each occupation, and I can't figure out why.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str23 indid long four_digit byte usempstp float expan_indiv
"9801000101" 7212 1 2046.778
"9801000304" 7231 1 2046.778
"9801000308" 2330 1 2046.778
"9801000401" 5414 1 2046.778
"9801000501" 8121 1 2046.778
"9801000601" 7233 1 2046.778
"9801000701" 7114 1 2046.778
"9801000901" 9613 1 2046.778
"9801001003" 3119 1 2046.778
"9801001004" 2341 1 2046.778
end
label values four_digit four_digit
label def four_digit 2330 "Secondary Education Teachers", modify
label def four_digit 2341 "Primary School Teachers", modify
label def four_digit 3119 "Physical and Engineering Science Technicians Not Elsewhere Classified", modify
label def four_digit 5414 "Security Guards", modify
label def four_digit 7114 "Concrete Placers, Concrete Finishers and Related Workers", modify
label def four_digit 7212 "Welders and Flame Cutters", modify
label def four_digit 7231 "Motor Vehicle Mechanics and Repairers", modify
label def four_digit 7233 "Agricultural and Industrial Machinery Mechanics and Repairers", modify
label def four_digit 8121 "Metal Processing Plant Operators", modify
label def four_digit 9613 "Sweepers and Related Labourers", modify
label values usempstp empstat
label def empstat 1 "Waged employee", modify
Code:
collapse (sum) usempstp [aw=expan_indiv], by(four_digit)