Fellow statalisters,
I would really appreciate advice on a smarter way to define value labels. I have 10 years of data containing roughly 3 million observations and 106 variables each year. Out of those 106 variables, I need to define labels for 50, and the labels vary across years. To get an idea, please see the dataex below.
B3_q5 = religion, with values: Hinduism-1, Islam-2, Christianity-3, Sikhism-4, Jainism-5, Buddhism-6, Zoroastrianism-7, others-9
B3_q10 = type of house structure, with values: pucca-1, semi-pucca-2, serviceable katcha-3, unserviceable katcha-4, no structure-5
However, B5_q1 has values 100, 101, 102, ..., 339 signifying different foods; for example, 100 signifies rice, 101 signifies potato, 102 signifies radish, etc.
B10_q1 has values 420, 421, 422, ..., 549 signifying expenditure on various things; for example, 420 signifies medical expense, 430 signifies movie expense, etc.
Labelling them manually in the manner below would be quite demanding:
label define B3_q5 1 "Hinduism" 2 "Islam" 3 "Christianity" 4 "Sikhism" 5 "Jainism" 6 "Buddhism" 7 "Zoroastrianism" 9 "others"
label values B3_q5 B3_q5
label define B3_q10 1 "pucca" 2 "semi pucca" 3 "serviceable katcha" 4 "unserviceable katcha" 5 "no structures"
label values B3_q10 B3_q10
I want to know if there is a smarter way to approach this. For example, can Stata somehow read a codebook and label the variables accordingly?
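To make the idea concrete, here is a rough, untested sketch of what I imagine "reading a codebook" could look like. Everything about the codebook is an assumption on my part: a hypothetical file codebook.csv with one row per code and three columns named varname, value and text, and a hypothetical output file define_labels.do. The sketch loads the codebook, writes one label define line per row into a do-file, then runs that do-file on the survey data.

```stata
* Sketch only. codebook.csv, its columns (varname, value, text) and
* define_labels.do are hypothetical names, not part of my actual data.
preserve                                   // keep the survey data safe
import delimited using "codebook.csv", varnames(1) clear
levelsof varname, local(vars)              // variables that need labels

tempname fh
file open `fh' using "define_labels.do", write text replace
forvalues i = 1/`=_N' {
    local v = varname[`i']
    local k = value[`i']
    local t = text[`i']
    * one -label define- line per codebook row
    file write `fh' `"label define `v' `k' `"`t'"', modify"' _n
}
file close `fh'
restore                                    // back to the survey data

do "define_labels.do"                      // define all value labels
foreach v of local vars {
    label values `v' `v'                   // attach each label set to its variable
}
```

Since the labels vary across years, I suppose one such codebook file per year would be needed, with the same steps repeated for each year's dataset.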
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(B3_q5 B3_q10) int(B5_q1 B10_q1)
1 1 327 467
1 1 282 459
1 1 301 459
1 1 129 479
1 1 280 459
1 1 279 519
1 1 179 459
1 1 309 502
1 1 229 459
1 1 148 459
1 1 308 459
1 1 174 453
1 1 289 459
1 1 288 459
1 1 190 459
1 1 214 459
1 1 191 459
1 1 290 459
1 1 102 459
1 1 211 459
1 1 212 459
1 1 329 459
1 1 159 459
1 1 180 454
1 1 309 454
1 1 148 502
1 1 214 453
1 1 290 420
1 1 288 479
1 1 189 454
1 1 159 454
1 1 229 454
1 1 174 450
1 1 301 454
1 1 211 454
1 1 129 467
1 1 308 429
1 1 279 454
1 1 102 519
1 1 212 454
1 1 280 454
1 1 289 459
1 1 190 454
1 1 191 454
1 1 327 454
1 1 303 454
1 1 179 454
1 1 282 454
1 1 329 454
1 1 140 510
1 1 174 452
1 1 290 539
1 1 152 483
1 1 108 452
1 1 301 452
1 1 287 452
1 1 103 479
1 1 240 452
1 1 245 420
1 1 259 492
1 1 207 472
1 1 201 452
1 1 309 451
1 1 190 452
1 1 308 470
1 1 261 452
1 1 216 452
1 1 289 452
1 1 169 499
1 1 285 429
1 1 111 452
1 1 251 437
1 1 202 493
1 1 211 467
1 1 282 452
1 1 291 549
1 1 300 456
1 1 288 540
1 1 160 459
1 1 279 453
1 1 249 452
1 1 214 502
1 1 102 443
1 1 191 452
1 1 229 452
1 1 159 494
1 1 221 519
1 1 179 439
1 1 222 452
1 1 269 452
1 1 283 457
1 1 230 454
1 1 164 452
1 1 129 450
1 1 280 452
1 1 256 449
1 1 283 483
1 1 245 451
1 1 280 492
1 1 190 510
end
label values B3_q5 B3_q5
label def B3_q5 1 "Hinduism", modify
label values B3_q10 B3_q10
label def B3_q10 1 "pucca", modify