  • How to create the variable "age of the head of the household"

    Hi everyone!

    I'm stuck on the data cleaning for my master's thesis: I don't know how to create a variable that holds the age of the household head in every observation of that household. So far I've managed to create a variable for the sex of the household head with the following code:

    *Generate var sex HH head*

    bysort hhid : gen sexhhhead=1 if relhhhead==1 & sex>1 & sex<=2
    replace sexhhhead=0 if sexhhhead==.
    egen sexhhhead1 = max(sexhhhead), by (hhid)


    This approach works for dummy or categorical variables. I've also used similar code to generate the education of the household head (years of education):

    bysort hhid : gen educhhhead=1 if relhhhead==1 & yearseduc>0 & yearseduc<=1
    replace educhhhead=0 if educhhhead==.
    egen educhhhead1 = max(educhhhead), by (hhid)

    drop educhhhead


    bysort hhid : gen educhhhead=2 if relhhhead==1 & yearseduc>1 & yearseduc<=2
    replace educhhhead=0 if educhhhead==.
    egen educhhhead2 = max(educhhhead), by (hhid)

    drop educhhhead

    bysort hhid : gen educhhhead=3 if relhhhead==1 & yearseduc>2 & yearseduc<=3
    replace educhhhead=0 if educhhhead==.
    egen educhhhead3 = max(educhhhead), by (hhid)

    drop educhhhead

    bysort hhid : gen educhhhead=4 if relhhhead==1 & yearseduc>3 & yearseduc<=4
    replace educhhhead=0 if educhhhead==.
    egen educhhhead4 = max(educhhhead), by (hhid)

    drop educhhhead

    bysort hhid : gen educhhhead=5 if relhhhead==1 & yearseduc>4 & yearseduc<=5
    replace educhhhead=0 if educhhhead==.
    egen educhhhead5 = max(educhhhead), by (hhid)

    drop educhhhead

    bysort hhid : gen educhhhead=6 if relhhhead==1 & yearseduc>5 & yearseduc<=6
    replace educhhhead=0 if educhhhead==.
    egen educhhhead6 = max(educhhhead), by (hhid)

    drop educhhhead

    bysort hhid : gen educhhhead=7 if relhhhead==1 & yearseduc>6 & yearseduc<=7
    replace educhhhead=0 if educhhhead==.
    egen educhhhead7 = max(educhhhead), by (hhid)

    drop educhhhead

    bysort hhid : gen educhhhead=8 if relhhhead==1 & yearseduc>7 & yearseduc<=8
    replace educhhhead=0 if educhhhead==.
    egen educhhhead8 = max(educhhhead), by (hhid)

    drop educhhhead

    bysort hhid : gen educhhhead=9 if relhhhead==1 & yearseduc>8 & yearseduc<=9
    replace educhhhead=0 if educhhhead==.
    egen educhhhead9 = max(educhhhead), by (hhid)

    drop educhhhead

    bysort hhid : gen educhhhead=10 if relhhhead==1 & yearseduc>9 & yearseduc<=10
    replace educhhhead=0 if educhhhead==.
    egen educhhhead10 = max(educhhhead), by (hhid)

    replace educhhhead1=. if educhhhead1==0
    replace educhhhead2=. if educhhhead2==0
    replace educhhhead3=. if educhhhead3==0
    replace educhhhead4=. if educhhhead4==0
    replace educhhhead5=. if educhhhead5==0
    replace educhhhead6=. if educhhhead6==0
    replace educhhhead7=. if educhhhead7==0
    replace educhhhead8=. if educhhhead8==0
    replace educhhhead9=. if educhhhead9==0
    replace educhhhead10=. if educhhhead10==0

    gen educhhhead=max(educhhhead1, educhhhead2, educhhhead3, educhhhead4, educhhhead5,educhhhead6, educhhhead7, educhhhead8, educhhhead9, educhhhead10, educhhhead11, educhhhead12, educhhhead13, educhhhead14, educhhhead15, educhhhead16, educhhhead17, educhhhead18)



    However, in my sample the age of the household head ranges from 10 to 97, so following the same approach would mean writing a separate block of code for every single year of age, which would be extremely long and time-consuming.

    Any ideas or recommendations? Thanks a lot!

    Daniel.
    Last edited by Daniel Perez Parra; 16 May 2022, 05:03.

  • #2
    perhaps,
    Code:
    foreach v in age sex yearseduc {
    gen `v'hhhead = `v' if relhhhead == 1
    bysort hhid (`v'hhhead): replace `v'hhhead = `v'hhhead[1]
    }
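
    To see why the sort in that loop matters, here is a minimal sketch on a made-up dataset. The five observations are hypothetical and only illustrate the mechanics; the variable names follow the original post. It relies on Stata sorting numeric missing values after all non-missing numbers, so within each household the head's value ends up in the first observation.

    ```stata
    * Toy data: two households, hypothetical values for illustration only
    clear
    input hhid relhhhead age sex yearseduc
    1 1 45 1 6
    1 2 40 2 8
    1 3 12 1 5
    2 3 20 2 10
    2 1 60 2 3
    end

    foreach v in age sex yearseduc {
        * Copy the head's own value; every other member gets missing (.)
        gen `v'hhhead = `v' if relhhhead == 1
        * Sort within household so the non-missing value comes first,
        * then copy that first value to all members of the household
        bysort hhid (`v'hhhead): replace `v'hhhead = `v'hhhead[1]
    }

    list hhid relhhhead age agehhhead sexhhhead yearseduchhhead, sepby(hhid)
    ```

    In this toy data every member of household 1 ends up with agehhhead equal to 45 and every member of household 2 with 60. A household with no recorded head keeps missing values.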



    • #3
      Originally posted by Øyvind Snilsberg
      perhaps,
      Code:
      foreach v in age sex yearseduc {
      gen `v'hhhead = `v' if relhhhead == 1
      bysort hhid (`v'hhhead): replace `v'hhhead = `v'hhhead[1]
      }
      It works!!!! Thanks a lot



      • #4
        Consider also some variation on a loop over


        Code:
         
         egen `v'hhhead = mean(cond(relhhhead == 1, `v', .)), by(hhid)
        For more see e.g. section 9 of https://www.stata-journal.com/articl...article=dm0055
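
        Written out as a full loop, the egen line above might look like the sketch below. The toy data are hypothetical and the variable list is simply the one used earlier in this thread. cond() returns the variable for heads and missing for everyone else, and mean() ignores missing values, so with exactly one head per household the mean is just that head's value.

        ```stata
        * Toy data: two households, hypothetical values for illustration only
        clear
        input hhid relhhhead age sex yearseduc
        1 1 45 1 6
        1 2 40 2 8
        2 3 20 2 10
        2 1 60 2 3
        end

        foreach v in age sex yearseduc {
            * Mean over heads only; non-heads contribute missing and are ignored
            egen `v'hhhead = mean(cond(relhhhead == 1, `v', .)), by(hhid)
        }

        list, sepby(hhid)
        ```

        Unlike the gen/replace approach, this does not change the sort order of the data.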

        If there is more than one household head, this may not do what you want, so consider min(), max() or whatever it is that you do want.
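
        One way to check for that situation is to count heads per household before choosing a function. This is a sketch on hypothetical data; the names nheads, agehhhead_max and agehhhead_min are made up for illustration.

        ```stata
        * Toy data: household 1 has two members coded as head (hypothetical)
        clear
        input hhid relhhhead age
        1 1 45
        1 1 50
        1 3 12
        2 1 60
        2 2 58
        end

        * Count heads per household by summing the 0/1 indicator
        egen nheads = total(relhhhead == 1), by(hhid)
        tabulate nheads

        * With two heads aged 45 and 50, mean() would return 47.5;
        * max() and min() instead pick the oldest or youngest head
        egen agehhhead_max = max(cond(relhhhead == 1, age, .)), by(hhid)
        egen agehhhead_min = min(cond(relhhhead == 1, age, .)), by(hhid)

        list, sepby(hhid)
        ```

        Households where nheads is 0 get missing values from all three functions, so those cases need a separate decision.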

