  • Converting multiple observations per household to one

    In my dataset there is a section that records whether a household faced any kind of shock. The variable shock lists 15 kinds of shocks, so each combination of PSU and HHID is repeated 15 times. The original data have 14735 households, so this section has 221025 observations (15*14735). I want to create a single variable that states whether the household faced any shock (yes = 1, no = 0), and then keep or drop observations so that the data are at the household level, back to the original 14735 households, without dropping the other variables in this section. I am able to create the shock variable. I have run the following code:

    Code:
    use HH_SEC_6B.dta, clear 
    
    drop TERM
    
    sort PSU HHID
    sum HHID
    gen any_shock = S6BQ02 != 2
    egen any_shock_HH = max(any_shock), by(PSU HHID)
    Now I am having difficulty writing the code to drop or keep observations so that the data return to the original size of 14735 households. For example, I want to keep only the first observation of each HHID within a PSU: if an HHID under PSU 1 is repeated 15 times, I want only one of those observations. I cannot merge the file without doing this. Any help will be highly appreciated.

  • #2
    gen any_shock = S6BQ02 != 2
    You have to ensure that "S6BQ02" is never missing. Otherwise, any household with a missing observation on this variable will be recorded as having a shock.

    Code:
    gen any_shock = S6BQ02 != 2 & !missing(S6BQ02)
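
    To see why the guard matters, here is a minimal sketch on made-up data (the three values are hypothetical, not taken from HH_SEC_6B.dta, and I am assuming S6BQ02 is coded 1 = yes, 2 = no, as your code suggests). Stata evaluates ". != 2" as true, so the unguarded expression codes a missing answer as a shock.

    ```stata
    clear
    input S6BQ02
    1
    2
    .
    end

    * Unguarded: the missing row is coded 1, because . != 2 is true
    gen any_shock_raw = S6BQ02 != 2

    * Guarded: the missing row is coded 0
    gen any_shock = S6BQ02 != 2 & !missing(S6BQ02)

    list
    ```

    Note that the guarded version treats a missing answer as "no shock". If you would rather leave it missing, something like "gen any_shock = S6BQ02 != 2 if !missing(S6BQ02)" may suit you better; egen's max() ignores missing values within each household.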
    Either

    Code:
    bys PSU HHID: keep if _n==1
    or

    Code:
    duplicates drop PSU HHID, force
    will do.
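
    Putting the steps together, here is a sketch on a toy dataset with two households and three shocks each (the data are invented for illustration; the variable names follow your post). It is only one way to do it, not necessarily the best for your file.

    ```stata
    clear
    input PSU HHID shock S6BQ02
    1 1 1 2
    1 1 2 1
    1 1 3 2
    1 2 1 2
    1 2 2 2
    1 2 3 2
    end

    * Shock-level indicator, guarding against missing answers
    gen any_shock = S6BQ02 != 2 & !missing(S6BQ02)

    * Household-level indicator: 1 if any of the household's shocks is 1
    egen any_shock_HH = max(any_shock), by(PSU HHID)

    * Keep one observation per household
    bysort PSU HHID: keep if _n == 1

    * These variables describe a single shock and no longer mean
    * anything at the household level
    drop shock S6BQ02 any_shock

    * Stops with an error if PSU and HHID do not uniquely identify rows
    isid PSU HHID
    list
    ```

    On the real file, "count" after the keep step should report 14735 if every household appears in this section. Keep in mind that any other variable that varies across the 15 shock rows will simply hold its value from whichever row was kept.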
