Converting multiple household to one

Kazi Aiman Udoy

Join Date: Dec 2022

Posts: 33
#1


27 Feb 2024, 22:06

In my dataset there is a section that records whether a household faced any kind of shock. The variable shock lists 15 kinds of shocks, so each PSU and HHID combination is repeated 15 times. The original number of households is 14735, so this section has 221025 observations (15*14735). I want to create a variable that indicates whether the household faced any shock (yes = 1, no = 0), and then keep or drop observations so that the data are at the household level, reduced to the original 14735 households, without dropping the other variables in this section. I am able to create the shock variable. I have run the following code:

Code:

use HH_SEC_6B.dta, clear
drop TERM
sort PSU HHID
sum HHID
gen any_shock = S6BQ02 != 2
egen any_shock_HH = max(any_shock), by(PSU HHID)

Now I am having difficulty writing the code to drop or keep observations so that the dataset returns to the original 14735 households. For example, I want to keep only the first observation for each HHID within a PSU: if an HHID under PSU 1 is repeated 15 times, I want only one of them. I can't merge the file without doing this. Any help will be highly appreciated.
Andrew Musau

Join Date: Oct 2014

Posts: 10195
#2

28 Feb 2024, 02:34

gen any_shock = S6BQ02 != 2

You have to ensure that "S6BQ02" is never missing. Otherwise, any household with a missing observation on this variable will be recorded as having a shock.

Code:

gen any_shock = S6BQ02 != 2 & !missing(S6BQ02)

Either

Code:

bys PSU HHID: keep if _n==1

or

Code:

duplicates drop PSU HHID, force

will do.
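Putting the pieces together, a minimal sketch of the whole step might look like the following. It assumes, as the code in #1 implies, that S6BQ02 == 2 means "no shock" and that PSU and HHID together identify a household; the final checks are an addition for safety, not something the dataset requires.

```stata
use HH_SEC_6B.dta, clear

* Row-level flag: 1 if this shock was reported, 0 otherwise.
* Missing answers count as 0 rather than as a shock.
gen byte any_shock = S6BQ02 != 2 & !missing(S6BQ02)

* Household-level flag: 1 if any of the household's rows reports a shock.
egen byte any_shock_HH = max(any_shock), by(PSU HHID)

* Keep one row per household.
bysort PSU HHID: keep if _n == 1

* Confirm the data are now uniquely identified at the household level
* and match the household count stated in #1.
isid PSU HHID
count
assert r(N) == 14735
```

Note that the shock-specific variables (including any_shock and S6BQ02) in the retained row describe only that one row; only any_shock_HH summarizes the household as a whole.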