My dataset has a section that records whether each household faced any kind of shock. The variable shock lists 15 kinds of shocks, so each PSU and HHID combination is repeated 15 times. The survey has 14,735 households, so this section has 221,025 observations (15 × 14,735). I want to create a single variable that states whether the household faced any shock (yes = 1, no = 0), and then keep or drop observations so that the dataset is at household level, that is, reduced to the original 14,735 households, without dropping the other variables in this section. I am able to create the shock variable with the code below.
Where I am having difficulty is writing the code to keep or drop observations so that the dataset returns to its original size of 14,735 households. I want to keep only the first observation of each HHID within a PSU. For example, if an HHID in PSU 1 is repeated 15 times, I want to keep only one of those rows. I can't merge the file without doing this. Any help will be highly appreciated.
Code:
use HH_SEC_6B.dta, clear
drop TERM
sort PSU HHID
sum HHID
gen any_shock = S6BQ02 != 2
egen any_shock_HH = max(any_shock), by(PSU HHID)
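For reference, the step I am describing (keeping only the first observation of each HHID within a PSU) might look something like the sketch below. This is untested on my data and assumes that PSU and HHID together uniquely identify a household.

```stata
* Sketch only: assumes PSU and HHID jointly identify a household
use HH_SEC_6B.dta, clear
drop TERM

* Household-level indicator, as in the code above
gen any_shock = S6BQ02 != 2
egen any_shock_HH = max(any_shock), by(PSU HHID)

* Keep only the first row of each household within its PSU
bysort PSU HHID: keep if _n == 1

* isid stops with an error if PSU and HHID do not uniquely identify the rows
isid PSU HHID
count
```

If this works as intended, count should report 14,735. Two caveats: any variable that differs across the 15 shock rows (S6BQ02 itself, for example) would then hold only the value from whichever row was kept, and S6BQ02 != 2 is also true when S6BQ02 is missing, so missing answers would be counted as a shock.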