I have panel data on individuals over 4 years. Each person has an id (avs_nbr), a gender variable, a date of birth, a year of observation and a household id that shows who lives with whom. Each person also has variables that refer to their relatives. For example, an individual has a variable father_id whose value is the avs_nbr of their father. The same goes for mothers, partners, children, etc. However, these links are not 100% reliable: some children have a missing value for their parents, some married couples have a missing value for their partners, etc.
I am interested in births occurring between October 1st, 2020 and March 31st, 2021. I can easily detect whether a child was born within this window like this:
Code:
    gen born_in_range = inrange(birthday, date("1oct2020", "DMY"), date("31mar2021", "DMY"))
    hashsort householdid
    by householdid : egen hh_born_in_range = max(born_in_range)
    keep if hh_born_in_range == 1
    hashsort householdid avs_nbr year
    gen birthyear = year(birthday)
    * correct born_in_range so that it equals 1 only in the child's year of birth,
    * not over the whole observation period
    replace born_in_range = 0 if year != birthyear
Here I flag all children born in the desired window, then flag the household they belong to, and drop all households with no births within that window.
What I would like to do now is create a flag for mothers and fathers that equals 1 if that person is a parent of a child born in the 6-month window.
I tried it this way:
Code:
    * Generate flag for children
    gen is_child = (year - birthyear < 17)
    * Need to identify fathers
    hashsort householdid year
    * household-year level variable equal to 1 if there was a birth in that year
    by householdid year : egen witnessed_birth = max(born_in_range)
    hashsort householdid avs_nbr year
    gen is_father = 0
    replace is_father = 1 if witnessed_birth == 1 & sex == 1 & is_child == 0
    tab is_father, m
My sample has 41,405 births, and this method detects 49,709 fathers, so I need to fine-tune it. I do not know whether this is the correct approach. If not, what is? If yes, how do I improve its accuracy?
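One possible refinement, offered as an untested sketch rather than a definitive answer: the household rule above flags every male aged 17 or older who lives in a household-year with a birth, so grandfathers, adult brothers and other adult residents are counted too, which is consistent with detecting more fathers than births. Where the newborn's own father_id is recorded, the parent can instead be flagged through that link. The sketch assumes that a mother_id variable exists alongside father_id, that both have the same storage type as avs_nbr, and that the parent is observed in the child's year of birth; is_father_link and is_mother_link are hypothetical names.

```stata
* Sketch only: relies on the relative-id variables, which can be missing.
foreach p in father mother {
    preserve
    * keep one row per (parent, year) among newborns with a recorded link
    keep if born_in_range == 1 & !missing(`p'_id)
    keep `p'_id year
    duplicates drop
    rename `p'_id avs_nbr
    gen byte is_`p'_link = 1
    tempfile links
    save `links'
    restore
    * attach the flag to the parent's own row in the child's birth year
    merge m:1 avs_nbr year using `links', keep(master match) nogenerate
    replace is_`p'_link = 0 if missing(is_`p'_link)
}
* newborns without a recorded father link are not covered by this rule
count if born_in_range == 1 & missing(father_id)
```

The final count shows how many births would still need a fallback, for instance the household rule restricted to households where no linked father was found.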
I cannot share a data example, as the provider wishes the data to remain confidential.
Thank you very much!