Hi All,
I am trying to match children and parents to each other using their IDs. I have code that works, but it takes very long (a few hours). I currently have about 3.1 million observations, with 100,214 unique IDs. Here is a small example of the data and my code:
ID = each person's unique ID
ID_M = ID of a person's mother
ID_AM = ID of a person's adopted mother
ID_F = ID of a person's father
ID_AF = ID of a person's adopted father
I am creating binary variables "parents_present," "f_children_present," "m_children_present," "af_children_present," "am_children_present" that indicate if
1) a person has parents present
2) a father has children present
3) a mother has children present
4) an adopted father has children present
5) an adopted mother has children present.
Is there are a way to do this more quickly? Any suggestions would be appreciated!
Best,
Cora
g parents_present = 0
levelsof ID, local(levels)
foreach l of local levels {
replace parents_present = 1 if inlist(`l',ID_F,ID_M,ID_AF,ID_AM)
}
g f_children_present = 0
levelsof ID_F, local(levels)
foreach l of local levels {
replace f_children_present = 1 if inlist(`l',ID)
}
g m_children_present = 0
levelsof ID_M, local(levels)
foreach l of local levels {
replace m_children_present = 1 if inlist(`l',ID)
}
g af_children_present = 0
levelsof ID_AF, local(levels)
foreach l of local levels {
replace af_children_present = 1 if inlist(`l',ID)
}
g am_children_present = 0
levelsof ID_AM, local(levels)
foreach l of local levels {
replace am_children_present = 1 if inlist(`l',ID)
}
************************************************** ************************************************** *****************
************************************************** ************************************************** *****************
I am trying to match children and parents to each other using their IDs. I have code that works, but it takes very long (a few hours). I currently have about 3.1 million observations, with 100,214 unique IDs. Here is a small example of the data and my code:
ID = each person's unique ID
ID_M = ID of a person's mother
ID_AM = ID of a person's adopted mother
ID_F = ID of a person's father
ID_AF = ID of a person's adopted father
I am creating binary variables "parents_present," "f_children_present," "m_children_present," "af_children_present," "am_children_present" that indicate if
1) a person has parents present
2) a father has children present
3) a mother has children present
4) an adopted father has children present
5) an adopted mother has children present.
Is there are a way to do this more quickly? Any suggestions would be appreciated!
Best,
Cora
g parents_present = 0
levelsof ID, local(levels)
foreach l of local levels {
replace parents_present = 1 if inlist(`l',ID_F,ID_M,ID_AF,ID_AM)
}
g f_children_present = 0
levelsof ID_F, local(levels)
foreach l of local levels {
replace f_children_present = 1 if inlist(`l',ID)
}
g m_children_present = 0
levelsof ID_M, local(levels)
foreach l of local levels {
replace m_children_present = 1 if inlist(`l',ID)
}
g af_children_present = 0
levelsof ID_AF, local(levels)
foreach l of local levels {
replace af_children_present = 1 if inlist(`l',ID)
}
g am_children_present = 0
levelsof ID_AM, local(levels)
foreach l of local levels {
replace am_children_present = 1 if inlist(`l',ID)
}
************************************************** ************************************************** *****************
************************************************** ************************************************** *****************
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float ID int year float(ID_F ID_M ID_AF ID_AM) 4003 1968 4001 4002 . . 4003 1969 4001 4002 . . 4003 1970 4001 4002 . . 4003 1971 4001 4002 . . 4003 1972 4001 4002 . . 4003 1973 4001 4002 . . 4003 1974 4001 4002 . . 4003 1975 4001 4002 . . 4003 1976 4001 4002 . . 4003 1977 4001 4002 . . 4003 1978 4001 4002 . . 4003 1979 4001 4002 . . 4003 1980 4001 4002 . . 4003 1981 4001 4002 . . 4003 1982 4001 4002 . . 4003 1983 4001 4002 . . 4003 1984 4001 4002 . . 4003 1985 4001 4002 . . 4003 1986 4001 4002 . . 4003 1987 4001 4002 . . 4003 1988 4001 4002 . . 4003 1989 4001 4002 . . 4003 1990 4001 4002 . . 4003 1991 4001 4002 . . 4003 1992 4001 4002 . . 4003 1993 4001 4002 . . 4003 1994 4001 4002 . . 4003 1995 4001 4002 . . 4003 1996 4001 4002 . . 4003 1997 4001 4002 . . 4003 1999 4001 4002 . . 4005 1968 4001 4002 . . 4005 1969 4001 4002 . . 4005 1970 4001 4002 . . 4005 1971 4001 4002 . . 4005 1972 4001 4002 . . 4005 1973 4001 4002 . . 4005 1974 4001 4002 . . 4005 1975 4001 4002 . . 4005 1976 4001 4002 . . 4005 1977 4001 4002 . . 4005 1978 4001 4002 . . 4005 1979 4001 4002 . . 4005 1980 4001 4002 . . 4005 1981 4001 4002 . . 4005 1982 4001 4002 . . 4005 1983 4001 4002 . . 4005 1984 4001 4002 . . 4005 1985 4001 4002 . . 4005 1986 4001 4002 . . 4005 1987 4001 4002 . . 4005 1988 4001 4002 . . 4005 1989 4001 4002 . . 4005 1990 4001 4002 . . 4005 1991 4001 4002 . . 4005 1992 4001 4002 . . 4005 1993 4001 4002 . . 4005 1994 4001 4002 . . 4005 1995 4001 4002 . . 4005 1996 4001 4002 . . 4005 1997 4001 4002 . . 4005 1999 4001 4002 . . 4030 1968 . 4005 . . 4030 1969 . 4005 . . 4030 1970 . 4005 . . 4030 1971 . 4005 . . 4030 1972 . 4005 . . 4030 1973 . 4005 . . 4030 1974 . 4005 . . 4030 1975 . 4005 . . 4030 1976 . 4005 . . 4030 1977 . 4005 . . 4030 1978 . 4005 . . 4030 1979 . 4005 . . 4030 1980 . 4005 . . 4030 1981 . 4005 . . 4030 1982 . 4005 . . 4030 1983 . 4005 . . 4030 1984 . 4005 . . 4030 1985 . 4005 . . 4030 1986 . 4005 . . 4030 1987 . 4005 . . 4030 1988 . 4005 . . 4030 1989 . 4005 . . 4030 1990 . 4005 . . 4030 1991 . 4005 . . 4030 1992 . 4005 . . 4030 1993 . 4005 . . 4030 1994 . 4005 . . 4030 1995 . 4005 . . 4030 1996 . 4005 . . 4030 1997 . 4005 . . 4030 1999 . 4005 . . 4031 1968 4003 4173 . . 4031 1969 4003 4173 . . 4031 1970 4003 4173 . . 4031 1971 4003 4173 . . 4031 1972 4003 4173 . . 4031 1973 4003 4173 . . 4031 1974 4003 4173 . . end
Comment