Dear all,
I want to investigate which child in a sibling group takes over the care of a parent. The focus is on gender (of both the caregiving child and the siblings), but other characteristics of the children and siblings (e.g., education, employment status, own children, spatial proximity to the parent) will also be examined as influencing factors.
The dataset contains parents with all of their children and the children's characteristics. Following Grigoryeva (2017), I would like to first conduct an individual-level analysis: which factors influence whether a child takes on parental care, estimated separately for sons and daughters? In a second step, I would like to conduct a sibling-level analysis: do the characteristics of brothers and the characteristics of sisters influence the care time that brothers as a group and sisters as a group provide?
For the sibling-level analysis, I constructed summary statistics of the individual-level independent variables separately for sons and daughters: for dichotomous variables, the number of children with the characteristic of interest was counted; for continuous variables, the mean was calculated.
The dependent variables are the total days of care (absolute measure) provided by all sons in a sibling group and the total days of care (absolute measure) provided by all daughters in a sibling group. In addition, I created a standardized proportion (relative measure) from each: the sons' share of the care time of all siblings, divided by the proportion of sons in the sibling group (and the same for daughters). As an example: if a sibling group of three daughters and one son jointly provides 120 days of care, and the daughters provide 100 days of that care, then the daughters provide a share of 100/120. The share they would have provided under an equal division of caring labour is 3/4. The standardized proportion of daughters' care time is thus (100/120)/(3/4) = 1.11, meaning that the daughters jointly provide about 11 percent more care than they would have provided if care work in the sibship had been divided equally.
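To double-check the arithmetic of this example, the standardized proportion can be reproduced in Stata. This is only a sketch; the scalar names are made up for illustration and do not exist in my data.

```stata
* Worked example: standardized proportion of daughters' care time
scalar care_daught = 100      // days of care provided by the daughters
scalar care_all    = 120      // days of care provided by all siblings
scalar share_equal = 3/4      // daughters' share under an equal division of care
display (care_daught/care_all) / share_equal
```

The displayed value should be about 1.11. Values above 1 mean more care than under an equal division, values below 1 mean less.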
The dataset with individual- and sibship-level variables looks like the following example (shortened in both variables and observations). Individual-level variables are highlighted in blue, sibship-level variables are highlighted in red. Variables contained are:
child_helpfreq_abs - absolute care time of child (days per year)
child_helpfreq_rel - standardized relative care time of child, calculated as (child_helpfreq_abs/child_helpfreq_all)/(1/number of children in sibship); same principle as at the sibship level (see explanation above)
child_helpfreq_all - total care time of all children in a sibship (auxiliary variable for the calculation of standardized shares)
child_age - age of child
child_faraway - child lives far away from parent (yes/no)
nr_sons/nr_daught - number of sons/daughters in the sibship
sons_helpfreq_abs / daught_helpfreq_abs - total care time (days per year) of all sons/daughters in a sibship
sons_helpfreq_rel / daught_helpfreq_rel - standardized share of care time of all sons/daughters in a sibship (see explanation above)
sons_meanage / daught_meanage - mean age of sons/daughters in a sibship
sons_faraway / daught_faraway - number of sons/daughters in a sibship living far away from the parent
parentid | childid | childsex | child_helpfreq_abs | child_helpfreq_rel | child_helpfreq_all | child_age | child_faraway | nr_sons | nr_daught | sons_helpfreq_abs | sons_helpfreq_rel | daught_helpfreq_abs | daught_helpfreq_rel | sons_meanage | sons_faraway | daught_meanage | daught_faraway |
1 | 1-1 | female | 52 | 1,8 | 58 | 37 | yes | 1 | 1 | 6 | 0,2 | 52 | 1,8 | 39 | 0 | 37 | 1 |
1 | 1-2 | male | 6 | 0,2 | 58 | 39 | no | 1 | 1 | 6 | 0,2 | 52 | 1,8 | 39 | 0 | 37 | 1 |
2 | 2-1 | female | 0 | 0 | 0 | 28 | no | 0 | 1 | 0 | . | 0 | 0 | . | 0 | 28 | 0 |
3 | 3-1 | male | 12 | 0,08 | 429 | 25 | yes | 2 | 1 | 64 | 0,2 | 365 | 2,6 | 26,5 | 1 | 30 | 0 |
3 | 3-2 | female | 365 | 2,55 | 429 | 30 | no | 2 | 1 | 64 | 0,2 | 365 | 2,6 | 26,5 | 1 | 30 | 0 |
3 | 3-3 | male | 52 | 0,36 | 429 | 28 | no | 2 | 1 | 64 | 0,2 | 365 | 2,6 | 26,5 | 1 | 30 | 0 |
4 | 4-1 | male | 0 | 0 | 116 | 40 | yes | 2 | 2 | 52 | 0,9 | 64 | 1,1 | 28,5 | 1 | 45,5 | 1 |
4 | 4-2 | male | 52 | 1,8 | 116 | 17 | no | 2 | 2 | 52 | 0,9 | 64 | 1,1 | 28,5 | 1 | 45,5 | 1 |
4 | 4-3 | female | 52 | 1,8 | 116 | 48 | yes | 2 | 2 | 52 | 0,9 | 64 | 1,1 | 28,5 | 1 | 45,5 | 1 |
4 | 4-4 | female | 12 | 0,4 | 116 | 43 | no | 2 | 2 | 52 | 0,9 | 64 | 1,1 | 28,5 | 1 | 45,5 | 1 |
5 | 5-1 | male | 52 | 1 | 52 | 22 | no | 1 | 0 | 52 | 1 | 0 | . | 22 | 0 | . | 0 |
6 | 6-1 | female | 12 | 1,33 | 18 | 32 | no | 0 | 2 | 0 | . | 18 | 1 | . | 0 | 33,5 | 0 |
6 | 6-2 | female | 6 | 0,66 | 18 | 35 | no | 0 | 2 | 0 | . | 18 | 1 | . | 0 | 33,5 | 0 |
7 | 7-1 | male | 12 | 0,51 | 70 | 47 | no | 1 | 2 | 12 | 0,5 | 58 | 1,2 | 47 | 0 | 46,5 | 2 |
7 | 7-2 | female | 6 | 0,26 | 70 | 48 | yes | 1 | 2 | 12 | 0,5 | 58 | 1,2 | 47 | 0 | 46,5 | 2 |
7 | 7-3 | female | 52 | 2,23 | 70 | 45 | yes | 1 | 2 | 12 | 0,5 | 58 | 1,2 | 47 | 0 | 46,5 | 2 |
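For transparency, the sibship-level variables for sons could be rebuilt from the child-level rows roughly as follows (the daughters' variables work the same way). This is a sketch rather than my actual do-file: it assumes that childsex and child_faraway are string variables coded as in the table, and the helper variables male and far as well as the _chk suffix are hypothetical names.

```stata
* Helper indicators (assumes string variables coded as in the example table)
gen byte male = (childsex == "male")
gen byte far  = (child_faraway == "yes")

* Counts, totals and means within each sibship (one parentid = one sibship)
bysort parentid: egen nr_sons_chk      = total(male)
bysort parentid: egen sons_abs_chk     = total(cond(male, child_helpfreq_abs, .))
bysort parentid: egen sons_meanage_chk = mean(cond(male, child_age, .))
bysort parentid: egen sons_faraway_chk = total(male & far)

* Standardized share: sons' share of all care divided by sons' share of children
gen sons_rel_chk = (sons_abs_chk / child_helpfreq_all) / (nr_sons_chk / (nr_sons + nr_daught))
```

In sibships without sons, the mean is missing while the totals are 0, which matches the pattern in the table. Because Stata returns missing for a division by zero, this sketch gives a missing share for a sibship with zero total care, whereas the table shows 0 in that case.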
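Since sibship-level values are by construction the same within a sibling group, I am now a bit confused about which observations to include in my analyses. The regression commands at the sibship level would look like this:
Code:
reg sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught
reg daught_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught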
Basically, I have three thoughts:
1) Take one child per sibling group and use this sample to estimate both sons' and daughters' care time. (Reasoning: because all sibship-level values within a sibling group are the same, using only one observation is enough.)
2) Use the observations of sons for the estimation of sons' care time, and the observations of daughters for the estimation of daughters' care time, i.e.:
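Code:
* assumes childsex is a string variable; with a labelled numeric variable, compare against the numeric code instead
reg sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught if childsex == "male"
reg daught_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught if childsex == "female"
Reasoning: some kind of weighting by the number of sons and daughters, respectively, in each estimation (?).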
3) Use all observations for the estimation of both sons' and daughters' care time. Would this amount to a kind of weighting, since larger sibling groups enter the analysis with more (identical) observations? Do I have to cluster the standard errors because observations within a sibship are more similar than observations between sibships?
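In Stata, options 1 and 3 might be written as follows for the sons' equation. This is only how I imagine the commands, not a claim about which option is right; one_per_sibship is a hypothetical variable name.

```stata
* Option 1: keep a single row per sibship
egen byte one_per_sibship = tag(parentid)
reg sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught if one_per_sibship

* Option 3: use all rows, with standard errors clustered on the sibship
reg sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught, vce(cluster parentid)
```

As far as I can tell, the coefficients from option 3 should equal those from option 1 with each sibship weighted by its number of children, because every sibship simply appears as that many identical rows.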
I'm just not sure how to do it right and I'm really confused now. As far as I understand, Grigoryeva (2017) used, for the estimation of sons' care time, all observations from sibships with at least one son (meaning that male only children, male-only sibships and mixed-gender sibships are included), and, for the estimation of daughters' care time, all observations from sibships with at least one daughter. As a consequence, the two estimations have different sample sizes. I honestly don't get how she did this, because at least the mean values among the independent variables for sons or daughters, respectively, cannot be calculated for only children and for same-sex sibships (if there is no sister or no brother, the mean age of sisters or brothers in the sibship cannot be calculated, resulting in a missing value). As a result, only mixed-gender sibships can be used for analyses at the sibship level. That being said, I still don't know what the correct approach for the estimation is ...
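To see how many sibships are affected by this, a check along the following lines might help. It is a sketch only; sib_first and mixed are hypothetical variable names, and it assumes that nr_sons and nr_daught have no missing values.

```stata
* One row per sibship, then classify sibships by gender composition
egen byte sib_first = tag(parentid)
gen byte mixed = (nr_sons > 0 & nr_daught > 0)
tabulate mixed if sib_first

* Sibships that reg would drop because the sons' or daughters' mean age is missing
count if sib_first & missing(sons_meanage, daught_meanage)
```

Since reg uses only complete cases, every sibship counted in the last line would be excluded from both sibship-level regressions as I specified them above.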
Sorry for the long text and thank you for reading it. Any thought, hint, and/or literature reference is welcome! Thanks!