  • Regression on individual and sibship level - what observations to include?

    Dear all,

    I want to investigate which child in a sibling group takes over the care of a parent. The focus is on gender (of both the caregiving child and the siblings), but other characteristics of the children and siblings (e.g., education, employment status, own children, spatial proximity to the parent) will also be examined as influencing factors.

    The dataset contains parents with all of their children and the children's characteristics. Following Grigoryeva (2017), I would like to first conduct an individual-level analysis (which factors influence whether a child takes on parental care, estimated separately for sons and daughters) and, in a second step, a sibling-level analysis (do the characteristics of brothers and of sisters influence the total care time of the brothers and of the sisters in a sibling group, each treated as a gender group?).

    For the sibling-level analysis, summary statistics of the individual-level independent variables were constructed separately for sons and daughters: for dichotomous variables, the number of children with the characteristic of interest was summed; for continuous variables, the mean was calculated.
    The dependent variables are the total days of care (absolute measure) provided by all sons in a sibling group and the total days of care (absolute measure) provided by all daughters in a sibling group. In addition, a standardized proportion (relative measure) was created for each by dividing the sons' share of the care time of all siblings by the share of sons in the sibling group (and likewise for daughters). As an example: if a sibling group of one son and three daughters jointly provides 120 days of care, and the daughters provide 100 of those days, then the daughters provide a share of 100/120. The share they would have provided under an equal division of caring labour is 3/4. The standardized proportion of daughters' care time is thus (100/120)/(3/4) = 1.11, meaning that the daughters jointly provide about 11 percent more care than they would have if the care work in the sibship had been divided equally.
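    Written as a formula, the standardized proportion might look as follows. The symbols are introduced here only for illustration and are not taken from Grigoryeva (2017): C_D is the total care days of the daughters, C the total care days of all siblings, n_D the number of daughters, and n the sibship size.

    ```latex
    \[
      r_D \;=\; \frac{C_D / C}{\,n_D / n\,},
      \qquad
      \text{e.g.}\quad \frac{100/120}{3/4} \approx 1.11 .
    \]
    ```

    The measure for sons is defined analogously, with C_S and n_S in place of C_D and n_D. A value of 1 corresponds to an equal division of care within the sibship.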

    The data set with individual- and sibling-level variables looks like the following (shortened in variables and observations) example. Individual-level variables are highlighted in blue, sibship-level variables are highlighted in red. Variables contained are:
    child_helpfreq_abs - absolute care time of child (days per year)
    child_helpfreq_rel - standardized relative care time of child (calculation: (child_helpfreq_abs/child_helpfreq_all)/(1/nr of children in sibship)) (same principle as on sibship level (see explanation above))
    child_helpfreq_all - total care time of all children in a sibship (auxiliary variable for calculation of standardized shares)
    child_age - age of child
    child_faraway - child lives far away from parent (yes/no)
    nr_sons/nr_daught - number of sons/daughters in the sibship
    sons_helpfreq_abs / daught_helpfreq_abs - total care time (days per year) of all sons/daughters in a sibship
    sons_helpfreq_rel / daught_helpfreq_rel - standardized share of care time of all sons/daughters in a sibship (see explanation above)
    sons_meanage / daught_meanage - mean age of sons/daughters in a sibship
    sons_faraway / daught_faraway - number of sons/daughters in a sibship living far away from the parent
    parentid childid childsex child_helpfreq_abs child_helpfreq_rel child_helpfreq_all child_age child_faraway nr_sons nr_daught sons_helpfreq_abs sons_helpfreq_rel daught_helpfreq_abs daught_helpfreq_rel sons_meanage sons_faraway daught_meanage daught_faraway
    1 1-1 female 52 1,8 58 37 yes 1 1 6 0,2 52 1,8 39 0 37 1
    1 1-2 male 6 0,2 58 39 no 1 1 6 0,2 52 1,8 39 0 37 1
    2 2-1 female 0 0 0 28 no 0 1 0 . 0 0 . 0 28 0
    3 3-1 male 12 0,08 429 25 yes 2 1 64 0,2 365 2,6 26,5 1 30 0
    3 3-2 female 365 2,55 429 30 no 2 1 64 0,2 365 2,6 26,5 1 30 0
    3 3-3 male 52 0,36 429 28 no 2 1 64 0,2 365 2,6 26,5 1 30 0
    4 4-1 male 0 0 116 40 yes 2 2 52 0,9 64 1,1 28,5 1 45,5 1
    4 4-2 male 52 1,8 116 17 no 2 2 52 0,9 64 1,1 28,5 1 45,5 1
    4 4-3 female 52 1,8 116 48 yes 2 2 52 0,9 64 1,1 28,5 1 45,5 1
    4 4-4 female 12 0,4 116 43 no 2 2 52 0,9 64 1,1 28,5 1 45,5 1
    5 5-1 male 52 1 52 22 no 1 0 52 1 0 . 22 0 . 0
    6 6-1 female 12 1,33 18 32 no 0 2 0 . 18 1 . 0 33,5 0
    6 6-2 female 6 0,66 18 35 no 0 2 0 . 18 1 . 0 33,5 0
    7 7-1 male 12 0,51 70 47 no 1 2 12 0,5 58 1,2 47 0 46,5 2
    7 7-2 female 6 0,26 70 48 yes 1 2 12 0,5 58 1,2 47 0 46,5 2
    7 7-3 female 52 2,23 70 45 yes 1 2 12 0,5 58 1,2 47 0 46,5 2
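    For reference, the sibship-level variables for sons could be built from the individual-level ones roughly as below. This is a sketch under assumptions, not necessarily how the data were actually prepared: it assumes a dataset that holds only the individual-level variables, that childsex and child_faraway are string variables ("male"/"female", "yes"/"no"), and that the remaining variables are numeric. The daughters' variables would follow the same pattern.

    ```stata
    * Sketch only: assumes childsex and child_faraway are strings and that the
    * sibship-level variables do not exist yet in the dataset.
    bysort parentid: egen nr_sons   = total(childsex == "male")
    bysort parentid: egen nr_daught = total(childsex == "female")

    * Total care days per sibship: all children, and sons only
    bysort parentid: egen child_helpfreq_all = total(child_helpfreq_abs)
    bysort parentid: egen sons_helpfreq_abs  = total(child_helpfreq_abs * (childsex == "male"))

    * Number of sons living far away from the parent
    bysort parentid: egen sons_faraway = total(childsex == "male" & child_faraway == "yes")

    * Mean age of sons: cond() sets daughters' ages to missing and mean() ignores
    * missing values, so the result is missing when a sibship has no sons
    bysort parentid: egen sons_meanage = mean(cond(childsex == "male", child_age, .))

    * Standardized share: missing when there are no sons or no care at all,
    * because division by zero yields missing in Stata
    generate sons_helpfreq_rel = (sons_helpfreq_abs / child_helpfreq_all) / (nr_sons / (nr_sons + nr_daught))
    ```

    Note that sons_meanage is missing for daughter-only sibships under this construction, which is relevant for the sample question below.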
    Since sibship-level values are by construction the same within a sibling group, I am unsure which observations to include in my analyses. The regression commands at the sibship level would look like this:
    Code:
    reg sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught
    
    reg daught_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught
    Basically, I have three thoughts:

    1) Take one child per sibling group and use this sample to estimate both sons' and daughters' care time. (Reasoning: because all sibship-level values within a sibling group are identical, one observation per group is enough.)

    2) Use the observations of sons for the estimation of sons' care time, and the observations of daughters for the estimation of daughters' care time, i.e.:
    Code:
    * assumes childsex is a string variable; a labelled numeric variable would need its numeric code instead
    reg sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught if childsex=="male"
    
    reg daught_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught if childsex=="female"
    Reasoning: this would amount to some kind of weighting by the number of sons and daughters, respectively, in each estimation (?).

    3) Use all observations for the estimation of both sons' and daughters' care time. Would this be a kind of weighting (larger sibling groups enter the analysis with more (identical) observations)? Do I have to cluster the standard errors because observations within a sibship are more similar than observations from different sibships?
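    To make options 1 and 3 concrete, they could be written roughly as follows. This is only a syntax sketch using the variable names from the example data; one_per_sibship is a helper variable introduced here for illustration, and the sketch does not settle which option is appropriate.

    ```stata
    * Option 1: flag exactly one observation per sibship and estimate on those rows
    egen byte one_per_sibship = tag(parentid)
    regress sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught if one_per_sibship

    * Option 3: keep all children, but cluster the standard errors on the sibship
    regress sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught, vce(cluster parentid)
    ```

    In both cases, regress drops every observation with a missing value in any of the listed variables, so sibships without sons or without daughters fall out of the estimation sample whenever the corresponding mean is missing.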

    I'm just not sure how to do it right and I'm really confused now. As far as I understand, Grigoryeva (2017) used, for the estimation of sons' care time, all observations from sibships with at least one son (that is, sons without siblings, male-only sibships and mixed-gender sibships), and, for the estimation of daughters' care time, all observations from sibships with at least one daughter. As a result, she has different sample sizes for the two estimations. I honestly don't get how she did this, because the mean values of the sons' or daughters' independent variables cannot be calculated for children without siblings or for same-sex sibships (if there is no sister or no brother, the mean age of sisters or brothers in the sibship cannot be calculated, resulting in a missing value). As a result, only mixed-gender sibships can be used for analyses at the sibship level. That being said, I still don't know what the correct approach for the estimation is ...

    Sorry for the long text and thank you for reading it. Any thought, hint, and/or literature reference is welcome! Thanks!

  • #2
    May I push my question up again? Maybe the solution is very simple; a hint would be helpful. If any further information is needed, please tell me. Thank you very much!


    • #3
      Ariane:
      Overly long queries are challenging to read and are often skipped even by potentially interested listers.
      Bumping (which is also discouraged by the FAQ) is not helpful; a more concise post is possibly the way to go. Thanks.
      Kind regards,
      Carlo
      (StataNow 18.5)


      • #4
        Thanks for the hint, Carlo. I'm really sorry for my detailed description, I thought it would be easier to understand what my concern is if I make it more than clear. But I'll try a short version of it:

        I want to investigate the influence of brothers' and sisters' characteristics on the total care time of brothers and sisters, respectively, in a sibling group. The regression commands would look like these examples (shortened in variables):
        Code:
        reg sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught
        
        reg daught_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught
        Explanatory variables are summary statistics of the respective characteristic, meaning that sons_faraway is the number of sons within a sibling group living far away from the parent (likewise for all dichotomous variables) and sons_meanage is the mean age of all sons within a sibling group (likewise for all continuous variables).

        Since the values of these variables are by construction the same within a sibling group and I have all siblings of a sibship in the dataset, I am not sure which observations to include in my analyses. Do I take only one observation per sibship, because the values are equal for all siblings? Do I have to take all observations so that a kind of weighting enters the analysis (larger sibling groups are included with more (identical) observations)? Do I have to separate sons and daughters and use the observations of sons for the estimation of sons' care time, and the observations of daughters for the estimation of daughters' care time, i.e.:
        Code:
        * assumes childsex is a string variable; a labelled numeric variable would need its numeric code instead
        reg sons_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught if childsex=="male"
        
        reg daught_helpfreq_abs sons_meanage sons_faraway daught_meanage daught_faraway nr_sons nr_daught if childsex=="female"
        I am a little confused due to the mixture of sibship level measures and individual level observations, and not sure how to do it right. It would be great if anybody has an idea to share! Thanks!
