
  • Issues estimating two separate fixed-effects models on panel data: one for men only and one for women only

    I am using panel data from the NLSY 1979 cohort to look at the effect of marital status on BMI, and I am having trouble using 'xtreg' with an 'if' qualifier.
    Before using 'if' to run the regression on two subsets, I ran it on the whole data set with this code:
    Code:
    xtset caseid_1979 year 
    xtreg lnBMI married nevermarried nolongermarried highestgradecompleted inschool employmentstatus income healthlimitation numkid pregnancy black hispanic white ageatint_ age2 urban_rural_ male female i.year, fe
    and I got these results:

    Code:
    Fixed-effects (within) regression               Number of obs     =      1,528
    Group variable: caseid_1979                     Number of groups  =      1,260
    
    R-squared:                                      Obs per group:
         Within  = 0.3333                                         min =          1
         Between = 0.0208                                         avg =        1.2
         Overall = 0.0259                                         max =          4
    
                                                    F(17,251)         =       7.38
    corr(u_i, Xb) = -0.1884                         Prob > F          =     0.0000
    
    ---------------------------------------------------------------------------------------
                    lnBMI | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    ----------------------+----------------------------------------------------------------
                  married |   -.013071   .0312928    -0.42   0.677    -.0747009     .048559
             nevermarried |  -.0023802   .0335475    -0.07   0.943    -.0684507    .0636903
          nolongermarried |          0  (omitted)
    highestgradecompleted |   .0101351   .0050233     2.02   0.045      .000242    .0200282
                 inschool |  -.0018558   .0116941    -0.16   0.874     -.024887    .0211753
         employmentstatus |   .0023946   .0090112     0.27   0.791    -.0153525    .0201417
                   income |   1.28e-07   2.96e-07     0.43   0.666    -4.56e-07    7.12e-07
         healthlimitation |  -.0154831   .0248235    -0.62   0.533     -.064372    .0334057
                   numkid |   .0435459   .0121799     3.58   0.000     .0195582    .0675337
                pregnancy |   .1017986   .0506415     2.01   0.045     .0020622    .2015349
                    black |          0  (omitted)
                 hispanic |          0  (omitted)
                    white |          0  (omitted)
                ageatint_ |  -.0581173   .0256915    -2.26   0.025    -.1087156    -.007519
                     age2 |   .0009748   .0003668     2.66   0.008     .0002524    .0016973
             urban_rural_ |   .0331079   .0196362     1.69   0.093    -.0055649    .0717807
                          |
                     year |
                    1985  |   .0607866   .0535251     1.14   0.257    -.0446291    .1662022
                    1986  |   .1129854   .0732124     1.54   0.124    -.0312036    .2571744
                    1988  |   .1312726   .1141291     1.15   0.251    -.0935001    .3560453
                    1990  |   .1568073   .1491202     1.05   0.294     -.136879    .4504935
                    1992  |   .0985041    .184461     0.53   0.594    -.2647844    .4617926
                          |
                    _cons |   3.657491     .41045     8.91   0.000     2.849126    4.465856
    ----------------------+----------------------------------------------------------------
                  sigma_u |  .17233168
                  sigma_e |  .07157705
                      rho |  .85287032   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------------
    F test that all u_i=0: F(1259, 251) = 6.41                   Prob > F = 0.0000
    My next step was to run the regression for females only. I used this code:
    Code:
    xtset caseid_1979 year 
    xtreg lnBMI married nevermarried nolongermarried highestgradecompleted inschool employmentstatus income healthlimitation numkid pregnancy black hispanic white ageatint_ age2 urban_rural_ male female i.year if female==1, fe
    which gave me the exact same results as the first regression:

    Code:
    Fixed-effects (within) regression               Number of obs     =      1,528
    Group variable: caseid_1979                     Number of groups  =      1,260
    
    R-squared:                                      Obs per group:
         Within  = 0.3333                                         min =          1
         Between = 0.0208                                         avg =        1.2
         Overall = 0.0259                                         max =          4
    
                                                    F(17,251)         =       7.38
    corr(u_i, Xb) = -0.1884                         Prob > F          =     0.0000
    
    ---------------------------------------------------------------------------------------
                    lnBMI | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    ----------------------+----------------------------------------------------------------
                  married |   -.013071   .0312928    -0.42   0.677    -.0747009     .048559
             nevermarried |  -.0023802   .0335475    -0.07   0.943    -.0684507    .0636903
          nolongermarried |          0  (omitted)
    highestgradecompleted |   .0101351   .0050233     2.02   0.045      .000242    .0200282
                 inschool |  -.0018558   .0116941    -0.16   0.874     -.024887    .0211753
         employmentstatus |   .0023946   .0090112     0.27   0.791    -.0153525    .0201417
                   income |   1.28e-07   2.96e-07     0.43   0.666    -4.56e-07    7.12e-07
         healthlimitation |  -.0154831   .0248235    -0.62   0.533     -.064372    .0334057
                   numkid |   .0435459   .0121799     3.58   0.000     .0195582    .0675337
                pregnancy |   .1017986   .0506415     2.01   0.045     .0020622    .2015349
                    black |          0  (omitted)
                 hispanic |          0  (omitted)
                    white |          0  (omitted)
                ageatint_ |  -.0581173   .0256915    -2.26   0.025    -.1087156    -.007519
                     age2 |   .0009748   .0003668     2.66   0.008     .0002524    .0016973
             urban_rural_ |   .0331079   .0196362     1.69   0.093    -.0055649    .0717807
                     male |          0  (omitted)
                   female |          0  (omitted)
                          |
                     year |
                    1985  |   .0607866   .0535251     1.14   0.257    -.0446291    .1662022
                    1986  |   .1129854   .0732124     1.54   0.124    -.0312036    .2571744
                    1988  |   .1312726   .1141291     1.15   0.251    -.0935001    .3560453
                    1990  |   .1568073   .1491202     1.05   0.294     -.136879    .4504935
                    1992  |   .0985041    .184461     0.53   0.594    -.2647844    .4617926
                          |
                    _cons |   3.657491     .41045     8.91   0.000     2.849126    4.465856
    ----------------------+----------------------------------------------------------------
                  sigma_u |  .17233168
                  sigma_e |  .07157705
                      rho |  .85287032   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------------
    F test that all u_i=0: F(1259, 251) = 6.41                   Prob > F = 0.0000

    Also, before I noticed that the two sets of results were identical, I had a problem running the regression for males. I used:

    Code:
    xtset caseid_1979 year 
    xtreg lnBMI married nevermarried nolongermarried highestgradecompleted inschool employmentstatus income healthlimitation numkid pregnancy black hispanic white ageatint_ age2 urban_rural_ female male i.year if male==1, fe
    and received the error 'insufficient observations'.
    I am not sure what mistake leads to identical regression results when I add an 'if' qualifier.



  • #2
    It seems that your sample contains only female respondents.
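
    One way to check this directly is to tabulate the gender indicators, both over all observations and over persons. This is a minimal sketch that assumes male and female are 0/1 indicators as used in #1; person_tag is a hypothetical helper variable, not something in the original data set.

    ```stata
    * Cross-tabulate the two gender indicators, showing missing values too
    tabulate male female, missing

    * Tag one observation per person (person_tag is a hypothetical new variable)
    egen byte person_tag = tag(caseid_1979)

    * Number of distinct persons by gender
    tabulate female if person_tag, missing
    ```

    These counts cover the whole data set. xtreg additionally drops any observation with a missing value on a model variable, so the estimation sample can be smaller than what is shown here.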



    • #3
      Sadie:
      Your -if- clause in the second code is wrong. You cannot have both -female- and -male- coded 1 in a categorical variable.
      In addition, why run two separate -xtreg, fe- regressions instead of adding gender as a predictor on the right-hand side of your regression equation?
      Kind regards,
      Carlo
      (StataNow 18.5)
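
      A pooled specification along these lines might look like the sketch below. It assumes that married and nevermarried are 0/1 indicators with nolongermarried as the omitted reference category, and it leaves out the race indicators because they do not vary within person. In a fixed-effects model the main effect of a time-invariant variable such as male is absorbed by the person effects, so only its interactions with time-varying regressors can be estimated.

      ```stata
      xtset caseid_1979 year

      * Pooled fixed-effects model with gender-by-marital-status interactions.
      * Stata will report the main effect of male as omitted (no within-person variation);
      * the male#married and male#nevermarried terms carry the gender difference.
      xtreg lnBMI i.male##i.married i.male##i.nevermarried ///
          highestgradecompleted inschool employmentstatus income healthlimitation ///
          numkid pregnancy ageatint_ age2 urban_rural_ i.year, fe
      ```

      The interaction terms are identified only if both men and women appear in the estimation sample.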



      • #4
        So I included only the variable 'male' in the regression and dropped 'female' completely, just in case. Using this code:
        Code:
        xtreg lnBMI married nevermarried nolongermarried highestgradecompleted inschool employmentstatus income healthlimitation numkid pregnancy black hispanic white ageatint_ age2 urban_rural_ i.male#i.year, fe
        still gives me the exact same results as those I was getting before.



        • #5
          Sadie Belechak: Pay attention to Oyvind Snilsberg's response in #2. There is considerable evidence that you have no usable data on males:

          1. In your first regression, both the male and female variables are included in your regression command, but neither of them shows up in the output. You would expect Stata to omit one of them because male and female would be collinear (male + female = 1, always). But if the other one is also omitted, then it must be due either to collinearity with something else in the model or to there being no observations.

          2. Your second regression with -if female == 1- gives the same exact sample size as the full regression. That means that there were no male observations to omit.

          3. Your third regression with -if male == 1- returns an "insufficient observations" error message.

          Stata is screaming at you that you have no male data. I have never known Stata to be wrong about such things. Now, you may think you have lots of male observations in your data set, and that may well be the case. But you have none that are usable in the regression. Remember that an observation will only be used if there are no missing values for any variable mentioned in the regression command. It seems likely that all of the male observations in your data set have a missing value for at least one of the other variables in the regression. So you can verify what's going on with:

          Code:
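          * Start with every observation flagged as usable (usable = 1)
          mark usable
          * Set usable = 0 wherever any of the model variables is missing
          markout usable lnBMI married nevermarried nolongermarried highestgradecompleted inschool ///
              employmentstatus income healthlimitation numkid pregnancy black hispanic white ageatint_ ///
              age2 urban_rural_ year
          * Number of male observations that could enter the regression
          count if male == 1 & usable == 1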
          So your data set is not what you think it is, and you need to fix it.
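
          To narrow down which variable is responsible, a possible follow-up (a sketch, not part of the original reply) is to look at missing values among male observations only. misstable summarize reports, for each listed variable that has any, how many observations are missing. A variable that is missing for every male observation would explain the result; pregnancy is one candidate worth checking, although that is a guess and not something the thread confirms.

          ```stata
          * Missing-value counts for the model variables, restricted to male observations
          misstable summarize lnBMI married nevermarried nolongermarried highestgradecompleted ///
              inschool employmentstatus income healthlimitation numkid pregnancy black hispanic ///
              white ageatint_ age2 urban_rural_ year if male == 1

          * Total number of male observations, for comparison with the missing counts
          count if male == 1
          ```

          If one variable's missing count equals the total number of male observations, that variable alone removes every man from the estimation sample.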
          Last edited by Clyde Schechter; 19 Mar 2022, 13:01.

