  • OLS/Random Effects/Mundlak Interaction term

    Hi, I am currently using a panel dataset to identify ethnic pay differentials. My dependent variable is log monthly income and my explanatory variables include race and number of dependent children, amongst other control factors. I have completed my regression using OLS, random effects and the Mundlak approach.

    I am also attempting to look at the potential additional effect of increasing the number of dependent children by ethnicity. Therefore, I have estimated my regression and used the following command to estimate the effect of my interaction term which is race*dependent_children:
    margins race, dydx(dependentchildren)

    I wanted to see if this was the best method to use to estimate this and how to interpret the dy/dx below? In my regression analysis the base group was 'british/english/scottish/welsh/northern irish (white)', which was therefore excluded from the regressions, therefore do my results need to be interpreted relative to the base group still?

    Also, the coefficients in my regression analysis on each race category and dependent children were both negative, whereas margins has generated positive results. Could there be a potential reason for this?

    Table:
    dy/dx w.r.t. : dependentchildren

    race                                                  |     dy/dx Std. Err.      t  P>|t| [95% Conf. Interval]
    ------------------------------------------------------+-------------------------------------------------------
    british/english/scottish/welsh/northern irish (white) |  .0668368   .004582  14.59  0.000   .0578556   .075818
    indian (asian or asian british)                       |  .0315714   .012957   2.44  0.015   .0061745  .0569683
    pakistani (asian or asian british)                    |  .0367387  .0134726   2.73  0.006   .0103312  .0631462
    bangladeshi (asian or asian british)                  |  .0325315  .0142156   2.29  0.022   .0046676  .0603954
    any other ethnic group (other ethnic group)           |  .0470928  .0127087   3.71  0.000   .0221825  .0720031



    Thanks

  • #2
    Originally posted by Guest
    I have completed my regression using OLS, random effects and the Mundlak approach.
    Stating this in terms of the Stata commands used makes it easier for others to advise. I assume that by OLS you mean a pooled linear regression model, implemented in regress in Stata (note that OLS is an estimator, not a model; for example, xtreg , fe is implemented by running OLS on transformed data).

    I do not know how you estimated the "Mundlak" model; I assume you have included mean values of your predictors at the panel level. Note that an interaction term in such models requires including the respective means of the lower-order terms and the mean of the interaction, i.e., of the multiplied variables. Therefore, you cannot simply use factor-variable notation, which margins requires to work correctly. Even without an interaction term, I would not count on margins to work correctly with the Mundlak approach.
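
    To make the point about interaction terms concrete, here is a minimal sketch of a hand-built Mundlak-type specification. It is an illustration only, not the original poster's model: it uses the nlswork example data (loaded with webuse, which needs internet access), with race as the time-constant group variable and tenure standing in for the time-varying predictor. The names m_tenure, black_tenure, other_tenure and their m_ counterparts are hypothetical.

    Code:
    webuse nlswork, clear
    xtset idcode year
    * panel-level mean of the lower-order term
    bysort idcode: egen double m_tenure = mean(tenure)
    * hand-made interactions (the multiplied variables), one per non-reference group
    generate double black_tenure = (race == 2) * tenure if !missing(race)
    generate double other_tenure = (race == 3) * tenure if !missing(race)
    * panel-level means of the interactions
    bysort idcode: egen double m_black_tenure = mean(black_tenure)
    bysort idcode: egen double m_other_tenure = mean(other_tenure)
    * random-effects model with the panel-level means as additional regressors
    xtreg ln_wage i.race tenure black_tenure other_tenure m_tenure m_black_tenure m_other_tenure, re vce(cluster idcode)

    Because the interactions are ordinary variables here, margins has no way of knowing that black_tenure and other_tenure depend on tenure; group-specific slopes are probably better obtained with lincom, e.g., lincom tenure + black_tenure. The sketch also computes the means over all available observations, which may differ slightly from the estimation sample.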


    Originally posted by Guest
    Also, the coefficients in my regression analysis on each race category and dependent children were both negative, whereas margins has generated positive results. Could there be a potential reason for this?
    Better to show your regression results (and command), too. Use code delimiters to do this.

    Best
    Daniel
    Last edited by sladmin; 14 May 2019, 07:44. Reason: anonymize original poster


    • #3
      The following shows the command used for the OLS regression for males:

      regress lnincome i.race age age_squared i.region i.highestqualification i.fulltime_parttime i.occupation dependentchildren i.marital i.religion race##c.dependentchildren i.wave if sex==1, cluster(pidp)

      margins race, dydx(dependentchildren)


      • #4
        Thanks for providing code.

        Concerning syntax, it is probably not a good idea to have age_squared; better use c.age##c.age.
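
        A small sketch of why this matters, using the auto dataset with mpg in place of age (mpg_sq is a hypothetical name): with a hand-made square, margins treats the squared term as an unrelated variable, whereas with factor-variable notation it differentiates through the squared term.

        Code:
        sysuse auto, clear
        * hand-made square: margins only sees the coefficient on mpg
        generate mpg_sq = mpg^2
        regress price mpg mpg_sq
        margins, dydx(mpg)
        * factor-variable notation: dy/dx is averaged over _b[mpg] + 2*_b[c.mpg#c.mpg]*mpg
        regress price c.mpg##c.mpg
        margins, dydx(mpg)

        The two regressions fit the same model, but only the second margins call reports the average marginal effect of mpg.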

        Turning to your questions

        I wanted to see[...] how to interpret the dy/dx below? In my regression analysis the base group was 'british/english/scottish/welsh/northern irish (white)', which was therefore excluded from the regressions, therefore do my results need to be interpreted relative to the base group still?
        In your margins command, the dy/dx estimate for your base/reference group is exactly the same as the (conditional) main effect in your regression table. Here is an example, using the auto dataset.

        Code:
        sysuse auto
        regress price i.rep78##c.mpg
        margins rep78 , dydx(mpg)
        which produces

        Code:
        . sysuse auto
        (1978 Automobile Data)
        
        . regress price i.rep78##c.mpg
        
              Source |       SS           df       MS      Number of obs   =        69
        -------------+----------------------------------   F(9, 59)        =      3.65
               Model |   206362465         9  22929162.7   Prob > F        =    0.0011
            Residual |   370434494        59  6278550.75   R-squared       =    0.3578
        -------------+----------------------------------   Adj R-squared   =    0.2598
               Total |   576796959        68  8482308.22   Root MSE        =    2505.7
        
        ------------------------------------------------------------------------------
               price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
               rep78 |
                  2  |   10881.12   13452.68     0.81   0.422    -16037.64    37799.87
                  3  |   8973.281   12725.58     0.71   0.483    -16490.55    34437.11
                  4  |   651.7399    12823.1     0.05   0.960    -25007.23    26310.71
                  5  |   4363.191   12794.52     0.34   0.734    -21238.58    29964.96
                     |
                 mpg |  -123.1667      590.6    -0.21   0.836    -1304.955    1058.621
                     |
         rep78#c.mpg |
                  2  |  -507.6563   642.1123    -0.79   0.432     -1792.52    777.2075
                  3  |  -375.7209   601.1921    -0.62   0.534    -1578.704    827.2618
                  4  |   43.26329   603.3025     0.07   0.943    -1163.942    1250.469
                  5  |  -81.52802     597.53    -0.14   0.892    -1277.183    1114.127
                     |
               _cons |       7151   12528.52     0.57   0.570    -17918.51    32220.51
        ------------------------------------------------------------------------------
        
        . margins rep78 , dydx(mpg)
        
        Average marginal effects                        Number of obs     =         69
        Model VCE    : OLS
        
        Expression   : Linear prediction, predict()
        dy/dx w.r.t. : mpg
        
        ------------------------------------------------------------------------------
                     |            Delta-method
                     |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        mpg          |
               rep78 |
                  1  |  -123.1667      590.6    -0.21   0.836    -1304.955    1058.621
                  2  |   -630.823   251.9918    -2.50   0.015    -1135.057   -126.5885
                  3  |  -498.8875   112.3547    -4.44   0.000    -723.7088   -274.0662
                  4  |  -79.90338   123.1486    -0.65   0.519    -326.3232    166.5164
                  5  |  -204.6947   90.73959    -2.26   0.028    -386.2642   -23.12517
        ------------------------------------------------------------------------------
        Notice that the (conditional) main effect for mpg in the regression table is exactly the same as the one reported by margins for the reference group, i.e., rep78==1. The remaining dy/dx can be estimated from the regression output as follows

        Code:
        display _b[mpg]+_b[2.rep78#mpg]
        display _b[mpg]+_b[3.rep78#mpg]
        display _b[mpg]+_b[4.rep78#mpg]
        display _b[mpg]+_b[5.rep78#mpg]
        which gives

        Code:
        . display _b[mpg]+_b[2.rep78#mpg]
        -630.82301
        
        . display _b[mpg]+_b[3.rep78#mpg]
        -498.88754
        
        . display _b[mpg]+_b[4.rep78#mpg]
        -79.903382
        
        . display _b[mpg]+_b[5.rep78#mpg]
        -204.69468
        So, the results that you get from margins are no longer interpreted as differences from the reference group; each dy/dx is the effect of mpg within the respective group.
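
        The display commands above return point estimates only. As a small addition, lincom should report the same sum together with a standard error and confidence interval; the sketch below does this for rep78==2.

        Code:
        sysuse auto, clear
        regress price i.rep78##c.mpg
        * slope of mpg for rep78==2: main effect plus interaction coefficient
        lincom mpg + 2.rep78#c.mpg

        The interaction coefficients in the regression table themselves remain differences from the reference group; margins is simply the more convenient route when the slopes of all groups are of interest.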

        Whether this is a good approach to answering your question about

        [...] the potential additional effect of increasing the number of dependent children by ethnicity.
        I cannot tell for sure because I do not fully understand that question. If you want to look at all differences in the effect of children among ethnic groups, add option pwcompare to your margins command. If you want something else, please try and rephrase your question.
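
        For illustration, this is roughly what that looks like in the auto example from above; the effects suboption is an optional extra that adds test statistics and p-values to the confidence intervals.

        Code:
        sysuse auto, clear
        regress price i.rep78##c.mpg
        * all pairwise differences between the group-specific slopes of mpg
        margins rep78, dydx(mpg) pwcompare(effects)

        The comparisons against rep78==1 should match the interaction coefficients in the regression table; the remaining rows compare the non-reference groups with each other.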

        Best
        Daniel


        • #5
          Thank you

          My question is to understand the effect of children among ethnic groups, which is why I created an interaction term between race and number of dependent children.

          How would I add pwcompare to: margins race, dydx(dependentchildren)

          Also, the results of my regression table are shown below:


          lnincome                 Coef.  St.Err.  t-value  p-value [95% Conf. Interval]  Sig
          White*                   0.000        .        .        .         .          .
          Indian                  -0.183    0.019    -9.79    0.000    -0.219     -0.146  ***
          Pakistani               -0.340    0.024   -14.10    0.000    -0.387     -0.292  ***
          Bangladeshi             -0.327    0.031   -10.73    0.000    -0.387     -0.268  ***
          Other                   -0.206    0.017   -12.20    0.000    -0.239     -0.173  ***
          Age                      0.087    0.003    34.29    0.000     0.082      0.092  ***
          Age Squared             -0.001    0.000   -30.91    0.000    -0.001     -0.001  ***
          North*                   0.000        .        .        .         .          .
          Midlands                 0.012    0.014     0.85    0.394    -0.016      0.040
          East                     0.077    0.018     4.39    0.000     0.043      0.111  ***
          London & South           0.109    0.012     9.09    0.000     0.086      0.133  ***
          W, S & NI                0.023    0.013     1.71    0.088    -0.003      0.049  *
          Higher Degree*           0.000        .        .        .         .          .
          1st Degree              -0.129    0.015    -8.41    0.000    -0.159     -0.099  ***
          A-Level                 -0.213    0.019   -10.96    0.000    -0.251     -0.175  ***
          Vocational              -0.264    0.024   -10.93    0.000    -0.311     -0.216  ***
          GCSE                    -0.292    0.016   -18.04    0.000    -0.323     -0.260  ***
          >GCSE                   -0.362    0.022   -16.80    0.000    -0.404     -0.319  ***
          None                    -0.339    0.018   -19.35    0.000    -0.373     -0.304  ***
          Full-time                0.000        .        .        .         .          .
          Part-time               -0.960    0.015   -63.15    0.000    -0.989     -0.930  ***
          Professional             0.000        .        .        .         .          .
          Intermediate            -0.459    0.014   -32.57    0.000    -0.486     -0.431  ***
          Routine                 -0.356    0.010   -35.06    0.000    -0.375     -0.336  ***
          Dependent children      -0.003    0.005    -0.54    0.590    -0.012      0.007
          Not married*             0.000        .        .        .         .          .
          Married                  0.103    0.011     9.30    0.000     0.081      0.125  ***
          Wave 1*                  0.000        .        .        .         .          .
          Wave 2                   0.000    0.008    -0.01    0.991    -0.016      0.016
          Wave 3                   0.028    0.008     3.32    0.001     0.012      0.045  ***
          Wave 4                   0.018    0.009     2.05    0.041     0.001      0.036  **
          Wave 5                   0.037    0.009     4.11    0.000     0.019      0.055  ***
          Wave 6                   0.075    0.009     8.14    0.000     0.057      0.094  ***
          Wave 7                   0.082    0.010     8.49    0.000     0.063      0.100  ***
          Constant                 6.086    0.051   120.47    0.000     5.987      6.185  ***

          Mean dependent var           7.445    SD dependent var              0.899
          R-squared                    0.430    Number of obs             61378.000
          F-test                     727.880    Prob > F                      0.000
          Akaike crit. (AIC)      126712.493    Bayesian crit. (BIC)     126965.187
          *** p<0.01, ** p<0.05, * p<0.1


          • #6
            The regression table above is unclear so I have attached this here:
            Attached Files


            • #7
              Originally posted by Guest
              My question is to understand the effect of children among ethnic groups
              Then your margins command might indeed be well suited as it is.

              Originally posted by Guest
              How would I add pwcompare to: margins race, dydx(dependentchildren)
              Just add the option

              Code:
              margins race , dydx(dependentchildren) pwcompare
              Best
              Daniel


              • #8
                Run findit mundlak and findit xthybrid from the command line. These papers are quite relevant:

                Schunck R. 2013. Within and between estimates in random-effects models: Advantages and drawbacks of correlated random effects and hybrid models. Stata Journal 13(1): 65-76.
                Schunck R, Perales F. 2017. Within- and between-cluster effects in generalized linear mixed models: A discussion of approaches and the xthybrid command. Stata Journal 17(1): 89-115.


                • #9
                  Using the following command, I have produced the attached table.

                  regress lnincome i.race age age_squared i.region i.highestqualification i.fulltime_parttime i.occupation i.marital i.religion race##i.dependentchildren i.wave if sex==1, cluster(pidp)
                  margins race, dydx(dependentchildren) pwcompare

                  I wanted to check with the following interpretation of the results:

                  In the following table, I have used a categorical version of dependent children and have only shown results for '1.dependent children', which shows the effect of increasing the number of dependent children from 0 to 1 (as 0 dependent children is the base group here).

                  I am attempting to interpret the highlighted coefficients. For example, does -0.034 mean that increasing the number of dependent children from 0 to 1 leads to a 3.4% decrease in earnings for the Indian group, relative to the impact of the same increase on the British group?

                  And does -0.149 mean that increasing the number of dependent children from 0 to 1 leads to a 14.9% decrease in earnings for the Pakistani group, relative to the impact of the same increase on the British group?


                  *Race is split into 5 categories: British, Indian, Pakistani, Bangladeshi and Other


                  Thanks


                  Attached Files
