  • Testing equality of coefficients from two identical instrumented regressions (ivreg2) estimated on different samples

    Dear Statalist:

    I am trying to make a comparison similar to that in Columns 1 and 2 of Table 6 in the paper http://www.ericzwick.com/stimulus/stimulus.pdf, which I have reproduced here:

    [Image: zwicktab6.png, Table 6 from the paper]


    I am running the separate models using the code:

    Code:
    xi: ivreg2 outcome (treatment = instrument) control1 control2 i.year i.province if small == 1 [pw=triangle], cluster(clusterid)
    estimates store ModelA
    xi: ivreg2 outcome (treatment = instrument) control1 control2 i.year i.province if small == 0 [pw=triangle], cluster(clusterid)
    estimates store ModelB
    Here outcome is continuous, treatment is binary, and the instrument is binary. small = 1 for the bottom 3 deciles of sales (within year and province), 0 for the top 3 deciles, and missing otherwise. triangle holds triangular kernel weights.

    I would like to compare the coefficients on treatment to see if they are significantly different.

    1. The first part of my question is a Stata question: How do I properly code this comparison?

    What I have tried:

    I have found https://stats.idre.ucla.edu/stata/co...s-using-suest/, where the answer was to use suest. This does not appear to work with ivreg2: it throws errors because of my triangular kernel weights and because ivreg2 does not cluster via vce(cluster).
    (Specifically, my errors are "ModelA was estimated with pweights, you should re-estimate using iweights" and "ModelA was estimated with a nonstandard vce (cluster)".)

    Alternatively, I found https://www.stata.com/statalist/arch.../msg00487.html, which looked perfect.

    Code:
    gmm                                      ///
     (eq1: outcome - {b1}*treatment - {b2}*control1 - {b3}*control2 - {b0} if small == 1)       ///
     (eq2: outcome - {b1}*treatment - {b2}*control1 - {b3}*control2 - {b0} if small == 0),      ///
        instruments(eq1: instrument control1 control2)     ///
        instruments(eq2: instrument control1 control2)             ///
        onestep winitial(unadjusted, indep)

    But it seems that I am unable to use if qualifiers within the equation definitions in gmm. My error is "could not evaluate equation 1 r(498)."

    Am I correct that I cannot use these methods in my setting, or am I simply not setting them up correctly?


    2. The second part of my question is more of a statistical question, but is related to my first because it concerns another way I have tried to do sub-sample analysis:

    A final way I have found to do this is to estimate a pooled regression that includes interactions with my small dummy:

    Code:
    g treatxsmall = treatment*small
    g instrumentxsmall = instrument*small
    xi: ivreg2 outcome (treatment treatxsmall = instrument instrumentxsmall) small control1 control2 i.year i.province [pw = triangle] if small != ., cluster(clusterid)
    Am I correct in believing that the coefficient on treatxsmall will then tell me whether the difference between small firms and large firms is significant, i.e., could I report its p-value as the p-value shown in the table above?

    From looking at the replication code included with the published version of the paper, that appears to be what the authors do, i.e., run the sub-sample regressions and report their coefficients and standard errors and then run the pooled regression and report the p-value on the interaction term between the subsample indicator and treatment. However, their code is quite advanced for my level (they appear to use nested Stata sub-routines they can call repeatedly ("program define...")) and I find it hard to back out what they are doing from code alone while the underlying data is confidential so I can't reverse-engineer it easily.

    I am using Stata/MP, version 14.2


    Thank you for your time, Ellen


  • #2
    First, ivreg2 is from SSC (you are asked to explain where community-contributed commands come from). Second, a data example increases your chances of obtaining timely and helpful replies. As long as you are dealing with the same regression but different samples, joint estimation is always possible. This enables you to bypass suest's nonstandard VCE restriction. I will illustrate how you can do this using ivregress, which handles factor variables, but the procedure works for ivreg2 as well.

    Code:
    webuse hsng2
    *create 2 samples
    gen group= cond(_n<26, 1, 2)
    
    *separate regressions
    ivregress 2sls rent pcturban (hsngval = faminc i.region) if group==1, cluster(division)
    ivregress 2sls rent pcturban (hsngval = faminc i.region) if group==2, cluster(division)
    
    *Joint regression (interact all variables with the group variable). Note: with 2 constant terms, we create our own constant and suppress the default one
    
    gen cons=1
    ivregress 2sls rent c.pcturban#i.group c.cons#i.group (c.hsngval#i.group = (c.faminc i.region)#i.group), nocons cluster(division)
    
    *Now compare coefficients across groups, e.g.,
    test 1.group#c.hsngval=  2.group#c.hsngval

    Results:

    Code:
    . ivregress 2sls rent pcturban (hsngval = faminc i.region) if group==1, cluster(division)
    
    Instrumental variables (2SLS) regression          Number of obs   =         25
                                                      Wald chi2(2)    =     190.16
                                                      Prob > chi2     =     0.0000
                                                      R-squared       =     0.5444
                                                      Root MSE        =     27.437
    
                                   (Std. Err. adjusted for 8 clusters in division)
    ------------------------------------------------------------------------------
                 |               Robust
            rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         hsngval |   .0023382    .000492     4.75   0.000      .001374    .0033024
        pcturban |  -.3853024   1.092628    -0.35   0.724    -2.526813    1.756208
           _cons |   147.9727   50.88684     2.91   0.004     48.23637    247.7091
    ------------------------------------------------------------------------------
    Instrumented:  hsngval
    Instruments:   pcturban faminc 2.region 3.region 4.region
    
    . 
    . ivregress 2sls rent pcturban (hsngval = faminc i.region) if group==2, cluster(division)
    
    Instrumental variables (2SLS) regression          Number of obs   =         25
                                                      Wald chi2(2)    =     179.25
                                                      Prob > chi2     =     0.0000
                                                      R-squared       =     0.7033
                                                      Root MSE        =     15.006
    
                                   (Std. Err. adjusted for 9 clusters in division)
    ------------------------------------------------------------------------------
                 |               Robust
            rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         hsngval |   .0018066   .0003457     5.23   0.000      .001129    .0024842
        pcturban |    .487531    .350634     1.39   0.164    -.1996991    1.174761
           _cons |   114.5466   13.02833     8.79   0.000     89.01149    140.0816
    ------------------------------------------------------------------------------
    Instrumented:  hsngval
    Instruments:   pcturban faminc 2.region 3.region 4.region
    
    
    . 
    . ivregress 2sls rent c.pcturban#i.group c.cons#i.group (c.hsngval#i.group = (c.faminc i.region)#i.group), nocons cluster(divi
    > sion)
    note: 4.region#2.group dropped due to collinearity
    
    Instrumental variables (2SLS) regression          Number of obs   =         50
                                                      Wald chi2(6)    =          .
                                                      Prob > chi2     =          .
                                                      R-squared       =          .
                                                      Root MSE        =     22.113
    
                                       (Std. Err. adjusted for 9 clusters in division)
    ----------------------------------------------------------------------------------
                     |               Robust
                rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
     group#c.hsngval |
                  1  |   .0023382    .000492     4.75   0.000      .001374    .0033024
                  2  |   .0018066   .0003457     5.23   0.000      .001129    .0024842
                     |
    group#c.pcturban |
                  1  |  -.3853024   1.092628    -0.35   0.724    -2.526813    1.756208
                  2  |    .487531    .350634     1.39   0.164    -.1996991    1.174761
                     |
        group#c.cons |
                  1  |   147.9727   50.88684     2.91   0.004     48.23637    247.7091
                  2  |   114.5466   13.02833     8.79   0.000     89.01149    140.0816
    ----------------------------------------------------------------------------------
    Instrumented:  1b.group#c.hsngval 2.group#c.hsngval
    Instruments:   1b.group#c.pcturban 2.group#c.pcturban 1b.group#c.cons
                   2.group#c.cons 1b.group#c.faminc 2.group#c.faminc
                   1b.region#2.group 2.region#1b.group 2.region#2.group
                   3.region#1b.group 3.region#2.group 4.region#1b.group
    
    . 
    . test 1.group#c.hsngval=  2.group#c.hsngval
    
     ( 1)  1b.group#c.hsngval - 2.group#c.hsngval = 0
    
               chi2(  1) =    0.75
             Prob > chi2 =    0.3864
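
    For the setup in #1, a minimal sketch of the same joint regression with ivreg2 might look like the code below. It assumes the variable names from #1 (outcome, treatment, instrument, control1, control2, small, triangle, clusterid) and a version of ivreg2 that accepts factor-variable notation. The variable large and the *_s and *_l interaction variables are names made up for illustration. Every exogenous regressor, including the year and province dummies, is interacted with small so that the two treatment coefficients correspond to those from the two separate regressions.

    Code:
    * hypothetical helper variables; small is 1/0 and missing otherwise
    gen byte large = 1 - small
    foreach v of varlist treatment instrument {
        gen `v'_s = `v'*small
        gen `v'_l = `v'*large
    }
    
    * joint regression: group-specific treatment effects, fully interacted controls
    ivreg2 outcome (treatment_s treatment_l = instrument_s instrument_l) ///
        i.small##(c.control1 c.control2 i.year i.province)              ///
        if !missing(small) [pw=triangle], cluster(clusterid)
    
    * Wald test of equal treatment coefficients across the two samples
    test treatment_s = treatment_l

    If the installed ivreg2 does not accept factor variables, the same interactions could be built by hand or with the xi: prefix used in #1.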



    • #3
      Thank you Andrew,

      Please allow me to test my understanding of the joint regression:

      We don't include the main effects, so this is not the equivalent of the pooled sub-sample analysis via interaction terms. Thus, we can't interpret either interaction term of hsngval with group as the magnitude and significance of the difference across groups.

      Rather, we are running the equivalent of the seemingly unrelated regressions of suest: jointly estimating the two models separately. And thus we need to run the separate Chow test on the two coefficients.

      Is that correct?

      And this test is equivalent to running the pooled regression with an interaction term instead, taking group 1 as the reference/excluded group? In the example you gave, if I run:

      Code:
      webuse hsng2
      gen group = cond(_n<26, 1, 2)
      gen cons = 1
      
      ivregress 2sls rent pcturban group c.pcturban#i.group c.cons#i.group (hsngval c.hsngval#i.group = faminc i.region (c.faminc i.region)#i.group), nocons cluster(division)
      note: 2.group#c.cons omitted because of collinearity
      note: 4.region#2.group dropped due to collinearity
      
      Instrumental variables (2SLS) regression          Number of obs   =         50
                                                        Wald chi2(6)    =          .
                                                        Prob > chi2     =          .
                                                        R-squared       =          .
                                                        Root MSE        =     22.113
      
                                         (Std. Err. adjusted for 9 clusters in division)
      ----------------------------------------------------------------------------------
                       |               Robust
                  rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -----------------+----------------------------------------------------------------
               hsngval |   .0023382    .000492     4.75   0.000      .001374    .0033024
                       |
       group#c.hsngval |
                    2  |  -.0005316   .0006137    -0.87   0.386    -.0017344    .0006713
                       |
              pcturban |  -.3853024   1.092628    -0.35   0.724    -2.526813    1.756208
                 group |   57.27328   6.514167     8.79   0.000     44.50575    70.04081
                       |
      group#c.pcturban |
                    2  |   .8728334     1.1577     0.75   0.451    -1.396216    3.141883
                       |
          group#c.cons |
                    1  |   90.69947   51.08289     1.78   0.076    -9.421159    190.8201
                    2  |          0  (omitted)
      ----------------------------------------------------------------------------------
      Instrumented:  hsngval 2.group#c.hsngval
      Instruments:   pcturban group 2.group#c.pcturban 1b.group#c.cons faminc 2.region
                     3.region 4.region 2.group#c.faminc 1b.region#2.group
                     2.region#2.group 3.region#2.group 4.region#2.group
      The interaction is indeed insignificant, telling us the same story as the Chow test, but is this general?

      Thanks!
      Ellen



      • #4
        "Rather, we are running the equivalent of the seemingly unrelated regressions of suest: jointly estimating the two models separately. And thus we need to run the separate Chow test on the two coefficients. Is that correct?"

        Yes, that is correct.

        "And this test is equivalent to running the pooled regression with an interaction term instead, taking group 1 as the reference/excluded group? In the example you gave, if I run:"

        Your idea is spot on, but there is a minor glitch in your implementation. We do not need a separate constant in the pooled regression with interactions. The p-values of the interaction term and the Wald test must be identical. Here is the corrected version, which yields a p-value of 0.386.

        Code:
        . webuse hsng2
        (1980 Census housing data)
        
        . gen group = cond(_n<26, 1, 2)
        
        . gen cons = 1
        
        . ivregress 2sls rent pcturban group c.pcturban#i.group (hsngval c.hsngval#i.group = faminc i.region (c.fa
        > minc i.region)#i.group), cluster(division)
        note: 4.region#2.group dropped due to collinearity
        
        Instrumental variables (2SLS) regression          Number of obs   =         50
                                                          Wald chi2(5)    =    1050.04
                                                          Prob > chi2     =     0.0000
                                                          R-squared       =     0.6008
                                                          Root MSE        =     22.113
        
                                           (Std. Err. adjusted for 9 clusters in division)
        ----------------------------------------------------------------------------------
                         |               Robust
                    rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -----------------+----------------------------------------------------------------
                 hsngval |   .0023382    .000492     4.75   0.000      .001374    .0033024
                         |
         group#c.hsngval |
                      2  |  -.0005316   .0006137    -0.87   0.386    -.0017344    .0006713
                         |
                pcturban |  -.3853024   1.092628    -0.35   0.724    -2.526813    1.756208
                   group |  -33.42619   52.09915    -0.64   0.521    -135.5387    68.68627
                         |
        group#c.pcturban |
                      2  |   .8728334     1.1577     0.75   0.451    -1.396216    3.141883
                         |
                   _cons |   181.3989   102.1658     1.78   0.076    -18.84232    381.6402
        ----------------------------------------------------------------------------------
        Instrumented:  hsngval 2.group#c.hsngval
        Instruments:   pcturban group 2.group#c.pcturban faminc 2.region 3.region
                       4.region 2.group#c.faminc 1b.region#2.group 2.region#2.group
                       3.region#2.group
        
        .
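
        The equivalence can also be checked directly from the joint regression in #2. As a small sketch using the same hsng2 setup (nothing new is assumed beyond the clear option), lincom reports the difference between the two group-specific coefficients together with its standard error. From the posted output, .0018066 - .0023382 = -.0005316, which is the interaction coefficient in the pooled regression, and the square of its z statistic (-0.87) agrees, up to rounding, with the chi2(1) of 0.75 from test.

        Code:
        webuse hsng2, clear
        gen group = cond(_n<26, 1, 2)
        gen cons = 1
        
        * joint regression from #2
        ivregress 2sls rent c.pcturban#i.group c.cons#i.group (c.hsngval#i.group = (c.faminc i.region)#i.group), nocons cluster(division)
        
        * difference in the hsngval coefficient, group 2 minus group 1
        lincom 2.group#c.hsngval - 1.group#c.hsngval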



        • #5
          I understand now. Thank you for a very helpful and clear explanation Andrew. Ellen



          • #6
            Dear Andrew Musau,

            I am trying to solve the same issue, but instead of a continuous endogenous variable, I have a dummy variable that I want to instrument, i.e., in the example you discussed here my "hsngval" variable is binary (0 and 1). When I run the code you suggested, the results show that (1.hsngval#2.group) and (1.hsngval#1.group) are omitted because of collinearity, and I have a coefficient only for (0.hsngval#1.group). Can I interpret the coefficient as the difference between small and large firms, and the significance of this coefficient as the significance of the difference between small and large firms?

            Thank you very much in advance!

            Yuliya








            • #7
              For a 0/1 dummy, just treat the variable as continuous, since you cannot have both categories in the regression at the same time. Including both will just result in one category being collinear with the intercepts in the joint regression. Below, notice that it makes no difference whether I treat the variable as categorical with base 0 or as continuous.

              Code:
              webuse hsng2
              *create 2 samples
              gen group= cond(_n<26, 1, 2)
              
              qui sum hsngval, d
              gen hihsngval= hsngval>`r(p50)'
              
              *separate regressions (categorical with base 0)
              ivregress 2sls rent pcturban (ib0.hihsngval = faminc i.region) if group==1, cluster(division)
              ivregress 2sls rent pcturban (ib0.hihsngval = faminc i.region) if group==2, cluster(division)
              
              *separate regressions (continuous)
              ivregress 2sls rent pcturban (c.hihsngval = faminc i.region) if group==1, cluster(division)
              ivregress 2sls rent pcturban (c.hihsngval = faminc i.region) if group==2, cluster(division)
              Res.:

              Code:
              . *separate regressions (categorical with base 0)
              
              . ivregress 2sls rent pcturban (ib0.hihsngval = faminc i.region) if group==1, cluster(division)
              
              Instrumental variables (2SLS) regression          Number of obs   =         25
                                                                Wald chi2(2)    =      28.14
                                                                Prob > chi2     =     0.0000
                                                                R-squared       =     0.1893
                                                                Root MSE        =     36.597
              
                                             (Std. Err. adjusted for 8 clusters in division)
              ------------------------------------------------------------------------------
                           |               Robust
                      rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
               1.hihsngval |    102.738   56.46556     1.82   0.069    -7.932433    213.4085
                  pcturban |  -.7900045   1.561238    -0.51   0.613    -3.849974    2.269965
                     _cons |   244.3666   88.91957     2.75   0.006      70.0874    418.6457
              ------------------------------------------------------------------------------
              Instrumented:  1.hihsngval
              Instruments:   pcturban faminc 2.region 3.region 4.region
              
              . ivregress 2sls rent pcturban (ib0.hihsngval = faminc i.region) if group==2, cluster(division)
              
              Instrumental variables (2SLS) regression          Number of obs   =         25
                                                                Wald chi2(2)    =     204.79
                                                                Prob > chi2     =     0.0000
                                                                R-squared       =     0.4365
                                                                Root MSE        =     20.679
              
                                             (Std. Err. adjusted for 9 clusters in division)
              ------------------------------------------------------------------------------
                           |               Robust
                      rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
               1.hihsngval |   39.42183   11.22963     3.51   0.000     17.41216     61.4315
                  pcturban |   .4301993   .5332451     0.81   0.420    -.6149419     1.47534
                     _cons |   181.9472   29.65598     6.14   0.000     123.8225    240.0718
              ------------------------------------------------------------------------------
              Instrumented:  1.hihsngval
              Instruments:   pcturban faminc 2.region 3.region 4.region
              
              .
              . *separate regressions (continuous)
              
              . ivregress 2sls rent pcturban (c.hihsngval = faminc i.region) if group==1, cluster(division)
              
              Instrumental variables (2SLS) regression          Number of obs   =         25
                                                                Wald chi2(2)    =      28.14
                                                                Prob > chi2     =     0.0000
                                                                R-squared       =     0.1893
                                                                Root MSE        =     36.597
              
                                             (Std. Err. adjusted for 8 clusters in division)
              ------------------------------------------------------------------------------
                           |               Robust
                      rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                 hihsngval |    102.738   56.46556     1.82   0.069    -7.932433    213.4085
                  pcturban |  -.7900045   1.561238    -0.51   0.613    -3.849974    2.269965
                     _cons |   244.3666   88.91957     2.75   0.006      70.0874    418.6457
              ------------------------------------------------------------------------------
              Instrumented:  hihsngval
              Instruments:   pcturban faminc 2.region 3.region 4.region
              
              . ivregress 2sls rent pcturban (c.hihsngval = faminc i.region) if group==2, cluster(division)
              
              Instrumental variables (2SLS) regression          Number of obs   =         25
                                                                Wald chi2(2)    =     204.79
                                                                Prob > chi2     =     0.0000
                                                                R-squared       =     0.4365
                                                                Root MSE        =     20.679
              
                                             (Std. Err. adjusted for 9 clusters in division)
              ------------------------------------------------------------------------------
                           |               Robust
                      rent |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                 hihsngval |   39.42183   11.22963     3.51   0.000     17.41216     61.4315
                  pcturban |   .4301993   .5332451     0.81   0.420    -.6149419     1.47534
                     _cons |   181.9472   29.65598     6.14   0.000     123.8225    240.0718
              ------------------------------------------------------------------------------
              Instrumented:  hihsngval
              Instruments:   pcturban faminc 2.region 3.region 4.region
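
              The joint regression and test from #2 carry over once the dummy is entered with the c. prefix. A minimal sketch, continuing from the code above (cons is the hand-made constant used in #2):

              Code:
              * hand-made constant, interacted with group in place of the default intercept
              gen cons = 1
              
              * joint regression with the binary endogenous variable treated as continuous
              ivregress 2sls rent c.pcturban#i.group c.cons#i.group (c.hihsngval#i.group = (c.faminc i.region)#i.group), nocons cluster(division)
              
              * compare the coefficient on the dummy across groups
              test 1.group#c.hihsngval = 2.group#c.hihsngval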
              Last edited by Andrew Musau; 11 Jun 2020, 04:57.



              • #8
                Dear Andrew Musau, thank you so much for your help!



                • #9
                  Andrew Musau
                  Hello! I have read many of your posts, but my issue has not been solved, hence I am posting it here.
                  I am using the user-written command cmp, and I wish to conduct a Hausman test to see whether random or fixed effects is better for my study.
                  However, on using hausman I got an error saying it can't be used with pweighted data, which I guess cmp uses.
                  Then I decided to go for suest.
                  However, whenever I run suest, I get the error "was estimated with a nonstandard vce (robust)".
                  I wonder why I get this error when I didn't specify anything of the sort.
                  What is the solution?
                  Thanks in advance



                  • #10
                    Are these linear fixed and random effects models? Usually, with a robust VCE, you want to use a test of overidentifying restrictions (implemented by xtoverid from SSC) to choose between random and fixed effects. However, because this will not work with cmp (SSC), you can implement it by means of an artificial regression, as illustrated in the link below. Just adapt the procedure to cmp.

                    https://www.statalist.org/forums/for...scoll-kraay-se

                    Otherwise, the author of cmp, David Roodman, may have other suggestions.



                    • #11
                      Andrew Musau Thanks! Can you provide me with a reference for the procedure?



                      • #12
                        You will find references at

                        Code:
                        *ssc install xtoverid
                        help xtoverid



                        • #13
                          @Andrew Musau: can I also use this, when I have the same control, but two different independent variables?
                          Equation 1: ivreg2 outcome1 control1 control2 i.time i.region (control3 = instrument), partial (i.time i.region) cluster(region)
                          Equation 2: ivreg2 outcome1 control1 control2 i.time i.region (control3 = instrument), partial (i.time i.region) cluster(region)



                          • #14
                            Your Eq. 1 and Eq. 2 are identical, as far as I can see. From an estimation perspective, there is no difference in the treatment of control variables and independent variables. These are only terms applicable to your research question, so anything that applies to independent variables automatically applies to control variables. That is, unless you are confusing the terms.



                            • #15
                              Hi Andrew, I'm trying to use this procedure for panel data & year fixed effects. Would I have to make an adjustment for the year fixed effects (i.e. i.group#i.year) or leave the year fixed effects as is? (i.e. i.year)

