  • Instrumental variable Probit regression using the cmp command

    Hi everyone,

    I want to investigate the effect that Financial Knowledge (FK, measured as the number of correct answers to six knowledge questions; ranges from 0 to 6) has on a binary variable describing whether or not a person owns a retirement fund (RET: 0 if the person does not own a retirement fund; 1 if the person does).

    First, I estimated a standard probit regression and calculated the marginal effects at the mean:

    Code:
    * CV1-CV11 are the 11 control variables
    probit RET FK CV1-CV11
    margins, dydx(FK) atmeans
    I get a marginal effect of ~3% and a z-value of 15, which seems plausible to me.

    Next, I want to run an instrumental variable probit regression and compare the results with those above. I use the answer to the question "Were you ever required to take financial education?" (FE) as an instrument for FK and again estimate marginal effects at the mean:

    Code:
    * cmp setup defines the $cmp_probit, $cmp_cont, ... indicator macros used below
    cmp setup
    cmp (RET = FK CV1-CV11) (FK = CV1-CV11 FE), ind($cmp_probit $cmp_cont)
    margins, dydx(FK) atmeans
    Now I get a marginal effect of ~55% and a z-value of 10. I must have messed something up with the cmp command, since this marginal effect would tell me that someone with FK=0 has a probability of owning a retirement fund that is more than 300 percentage points lower (6 × 55 = 330) than that of someone with FK=6, which is impossible, right?

    I am very much looking forward to your answers. Thanks in advance!

    Best regards,

    Vincent

  • #2
    Since your margins commands do not specify the statistic for which margins should compute marginal effects, I believe margins uses whatever predict takes as its default. After probit, predict defaults to the predicted probability of a positive outcome ("pr"). Because cmp is a much more general command, its predict defaults to the linear prediction ("xb"). So after cmp I think you need to run
    Code:
    margins, dydx(FK) atmeans predict(pr)
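
    To see why the two defaults can give such different numbers, here is a minimal sketch of the arithmetic, written in Python rather than Stata. It is not cmp's code. The coefficient 0.55 assumes the reported ~55% was really the FK coefficient on the xb scale, and the index value at the means is made up; the function names are hypothetical too.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def effect_on_linear_index(beta: float) -> float:
    """Derivative of xb with respect to x: just the coefficient."""
    return beta


def effect_on_probability(beta: float, xb: float) -> float:
    """Derivative of Pr(y=1) = Phi(xb) with respect to x: phi(xb) * beta."""
    return STD_NORMAL.pdf(xb) * beta


# Hypothetical values for illustration only (not estimates from the thread's data).
beta_fk = 0.55       # coefficient on FK in the probit equation
xb_at_means = -0.5   # linear index evaluated at the covariate means

on_xb = effect_on_linear_index(beta_fk)
on_pr = effect_on_probability(beta_fk, xb_at_means)

print(f"marginal effect on xb: {on_xb:.3f}")
print(f"marginal effect on pr: {on_pr:.3f}")
```

    On the probability scale the effect can never exceed about 0.4 times the coefficient, because the standard normal density peaks at roughly 0.399.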

    • #3
      Thank you so much for the response, David. It was really helpful!

      • #4
        I want to estimate a switch_probit model with the cmp command, but I can't figure out how. Could you give me some pointers?


        • #5
          Hi, I am running a similar regression: a probit model of CHI, "whether to purchase commercial health insurance" (0/1), on "self-rated health status" (0, 1, 2, 3, 4, 5). I use the family members' average health status (a continuous variable) as an instrument, so "self-rated health status" is regressed on the IV and the other control variables in the first stage. I first tried ivprobit, but the Stata help says it only allows continuous endogenous explanatory variables, and ours is categorical. So I am now using cmp with ind($cmp_oprobit $cmp_probit), combining an ordered probit first stage with a probit second stage. But my marginal effect is blank. How can I deal with this problem?
          Also, I noticed the warning "regressor matrix for _cmp_y1 equation appears ill-conditioned". In other posts the answer was "If you end up with a fitting full model, you can ignore it", but I am still curious about it.
          Code:
          . cmp(HealthLevel = ChildHealthLevel Age Age_2 Gender Location MiddleSchoolEdu HighSchoolEdu CollegeEdu MaritalStatus log_HouseholdFinancialSituation) (CHI=HealthLevel Age Age_2 Gender Location MiddleSchoolEdu HighSchoolEdu CollegeEdu MaritalStatus log_HouseholdFinancialSituation), ind($cmp_oprobit $cmp_probit)
          
          Fitting individual models as starting point for full model fit.
          Note: For programming reasons, these initial estimates may deviate from your specification.
                For exact fits of each equation alone, run cmp separately on each.
          
          Iteration 0:   log likelihood = -20200.392  
          Iteration 1:   log likelihood = -19335.188  
          Iteration 2:   log likelihood = -19333.825  
          Iteration 3:   log likelihood = -19333.825  
          
          Ordered probit regression                              Number of obs =  14,900
                                                                 LR chi2(10)   = 1733.13
                                                                 Prob > chi2   =  0.0000
          Log likelihood = -19333.825                            Pseudo R2     =  0.0429
          
          -----------------------------------------------------------------------------------------------
                                _cmp_y1 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          ------------------------------+----------------------------------------------------------------
                       ChildHealthLevel |   .4022039   .0124549    32.29   0.000     .3777927    .4266152
                                    Age |  -.0438822   .0146328    -3.00   0.003     -.072562   -.0152024
                                  Age_2 |   .0002635   .0001173     2.25   0.025     .0000335    .0004935
                                 Gender |  -.1424186   .0182865    -7.79   0.000    -.1782595   -.1065777
                               Location |  -.0992587   .0223739    -4.44   0.000    -.1431107   -.0554067
                        MiddleSchoolEdu |   .0967272   .0225997     4.28   0.000     .0524326    .1410218
                          HighSchoolEdu |   .0710423   .0326479     2.18   0.030     .0070535    .1350311
                             CollegeEdu |   .0982171   .0689609     1.42   0.154    -.0369438    .2333781
                          MaritalStatus |   .0426419   .0298147     1.43   0.153    -.0157939    .1010776
          log_HouseholdFinancialSitua~n |  -.0038698   .0096938    -0.40   0.690    -.0228694    .0151298
          ------------------------------+----------------------------------------------------------------
                                  /cut1 |  -1.938607   .4690852                     -2.857997   -1.019217
                                  /cut2 |  -.9735017   .4688245                     -1.892381   -.0546225
                                  /cut3 |   .4483023   .4688094                     -.4705472    1.367152
                                  /cut4 |   .9706602   .4689021                      .0516288    1.889691
          -----------------------------------------------------------------------------------------------
          
          Warning: regressor matrix for _cmp_y1 equation appears ill-conditioned. (Condition number = 9158.5296.)
          This might prevent convergence. If it does, and if you have not done so already, you may need to remove nearly
          collinear regressors to achieve convergence. Or you may need to add a nrtolerance(#) or nonrtolerance option to the command line.
          See cmp tips.
          
          Iteration 0:   log likelihood = -2237.4403  
          Iteration 1:   log likelihood = -2024.4443  
          Iteration 2:   log likelihood = -2003.2967  
          Iteration 3:   log likelihood = -2002.7664  
          Iteration 4:   log likelihood = -2002.7663  
          
          Probit regression                                       Number of obs = 15,048
                                                                  LR chi2(10)   = 469.35
                                                                  Prob > chi2   = 0.0000
          Log likelihood = -2002.7663                             Pseudo R2     = 0.1049
          
          -----------------------------------------------------------------------------------------------
                                    CHI | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          ------------------------------+----------------------------------------------------------------
                            HealthLevel |   .0818536   .0209727     3.90   0.000      .040748    .1229593
                                    Age |   .0608114   .0434706     1.40   0.162    -.0243894    .1460122
                                  Age_2 |  -.0008223   .0003667    -2.24   0.025    -.0015411   -.0001035
                                 Gender |   .0602502   .0433326     1.39   0.164      -.02468    .1451805
                               Location |  -.1696783   .0491574    -3.45   0.001     -.266025   -.0733316
                        MiddleSchoolEdu |   .1562291   .0511454     3.05   0.002      .055986    .2564722
                          HighSchoolEdu |   .1618143    .063908     2.53   0.011      .036557    .2870716
                             CollegeEdu |   .2059546   .1105907     1.86   0.063    -.0107992    .4227085
                          MaritalStatus |  -.1201858    .078989    -1.52   0.128    -.2750014    .0346297
          log_HouseholdFinancialSitua~n |   .1450394   .0242412     5.98   0.000     .0975275    .1925513
                                  _cons |  -4.044731   1.310128    -3.09   0.002    -6.612534   -1.476928
          -----------------------------------------------------------------------------------------------
          
          Warning: regressor matrix for CHI equation appears ill-conditioned. (Condition number = 8315.4475.)
          This might prevent convergence. If it does, and if you have not done so already, you may need to remove nearly
          collinear regressors to achieve convergence. Or you may need to add a nrtolerance(#) or nonrtolerance option to the command line.
          See cmp tips.
          
          Fitting constant-only model for LR test of overall model fit.
          
          Fitting full model.
          
          Iteration 0:   log likelihood = -21336.457  
          Iteration 1:   log likelihood = -21334.505  
          Iteration 2:   log likelihood = -21334.098  
          Iteration 3:   log likelihood = -21334.098  
          
          Mixed-process regression                               Number of obs =  15,048
                                                                 LR chi2(20)   = 2144.04
          Log likelihood = -21334.098                            Prob > chi2   =  0.0000
          
          -----------------------------------------------------------------------------------------------
                                        | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          ------------------------------+----------------------------------------------------------------
          HealthLevel                   |
                       ChildHealthLevel |   .4026554   .0124415    32.36   0.000     .3782706    .4270402
                                    Age |  -.0439128   .0146325    -3.00   0.003     -.072592   -.0152337
                                  Age_2 |   .0002638   .0001173     2.25   0.025     .0000338    .0004937
                                 Gender |  -.1425752   .0182859    -7.80   0.000    -.1784149   -.1067356
                               Location |  -.0993239   .0223732    -4.44   0.000    -.1431746   -.0554731
                        MiddleSchoolEdu |   .0966809   .0225989     4.28   0.000     .0523879    .1409739
                          HighSchoolEdu |   .0710038    .032648     2.17   0.030     .0070149    .1349928
                             CollegeEdu |   .0977076   .0689444     1.42   0.156    -.0374209     .232836
                          MaritalStatus |   .0424211   .0298114     1.42   0.155    -.0160082    .1008504
          log_HouseholdFinancialSitua~n |  -.0038927   .0096934    -0.40   0.688    -.0228913     .015106
          ------------------------------+----------------------------------------------------------------
          CHI                           |
                            HealthLevel |    .225625   .0658467     3.43   0.001     .0965679     .354682
                                    Age |   .0662264   .0430229     1.54   0.124    -.0180968    .1505497
                                  Age_2 |  -.0008426   .0003625    -2.32   0.020     -.001553   -.0001321
                                 Gender |   .0798693   .0437505     1.83   0.068      -.00588    .1656187
                               Location |   -.151449    .049477    -3.06   0.002     -.248422   -.0544759
                        MiddleSchoolEdu |   .1343361   .0516982     2.60   0.009     .0330094    .2356627
                          HighSchoolEdu |    .148445   .0636813     2.33   0.020     .0236319    .2732582
                             CollegeEdu |   .1842894   .1102341     1.67   0.095    -.0317654    .4003442
                          MaritalStatus |  -.1280444   .0781895    -1.64   0.102     -.281293    .0252041
          log_HouseholdFinancialSitua~n |   .1411404    .024095     5.86   0.000      .093915    .1883659
                                  _cons |   -4.69991   1.322697    -3.55   0.000    -7.292349   -2.107472
          ------------------------------+----------------------------------------------------------------
                               /cut_1_1 |  -1.938199   .4690687    -4.13   0.000    -2.857557   -1.018841
                               /cut_1_2 |  -.9734641   .4688088    -2.08   0.038    -1.892312   -.0546158
                               /cut_1_3 |   .4482362   .4687943     0.96   0.339    -.4705838    1.367056
                               /cut_1_4 |   .9708707   .4688867     2.07   0.038     .0518697    1.889872
                           /atanhrho_12 |  -.1701613   .0759563    -2.24   0.025    -.3190328   -.0212897
          ------------------------------+----------------------------------------------------------------
                                 rho_12 |  -.1685377   .0737987                     -.3086321   -.0212865
          -----------------------------------------------------------------------------------------------
          
          . margins, dydx(HealthLevel) atmeans
          
          Conditional marginal effects                            Number of obs = 14,900
          Model VCE: OIM
          
          Expression: Linear prediction, predict()
          dy/dx wrt:  HealthLevel
          At: ChildHealthLevel                = 4.023078 (mean)
              Age                             = 61.46597 (mean)
              Age_2                           = 3852.233 (mean)
              Gender                          = .5267114 (mean)
              Location                        = 1.749463 (mean)
              MiddleSchoolEdu                 = .3622148 (mean)
              HighSchoolEdu                   = .1277852 (mean)
              CollegeEdu                      = .0192617 (mean)
              MaritalStatus                   = .8943624 (mean)
              log_HouseholdFinancialSituation = 10.28438 (mean)
              HealthLevel                     = 3.061141 (mean)
          
          ------------------------------------------------------------------------------
                       |            Delta-method
                       |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
           HealthLevel |          0  (omitted)
          ------------------------------------------------------------------------------
