  • Instrumental variable (testing for relevance and exogeneity)

    Hi, I am testing possible instruments for total expenditures (totex). Based on the first-stage regression, the instruments employed_pay and employed_prof are both significant, which is good. The other diagnostics also look good: the underidentification test rejects (p-value < .05), the Cragg-Donald statistic is above 20, and the Sargan test does not reject (p-value > .05).

    I would like to know whether the test of endogenous regressors (the endog option) tests the endogeneity of totex or the endogeneity of the instruments. And what is the desired outcome: rejecting H0?

    Many thanks!
    -----------------------------------------------------------------------------------------

    . ivreg2 ln_q_electricity tariff hhtype tenure urb tv_qty ref_qty aircon_qty (totex=employed_pay employed_prof), endog (totex) first

    First-stage regressions
    -----------------------

    First-stage regression of totex:

    Statistics consistent for homoskedasticity only
    Number of obs = 7019
    -------------------------------------------------------------------------------
            totex |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
     employed_pay |   26106.76   1539.329    16.96   0.000     23089.21    29124.31
    employed_prof |   16038.71    2251.38     7.12   0.000     11625.33     20452.1
           tariff |   15781.52   1042.326    15.14   0.000     13738.25     17824.8
           hhtype |   10057.52   3520.049     2.86   0.004     3157.163    16957.88
           tenure |  -7696.789   1190.623    -6.46   0.000    -10030.77   -5362.809
              urb |  -47798.76   3503.213   -13.64   0.000    -54666.12   -40931.41
           tv_qty |   57219.38   3139.981    18.22   0.000     51064.06    63374.69
          ref_qty |   74383.39   3497.787    21.27   0.000     67526.67    81240.11
       aircon_qty |   152554.1   3444.159    44.29   0.000     145802.5    159305.7
            _cons |  -30398.55   12892.06    -2.36   0.018    -55670.89   -5126.215
    -------------------------------------------------------------------------------
    F test of excluded instruments:
    F( 2, 7009) = 154.37
    Prob > F = 0.0000
    Sanderson-Windmeijer multivariate F test of excluded instruments:
    F( 2, 7009) = 154.37
    Prob > F = 0.0000



    Summary results for first-stage regressions
    -------------------------------------------

                                       (Underid)               (Weak id)
    Variable | F( 2, 7009)   P-val | SW Chi-sq( 2)   P-val | SW F( 2, 7009)
    totex    |      154.37  0.0000 |        309.17  0.0000 |         154.37

    Stock-Yogo weak ID F test critical values for single endogenous regressor:
    10% maximal IV size 19.93
    15% maximal IV size 11.59
    20% maximal IV size 8.75
    25% maximal IV size 7.25
    Source: Stock-Yogo (2005). Reproduced by permission.
    NB: Critical values are for Sanderson-Windmeijer F statistic.

    Underidentification test
    Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
    Ha: matrix has rank=K1 (identified)
    Anderson canon. corr. LM statistic Chi-sq(2)=296.13 P-val=0.0000

    Weak identification test
    Ho: equation is weakly identified
    Cragg-Donald Wald F statistic 154.37

    Stock-Yogo weak ID test critical values for K1=1 and L1=2:
    10% maximal IV size 19.93
    15% maximal IV size 11.59
    20% maximal IV size 8.75
    25% maximal IV size 7.25
    Source: Stock-Yogo (2005). Reproduced by permission.

    Weak-instrument-robust inference
    Tests of joint significance of endogenous regressors B1 in main equation
    Ho: B1=0 and orthogonality conditions are valid
    Anderson-Rubin Wald test F(2,7009)= 3.60 P-val=0.0272
    Anderson-Rubin Wald test Chi-sq(2)= 7.22 P-val=0.0271
    Stock-Wright LM S statistic Chi-sq(2)= 7.21 P-val=0.0272

    Number of observations N = 7019
    Number of regressors K = 9
    Number of endogenous regressors K1 = 1
    Number of instruments L = 10
    Number of excluded instruments L1 = 2

    IV (2SLS) estimation
    --------------------

    Estimates efficient for homoskedasticity only
    Statistics consistent for homoskedasticity only

    Number of obs = 7019
    F( 8, 7010) = 1372.63
    Prob > F = 0.0000
    Total (centered) SS = 9360.700607 Centered R2 = 0.6217
    Total (uncentered) SS = 303513.0491 Uncentered R2 = 0.9883
    Residual SS = 3541.182552 Root MSE = .7103

    ------------------------------------------------------------------------------
    ln_q_elect~y |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           totex |   7.17e-07   2.90e-07     2.47   0.013     1.49e-07    1.28e-06
          tariff |  -.0249502   .0068365    -3.65   0.000    -.0383495   -.0115509
          hhtype |   .1754629   .0187527     9.36   0.000     .1387084    .2122175
          tenure |    -.06008   .0063593    -9.45   0.000    -.0725439    -.047616
             urb |  -.4548277    .022543   -20.18   0.000    -.4990113   -.4106442
          tv_qty |   .3109113   .0239399    12.99   0.000       .26399    .3578326
         ref_qty |   .8009556    .027277    29.36   0.000     .7474935    .8544176
      aircon_qty |   .2123098   .0477579     4.45   0.000      .118706    .3059136
           _cons |   6.313296   .0643445    98.12   0.000     6.187183    6.439409
    ------------------------------------------------------------------------------
    Underidentification test (Anderson canon. corr. LM statistic): 296.129
    Chi-sq(2) P-val = 0.0000
    ------------------------------------------------------------------------------
    Weak identification test (Cragg-Donald Wald F statistic): 154.366
    Stock-Yogo weak ID test critical values: 10% maximal IV size 19.93
    15% maximal IV size 11.59
    20% maximal IV size 8.75
    25% maximal IV size 7.25
    Source: Stock-Yogo (2005). Reproduced by permission.
    ------------------------------------------------------------------------------
    Sargan statistic (overidentification test of all instruments): 1.654
    Chi-sq(1) P-val = 0.1984
    -endog- option:
    Endogeneity test of endogenous regressors: 12.701
    Chi-sq(1) P-val = 0.0004
    Regressors tested: totex
    ------------------------------------------------------------------------------
    Instrumented: totex
    Included instruments: tariff hhtype tenure urb tv_qty ref_qty aircon_qty
    Excluded instruments: employed_pay employed_prof
    ------------------------------------------------------------------------------

  • #2
    Heyho

    You're testing the null that the specified regressor (here totex) is exogenous. The test concerns the regressor, not the instruments.
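
    As a rough check on how to read the two tests in the posted output, the reported p-values can be reproduced from the chi-squared statistics with Stata's chi2tail() function. This is only a sketch: the statistics and degrees of freedom are copied from the output above, and the interpretation in the comments assumes a conventional 5% level.

    ```stata
    * Endogeneity test (endog option): H0 = totex can be treated as exogenous
    display chi2tail(1, 12.701)   // about 0.0004, so H0 is rejected at 5%

    * Sargan overidentification test: H0 = the instruments are valid
    display chi2tail(1, 1.654)    // about 0.198, so H0 is not rejected at 5%
    ```

    Read this way, rejecting H0 in the endogeneity test suggests that totex is endogenous, so IV rather than OLS is appropriate. The exogeneity of the instruments is what the Sargan test addresses, and there the hoped-for outcome is not rejecting.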


    • #3
      Many thanks, Fred! I have a follow-up question: if my endogenous variable is in log form, do I need to change the instruments to log form too?



      • #4
        No. It depends on your theoretical model.
        It is just like an ordinary regression: you can regress wages on education (level-level), log wages on education, wages on log education, log wages on log education, and so on.
        The same holds for the first stage. If the dependent variable (the suspected endogenous variable) is logged, it does not follow that the independent variables (the instruments) have to be logged.
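
        A minimal sketch of that setup, reusing the variables from the first post: the endogenous regressor is logged while the instruments stay in levels. The name ln_totex is an assumption introduced here for illustration, and the /// line continuation only works in a do-file, not in the interactive Command window.

        ```stata
        * Hypothetical variable name; ln() returns missing if totex <= 0
        generate ln_totex = ln(totex)

        * Instruments stay in levels; only the endogenous regressor is logged
        ivreg2 ln_q_electricity tariff hhtype tenure urb tv_qty ref_qty aircon_qty ///
            (ln_totex = employed_pay employed_prof), endog(ln_totex) first
        ```

        The first-stage output then reports a log-level regression of ln_totex on the instruments and the included exogenous variables.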



        • #5
          OK, correct me if I'm wrong: the first-stage regression checks the validity of the instruments, so it does not really matter if the instruments are not in log form.
          I have the equation below, where ttotex is total expenditure.

          ivreg2 ln_q_electricity tariff fsize hhtype1 bldg_type1 hgc1 hgc2 hgc3 tenure1 urb aircon_qty pc_qty ref_qty tv_qty cellphone_qty (ln_ttotex=wages employed_prof),endog (ln_ttotex) first



          • #6
            It does matter, but as I said, it depends on your underlying model. A model where y=b*x is not the same as a model where log(y)=b*x or log(y)=b*log(x). In your case the first stage is log(x)=b*wages+c*employed_prof+d*covars.
            Without looking into it too much, this (log-level) seems like a perfectly valid model. Is it more or less "valid" than log-log? That depends on the relation between the variables you want to estimate. But as I said, if the dependent variable is logged, it does not follow that the independent variables have to take any particular functional form, logged, squared, or otherwise.
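
            To make the two candidate first stages concrete, here is a rough sketch that runs each by hand and tests the excluded instruments jointly. It assumes ln_ttotex already exists as in the command above; the macro covars and the variable ln_wages are hypothetical names introduced for illustration, and the /// continuation requires a do-file.

            ```stata
            * Hypothetical shorthand for the included exogenous variables
            local covars tariff fsize hhtype1 bldg_type1 hgc1 hgc2 hgc3 tenure1 urb ///
                aircon_qty pc_qty ref_qty tv_qty cellphone_qty

            * Log-level first stage: wages enters in levels
            regress ln_ttotex wages employed_prof `covars'
            test wages employed_prof

            * Log-log variant: wages enters in logs (wages <= 0 become missing)
            generate ln_wages = ln(wages)
            regress ln_ttotex ln_wages employed_prof `covars'
            test ln_wages employed_prof
            ```

            Note that the log-log variant drops households with zero wages, so the two regressions may not use the same sample. The F statistics only speak to instrument strength; which functional form is right still depends on the underlying model.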



            • #7
              OK, thanks. The log-level form makes sense in the first equation, with wages and the number of employed professionals as instruments.
              We initially had level-level in both the first and second equations, but for some reason we switched to log(totex) in the second equation: log(electricity)=alpha+beta1*tariff+beta2*log(totex)+...



              • #8
                Many thanks Ariel!
