
  • Questions about a Poisson pseudo-maximum likelihood (PPML) regression table

    Hi guys,

    I am running a regression with the ppmlhdfe command; my command is:

    ppmlhdfe p tre_1 tre_2 tre_3 tre_4, noconstant absorb(j#hs_code_6 j#yrm_doc hs_code_6#yrm_doc) sep(none) itol(1e-5) tol(1e-5) cluster(case_id hts_code_8 )

    where
    p represents the import price,
    tre_1 is a dummy equal to 1 when a duty is imposed on a product for a given country,
    tre_2 is a dummy equal to 1 when a duty on a product for a given country is no longer in place,
    tre_3 is a dummy equal to 1 for the same product as in tre_1 but for countries other than the country in tre_1,
    tre_4 is a dummy equal to 1 for the same product as in tre_2 but for countries other than the country in tre_2.

    The fixed effects I absorb are: exporting country (j) # HS code at 6 digits (hs_code_6), exporting country (j) # time (yrm_doc), and hs_code_6 # time.
    The clusters I include are the product-country pair variable (case_id) and the HS code at 8 digits (hts_code_8).
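
    Just for context, here is a rough sketch (not part of my actual estimation) of the sanity checks one could run on these dummies before the regression, using the variable names from above:

    * check that the treatment dummies are really 0/1 and see how many observations fall in each
    tab1 tre_1 tre_2 tre_3 tre_4, missing

    * count observations where an "own-country" dummy and its "other-country" counterpart
    * are switched on at the same time (by construction they should refer to different countries)
    count if tre_1 == 1 & tre_3 == 1
    count if tre_2 == 1 & tre_4 == 1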

    But when I run the command, I notice several things in the output that I find hard to understand. Below is the regression table I get:



    Iteration 1: deviance = 3.1562e+13 eps = . iters = 14 tol = 1.0e-04 min(eta) = -7.58 PS
    Iteration 2: deviance = 1.8249e+13 eps = 7.30e-01 iters = 10 tol = 1.0e-04 min(eta) = -9.54 S
    Iteration 3: deviance = 1.5405e+13 eps = 1.85e-01 iters = 8 tol = 1.0e-04 min(eta) = -11.84 S
    Iteration 4: deviance = 1.5017e+13 eps = 2.58e-02 iters = 7 tol = 1.0e-04 min(eta) = -14.27 S
    Iteration 5: deviance = 1.4985e+13 eps = 2.09e-03 iters = 6 tol = 1.0e-04 min(eta) = -16.36 S
    Iteration 6: deviance = 1.4982e+13 eps = 2.14e-04 iters = 5 tol = 1.0e-04 min(eta) = -18.83 S
    Iteration 7: deviance = 1.4982e+13 eps = 4.37e-05 iters = 4 tol = 1.0e-04 min(eta) = -21.82 S
    Iteration 8: deviance = 1.4981e+13 eps = 1.08e-05 iters = 17 tol = 1.0e-05 min(eta) = -24.81 S
    Iteration 9: deviance = 1.4981e+13 eps = 2.88e-06 iters = 32 tol = 1.0e-06 min(eta) = -27.81 S O
    ------------------------------------------------------------------------------------------------------------
    (legend: p: exact partial-out s: exact solver h: step-halving o: epsilon below tolerance)
    Converged in 9 iterations and 103 HDFE sub-iterations (tol = 1.0e-05)
    Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.

    HDFE PPML regression No. of obs = 12704328
    Absorbing 3 HDFE groups Residual df = 582
    Statistics robust to heteroskedasticity Wald chi2(4) = 74.62
    Deviance = 1.49813e+13 Prob > chi2 = 0.0000
    Log pseudolikelihood = -7.49069e+12 Pseudo R2 = 0.6644

    Number of clusters (case_id)= 583
    Number of clusters (hs_code_8)= 2,005
    (Std. Err. adjusted for 583 clusters in case_id hs_code_8)
    ------------------------------------------------------------------------------
    | Robust
    v | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    tre_1 | .5002759 .1419406 3.52 0.000 .2220774 .7784745
    tre_2 | .4012424 .1424712 2.82 0.005 .1220039 .6804808
    tre_3 | .5654237 .1174388 4.81 0.000 .3352479 .7955995
    tre_4 | .4747264 .1384631 3.43 0.001 .2033437 .7461091
    _cons | 15.22352 .0520799 292.31 0.000 15.12145 15.3256
    ------------------------------------------------------------------------------

    Absorbed degrees of freedom:
    -------------------------------------------------------------+
    Absorbed FE | Categories - Redundant = Num. Coefs |
    ---------------------+---------------------------------------|
    j#hs_code_6 | 29361 0 29361 |
    j#yrm_doc | 39480 310 39170 |
    hs_code_6#yrm_doc | 38994 674 38320 ?|
    -------------------------------------------------------------+
    ? = number of redundant parameters may be higher
    The first question is: the min(eta) value turns red after Iteration 3. Does that indicate something wrong with my regression command or my data?

    The second question is about the absorbed degrees of freedom: why is there a question mark "?" after the third fixed effect, and what does "number of redundant parameters may be higher" mean here?
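
    For reference, here is a rough sketch of how one could re-count the categories of each absorbed interaction and compare them with the "Categories" column above (the fe_* helper names are just made up for this check; egen group() numbers the cells 1..K, so the maximum equals the number of categories):

    * helper variables, one per absorbed interaction
    egen fe_j_hs6    = group(j hs_code_6)
    egen fe_j_time   = group(j yrm_doc)
    egen fe_hs6_time = group(hs_code_6 yrm_doc)

    * the max of each grouped variable is the number of categories of that FE
    summarize fe_j_hs6 fe_j_time fe_hs6_time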





    THEN I tried different clusters. Since reviews of whether to impose the duty on a country's product take place every few years, reviews of the same country-product pair might affect each other. So I instead cluster on the sequence number of the review (no_review=="1" for the 1st review, no_review=="2" for the 2nd review, and so on), together with the exporting country variable (j), and the regression becomes:

    ppmlhdfe p tre_1 tre_2 tre_3 tre_4, noconstant absorb(j#hs_code_6 j#yrm_doc hs_code_6#yrm_doc) sep(none) itol(1e-5) tol(1e-5) cluster(j no_review)
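
    (As a quick sanity check on the new cluster variables, this sketch counts how many distinct clusters each one has; tag_j and tag_nr are just temporary helper names:)

    egen tag_j  = tag(j)
    egen tag_nr = tag(no_review)
    count if tag_j      // number of distinct countries j
    count if tag_nr     // number of distinct review rounds no_review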


    BUT the regression table presents other issues:

    Iteration 1: deviance = 3.1562e+13 eps = . iters = 14 tol = 1.0e-04 min(eta) = -7.58 PS
    Iteration 2: deviance = 1.8249e+13 eps = 7.30e-01 iters = 10 tol = 1.0e-04 min(eta) = -9.54 S
    Iteration 3: deviance = 1.5405e+13 eps = 1.85e-01 iters = 8 tol = 1.0e-04 min(eta) = -11.84 S
    Iteration 4: deviance = 1.5017e+13 eps = 2.58e-02 iters = 7 tol = 1.0e-04 min(eta) = -14.27 S
    Iteration 5: deviance = 1.4985e+13 eps = 2.09e-03 iters = 6 tol = 1.0e-04 min(eta) = -16.36 S
    Iteration 6: deviance = 1.4982e+13 eps = 2.14e-04 iters = 5 tol = 1.0e-04 min(eta) = -18.83 S
    Iteration 7: deviance = 1.4982e+13 eps = 4.37e-05 iters = 4 tol = 1.0e-04 min(eta) = -21.82 S
    Iteration 8: deviance = 1.4981e+13 eps = 1.08e-05 iters = 17 tol = 1.0e-05 min(eta) = -24.81 S
    Iteration 9: deviance = 1.4981e+13 eps = 2.88e-06 iters = 32 tol = 1.0e-06 min(eta) = -27.81 S O
    ------------------------------------------------------------------------------------------------------------
    (legend: p: exact partial-out s: exact solver h: step-halving o: epsilon below tolerance)
    Converged in 9 iterations and 103 HDFE sub-iterations (tol = 1.0e-05)
    Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
    warning: missing F statistic; dropped variables due to collinearity or too few clusters

    HDFE PPML regression No. of obs = 12704328
    Absorbing 3 HDFE groups Residual df = 3
    Statistics robust to heteroskedasticity Wald chi2(4) = .
    Deviance = 1.49813e+13 Prob > chi2 = .
    Log pseudolikelihood = -7.49069e+12 Pseudo R2 = 0.6644

    Number of clusters (j) = 223
    Number of clusters (no_review)= 4
    (Std. Err. adjusted for 4 clusters in j no_review)
    ------------------------------------------------------------------------------
    | Robust
    v | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    tre_1 | .5002759 .0716396 6.98 0.000 .3598648 .640687
    tre_2 | .4012424 .1218577 3.29 0.001 .1624057 .640079
    tre_3 | .5654237 .0949151 5.96 0.000 .3793936 .7514538
    tre_4 | .4747264 .1139863 4.16 0.000 .2513173 .6981354
    _cons | 15.22352 .037072 410.65 0.000 15.15087 15.29618
    ------------------------------------------------------------------------------

    Absorbed degrees of freedom:
    -------------------------------------------------------------+
    Absorbed FE | Categories - Redundant = Num. Coefs |
    ---------------------+---------------------------------------|
    j#hs_code_6 | 29361 29361 0 *|
    j#yrm_doc | 39480 39480 0 *|
    hs_code_6#yrm_doc | 38994 0 38994 |
    -------------------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation


    The first question is the same as for the previous table: the min(eta) value turns red after Iteration 5. Does that indicate something wrong with my regression command or my data? (I attached a screenshot of the red part below, just in case.)

    The second question is that the Wald chi2(4) and Prob > chi2 are missing from the table, and I am not sure what is going on with them.

    Also, the third question is about my fixed effects: I notice that hs_code_6#yrm_doc has no "*" after it, unlike the other two fixed effects. What does the * actually mean? Does it indicate that I should not include the hs_code_6#yrm_doc fixed effect in the model?
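
    (For what it is worth, here is a rough sketch of how one could check whether a fixed effect is nested within a cluster variable, e.g. whether every j#hs_code_6 cell stays inside a single cluster of j; fe_cell, tagged and n_clusters are helper names made up for this check:)

    egen fe_cell = group(j hs_code_6)
    egen tagged  = tag(fe_cell j)                    // one observation per (FE cell, cluster) pair
    bysort fe_cell: egen n_clusters = total(tagged)  // number of j clusters spanned by each FE cell
    count if n_clusters > 1                          // 0 means the FE is nested within the j clusters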



    Thank you so much!
    Attached Files