
  • Issue while running ppml_fe_bias

    I'm trying to apply the bias correction for a three-way fixed-effects PPML model using ppml_fe_bias, but Stata returns an error (full log below). Does anyone have any suggestions?
    Code:
    
    . ppmlhdfe trade fta_other fta_other_lead_2 fta_other_lag_4 fta_other_lag_8 fta_ind fta_ind_lead_2 fta_ind_lag_4 fta_ind_lag_8 INT
    > L_BRDR_1990-INTL_BRDR_2019, absorb(exp#year imp#year exp#imp) cluster(exp#imp) d
    (dropped 1265 observations that are either singletons or separated by a fixed effect)
    warning: dependent variable takes very low values after standardizing (1.9859e-13)
    note: 2 variables omitted because of collinearity: INTL_BRDR_2018 INTL_BRDR_2019
    Iteration 1:   deviance = 2.3484e+08  eps = .         iters = 10   tol = 1.0e-04  min(eta) =  -5.69  P   
    Iteration 2:   deviance = 6.3188e+07  eps = 2.72e+00  iters = 9    tol = 1.0e-04  min(eta) =  -6.95      
    Iteration 3:   deviance = 2.4915e+07  eps = 1.54e+00  iters = 9    tol = 1.0e-04  min(eta) =  -8.81      
    Iteration 4:   deviance = 1.3979e+07  eps = 7.82e-01  iters = 9    tol = 1.0e-04  min(eta) = -10.47      
    Iteration 5:   deviance = 1.0830e+07  eps = 2.91e-01  iters = 8    tol = 1.0e-04  min(eta) = -12.43      
    Iteration 6:   deviance = 9.9258e+06  eps = 9.11e-02  iters = 8    tol = 1.0e-04  min(eta) = -14.56      
    Iteration 7:   deviance = 9.6710e+06  eps = 2.63e-02  iters = 8    tol = 1.0e-04  min(eta) = -16.30      
    Iteration 8:   deviance = 9.6003e+06  eps = 7.37e-03  iters = 6    tol = 1.0e-04  min(eta) = -18.01      
    Iteration 9:   deviance = 9.5810e+06  eps = 2.01e-03  iters = 5    tol = 1.0e-04  min(eta) = -19.56      
    Iteration 10:  deviance = 9.5760e+06  eps = 5.14e-04  iters = 4    tol = 1.0e-04  min(eta) = -20.41      
    Iteration 11:  deviance = 9.5749e+06  eps = 1.21e-04  iters = 3    tol = 1.0e-04  min(eta) = -20.66      
    Iteration 12:  deviance = 9.5746e+06  eps = 2.75e-05  iters = 2    tol = 1.0e-04  min(eta) = -22.10      
    Iteration 13:  deviance = 9.5746e+06  eps = 6.35e-06  iters = 4    tol = 1.0e-05  min(eta) = -23.38      
    Iteration 14:  deviance = 9.5746e+06  eps = 1.47e-06  iters = 6    tol = 1.0e-06  min(eta) = -24.38   S  
    Iteration 15:  deviance = 9.5746e+06  eps = 3.25e-07  iters = 4    tol = 1.0e-06  min(eta) = -25.21   S  
    Iteration 16:  deviance = 9.5746e+06  eps = 6.54e-08  iters = 5    tol = 1.0e-07  min(eta) = -25.81   S  
    Iteration 17:  deviance = 9.5746e+06  eps = 1.27e-08  iters = 7    tol = 1.0e-08  min(eta) = -26.09   S  
    Iteration 18:  deviance = 9.5746e+06  eps = 3.04e-09  iters = 7    tol = 1.0e-09  min(eta) = -26.56   S O
    ------------------------------------------------------------------------------------------------------------
    (legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
    Converged in 18 iterations and 114 HDFE sub-iterations (tol = 1.0e-08)
    
    HDFE PPML regression                              No. of obs      =     90,607
    Absorbing 3 HDFE groups                           Residual df     =      3,240
    Statistics robust to heteroskedasticity           Wald chi2(36)   =    2173.85
    Deviance             =  9574551.129               Prob > chi2     =     0.0000
    Log pseudolikelihood = -5058372.669               Pseudo R2       =     0.9983
    
    Number of clusters (exp#imp)=      3,241
                                    (Std. err. adjusted for 3,241 clusters in exp#imp)
    ----------------------------------------------------------------------------------
                     |               Robust
               trade | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -----------------+----------------------------------------------------------------
           fta_other |  -.0213608   .0300176    -0.71   0.477    -.0801942    .0374727
    fta_other_lead_2 |  -.0563758   .0280935    -2.01   0.045     -.111438   -.0013136
     fta_other_lag_4 |  -.0003446   .0293664    -0.01   0.991    -.0579017    .0572124
     fta_other_lag_8 |  -.0427452    .028236    -1.51   0.130    -.0980868    .0125964
             fta_ind |   .1328642   .0495224     2.68   0.007     .0358021    .2299263
      fta_ind_lead_2 |   .1780139   .0712708     2.50   0.012     .0383257    .3177021
       fta_ind_lag_4 |   .0335462   .0362756     0.92   0.355    -.0375527    .1046452
       fta_ind_lag_8 |   .0000193   .0639673     0.00   1.000    -.1253544    .1253929
      INTL_BRDR_1990 |  -.3725335    .046553    -8.00   0.000    -.4637756   -.2812914
      INTL_BRDR_1991 |  -.2927797    .043997    -6.65   0.000    -.3790123   -.2065471
      INTL_BRDR_1992 |  -.2812192   .0404477    -6.95   0.000    -.3604953   -.2019431
      INTL_BRDR_1993 |  -.3143837   .0416806    -7.54   0.000    -.3960762   -.2326912
      INTL_BRDR_1994 |  -.2053001   .0434917    -4.72   0.000    -.2905422   -.1200579
      INTL_BRDR_1995 |  -.1645783   .0408916    -4.02   0.000    -.2447244   -.0844322
      INTL_BRDR_1996 |  -.1396069    .040277    -3.47   0.001    -.2185483   -.0606655
      INTL_BRDR_1997 |  -.0556948    .038501    -1.45   0.148    -.1311553    .0197657
      INTL_BRDR_1998 |   .0029684   .0353831     0.08   0.933    -.0663812    .0723181
      INTL_BRDR_1999 |   .0026243   .0356877     0.07   0.941    -.0673224    .0725709
      INTL_BRDR_2000 |   .0611137   .0338987     1.80   0.071    -.0053265     .127554
      INTL_BRDR_2001 |   .0723655   .0315621     2.29   0.022     .0105049    .1342262
      INTL_BRDR_2002 |   .0657136   .0341782     1.92   0.055    -.0012746    .1327017
      INTL_BRDR_2003 |   .0472234   .0352788     1.34   0.181    -.0219218    .1163687
      INTL_BRDR_2004 |   .1105477    .034818     3.18   0.001     .0423057    .1787896
      INTL_BRDR_2005 |   .1257476   .0333083     3.78   0.000     .0604646    .1910307
      INTL_BRDR_2006 |    .174981   .0307949     5.68   0.000      .114624    .2353379
      INTL_BRDR_2007 |   .1439483   .0314064     4.58   0.000     .0823928    .2055038
      INTL_BRDR_2008 |   .1683745   .0268885     6.26   0.000     .1156741    .2210749
      INTL_BRDR_2009 |   .0913729    .025577     3.57   0.000     .0412429    .1415029
      INTL_BRDR_2010 |   .1541567    .022835     6.75   0.000     .1094009    .1989125
      INTL_BRDR_2011 |   .1699547   .0209745     8.10   0.000     .1288454    .2110639
      INTL_BRDR_2012 |    .150191   .0177981     8.44   0.000     .1153074    .1850746
      INTL_BRDR_2013 |   .1398731   .0184009     7.60   0.000     .1038081    .1759382
      INTL_BRDR_2014 |    .166062   .0168755     9.84   0.000     .1329867    .1991373
      INTL_BRDR_2015 |   .2129488   .0169653    12.55   0.000     .1796974    .2462002
      INTL_BRDR_2016 |   .1830368   .0160971    11.37   0.000      .151487    .2145865
      INTL_BRDR_2017 |  -.0232885   .0069975    -3.33   0.001    -.0370034   -.0095737
      INTL_BRDR_2018 |          0  (omitted)
      INTL_BRDR_2019 |          0  (omitted)
               _cons |   13.15552   .0107335  1225.66   0.000     13.13448    13.17656
    ----------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
        exp#year |      1632           1        1631     |
        imp#year |      1613          29        1584     |
         exp#imp |      3241        3241           0    *|
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    
    .                 
    . estimates store pre_bias_correct
    
    . 
    . * Create conditional mean (lambda) and a matrix of coefficient estimates (beta)
    . predict lambda
    (option mu assumed; predicted mean of depvar)
    (8,503 missing values generated)
    
    . matrix beta = e(b)
    
    . ppml_fe_bias trade fta_other fta_other_lead_2 fta_other_lag_4 fta_other_lag_8 fta_ind fta_ind_lead_2 fta_ind_lag_4 fta_ind_lag_8
    >  INTL_BRDR_1990-INTL_BRDR_2019, i(exp) j(imp) t(year) lambda(lambda) beta(beta)
    performance warning: -by- prefix may be slower than -by()-
    performance warning: -by- prefix may be slower than -by()-
    performance warning: -by- prefix may be slower than -by()-
    note: because of the size of the data, an approximation will be used to compute the adjusted variance. Use the -exact- option if y
    > ou wish to compute the variance exactly.
    The set of x variables (fta_other fta_other_lead_2 fta_other_lag_4 fta_other_lag_8 fta_ind fta_ind_lead_2 fta_ind_lag_4 fta_ind_la
    > g_8 INTL_BRDR_1990 INTL_BRDR_1991 INTL_BRDR_1992 INTL_BRDR_1993 INTL_BRDR_1994 INTL_BRDR_1995 INTL_BRDR_1996 INTL_BRDR_1997 INTL
    > _BRDR_1998 INTL_BRDR_1999 INTL_BRDR_2000 INTL_BRDR_2001 INTL_BRDR_2002 INTL_BRDR_2003 INTL_BRDR_2004 INTL_BRDR_2005 INTL_BRDR_20
    > 06 INTL_BRDR_2007 INTL_BRDR_2008 INTL_BRDR_2009 INTL_BRDR_2010 INTL_BRDR_2011 INTL_BRDR_2012 INTL_BRDR_2013 INTL_BRDR_2014 INTL_
    > BRDR_2015 INTL_BRDR_2016 INTL_BRDR_2017 INTL_BRDR_2018 INTL_BRDR_2019) does not appear to be of full rank after conditioning on 
    > the fixed effects.
    r(111);
    
    end of do-file
    
    r(111);

  • #2
    Dear Tariq Masood,

    I guess the problem is exactly the one stated in the error message: you have perfect multicollinearity. Try to estimate the model absorbing the INTL variables, or do what you did without including INTL_BRDR_2018 and INTL_BRDR_2019.
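
    If it helps, here is a minimal sketch of the second option, reusing the varlist from #1 (it assumes the INTL_BRDR_* dummies are stored consecutively so the hyphenated range ending in 2017 excludes the two collinear years; otherwise list them explicitly):

    Code:
    * re-estimate without the two collinear border dummies, then re-run the bias correction
    ppmlhdfe trade fta_other fta_other_lead_2 fta_other_lag_4 fta_other_lag_8 ///
        fta_ind fta_ind_lead_2 fta_ind_lag_4 fta_ind_lag_8 ///
        INTL_BRDR_1990-INTL_BRDR_2017, absorb(exp#year imp#year exp#imp) cluster(exp#imp) d
    predict lambda, mu
    matrix beta = e(b)
    ppml_fe_bias trade fta_other fta_other_lead_2 fta_other_lag_4 fta_other_lag_8 ///
        fta_ind fta_ind_lead_2 fta_ind_lag_4 fta_ind_lag_8 ///
        INTL_BRDR_1990-INTL_BRDR_2017, i(exp) j(imp) t(year) lambda(lambda) beta(beta)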

    Best wishes,

    Joao

    • #3
      Thanks, Joao Santos Silva. It works well once the collinear variables are dropped. I need one more clarification: after running ppml_fe_bias, I use the post-estimation command lincom to sum the current, lead, and lag coefficients of the fta variables. The results seem a bit odd: even though several of the individual coefficients are significant, the sum reported by lincom is not significant in most cases. I am also not sure whether lincom can be used after ppml_fe_bias; the help file does not mention it.

      • #4
        Please show us the results.

        • #5
          Dear Joao Santos Silva, here it is:

          Code:
          . ppmlhdfe trade fta_other fta_other_lead_2 fta_other_lag_4 fta_other_lag_8 fta_ind fta_ind_lead_2 fta_ind_lag_4 fta_ind_lag_8, ab
          > sorb(exp#year imp#year exp#imp) cluster(exp#imp) d
          (dropped 1265 observations that are either singletons or separated by a fixed effect)
          warning: dependent variable takes very low values after standardizing (1.9859e-13)
          Iteration 1:   deviance = 2.3728e+08  eps = .         iters = 10   tol = 1.0e-04  min(eta) =  -5.71  P   
          Iteration 2:   deviance = 6.4288e+07  eps = 2.69e+00  iters = 9    tol = 1.0e-04  min(eta) =  -6.89      
          Iteration 3:   deviance = 2.6303e+07  eps = 1.44e+00  iters = 9    tol = 1.0e-04  min(eta) =  -8.70      
          Iteration 4:   deviance = 1.5521e+07  eps = 6.95e-01  iters = 8    tol = 1.0e-04  min(eta) = -10.40      
          Iteration 5:   deviance = 1.2403e+07  eps = 2.51e-01  iters = 8    tol = 1.0e-04  min(eta) = -12.37      
          Iteration 6:   deviance = 1.1506e+07  eps = 7.79e-02  iters = 8    tol = 1.0e-04  min(eta) = -13.90      
          Iteration 7:   deviance = 1.1254e+07  eps = 2.24e-02  iters = 7    tol = 1.0e-04  min(eta) = -15.56      
          Iteration 8:   deviance = 1.1183e+07  eps = 6.27e-03  iters = 6    tol = 1.0e-04  min(eta) = -17.76      
          Iteration 9:   deviance = 1.1164e+07  eps = 1.71e-03  iters = 5    tol = 1.0e-04  min(eta) = -19.32      
          Iteration 10:  deviance = 1.1159e+07  eps = 4.36e-04  iters = 4    tol = 1.0e-04  min(eta) = -20.17      
          Iteration 11:  deviance = 1.1158e+07  eps = 1.02e-04  iters = 3    tol = 1.0e-04  min(eta) = -20.59      
          Iteration 12:  deviance = 1.1158e+07  eps = 2.34e-05  iters = 2    tol = 1.0e-04  min(eta) = -22.20      
          Iteration 13:  deviance = 1.1158e+07  eps = 5.39e-06  iters = 3    tol = 1.0e-05  min(eta) = -23.48   S  
          Iteration 14:  deviance = 1.1158e+07  eps = 1.24e-06  iters = 7    tol = 1.0e-06  min(eta) = -24.48   S  
          Iteration 15:  deviance = 1.1158e+07  eps = 2.76e-07  iters = 3    tol = 1.0e-06  min(eta) = -25.30   S  
          Iteration 16:  deviance = 1.1158e+07  eps = 5.53e-08  iters = 6    tol = 1.0e-07  min(eta) = -25.91   S  
          Iteration 17:  deviance = 1.1158e+07  eps = 1.08e-08  iters = 6    tol = 1.0e-08  min(eta) = -26.18   S  
          Iteration 18:  deviance = 1.1158e+07  eps = 2.57e-09  iters = 5    tol = 1.0e-09  min(eta) = -26.36   S O
          ------------------------------------------------------------------------------------------------------------
          (legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
          Converged in 18 iterations and 109 HDFE sub-iterations (tol = 1.0e-08)
          
          HDFE PPML regression                              No. of obs      =     90,607
          Absorbing 3 HDFE groups                           Residual df     =      3,240
          Statistics robust to heteroskedasticity           Wald chi2(8)    =      39.11
          Deviance             =   11157973.5               Prob > chi2     =     0.0000
          Log pseudolikelihood = -5850083.854               Pseudo R2       =     0.9980
          
          Number of clusters (exp#imp)=      3,241
                                          (Std. err. adjusted for 3,241 clusters in exp#imp)
          ----------------------------------------------------------------------------------
                           |               Robust
                     trade | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
          -----------------+----------------------------------------------------------------
                 fta_other |    -.04307   .0366114    -1.18   0.239     -.114827    .0286869
          fta_other_lead_2 |   .0019637    .029959     0.07   0.948    -.0567549    .0606823
           fta_other_lag_4 |   .0632791   .0325295     1.95   0.052    -.0004775    .1270358
           fta_other_lag_8 |   .0738747   .0322538     2.29   0.022     .0106585     .137091
                   fta_ind |   .1546749   .0486236     3.18   0.001     .0593743    .2499755
            fta_ind_lead_2 |   .2705196   .0723929     3.74   0.000     .1286322     .412407
             fta_ind_lag_4 |   .0219007   .0363597     0.60   0.547    -.0493631    .0931644
             fta_ind_lag_8 |  -.0358542   .0674295    -0.53   0.595    -.1680136    .0963053
                     _cons |   13.15111   .0085224  1543.13   0.000      13.1344    13.16781
          ----------------------------------------------------------------------------------
          
          Absorbed degrees of freedom:
          -----------------------------------------------------+
           Absorbed FE | Categories  - Redundant  = Num. Coefs |
          -------------+---------------------------------------|
              exp#year |      1632           1        1631     |
              imp#year |      1613          29        1584     |
               exp#imp |      3241        3241           0    *|
          -----------------------------------------------------+
          * = FE nested within cluster; treated as redundant for DoF computation
          
          . 
          . lincom fta_other + fta_other_lead_2 + fta_other_lag_4 + fta_other_lag_8 
          
           ( 1)  fta_other + fta_other_lead_2 + fta_other_lag_4 + fta_other_lag_8 = 0
          
          ------------------------------------------------------------------------------
                 trade | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   (1) |   .0960475   .0769666     1.25   0.212    -.0548042    .2468992
          ------------------------------------------------------------------------------
          
          . lincom fta_ind + fta_ind_lead_2 + fta_ind_lag_4 + fta_ind_lag_8
          
           ( 1)  fta_ind + fta_ind_lead_2 + fta_ind_lag_4 + fta_ind_lag_8 = 0
          
          ------------------------------------------------------------------------------
                 trade | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   (1) |   .4112411   .1064129     3.86   0.000     .2026757    .6198065
          ------------------------------------------------------------------------------
          
          . 
          . predict lambda
          (option mu assumed; predicted mean of depvar)
          (8,503 missing values generated)
          
          . matrix beta = e(b)
          
          . ppml_fe_bias trade fta_other fta_other_lead_2 fta_other_lag_4 fta_other_lag_8 fta_ind fta_ind_lead_2 fta_ind_lag_4 fta_ind_lag_8
          > , i(exp) j(imp) t(year) lambda(lambda) beta(beta)
          performance warning: -by- prefix may be slower than -by()-
          performance warning: -by- prefix may be slower than -by()-
          performance warning: -by- prefix may be slower than -by()-
          note: because of the size of the data, an approximation will be used to compute the adjusted variance. Use the -exact- option if y
          > ou wish to compute the variance exactly.
            Adjusted SEs
                           1
              +---------------+
            1 |  .0530123984  |
            2 |  .0391335799  |
            3 |  .0437273482  |
            4 |   .067578189  |
            5 |  .0598139399  |
            6 |  .2155671519  |
            7 |  .0633912212  |
            8 |  .0972029327  |
              +---------------+
            bias corrections (to be subtracted from original coefficients)
                            1
              +----------------+
            1 |   .0140427486  |
            2 |   .0013384979  |
            3 |  -.0182670307  |
            4 |  -.0305944349  |
            5 |  -.0017081691  |
            6 |  -.0465463329  |
            7 |   .0101338759  |
            8 |   .0196714122  |
              +----------------+
          note: beta matrix will be shortened to the same length as the number of x-variables
          
                                     ---------------------------------------------------------------------------
                                                          original       bias     adjusted SEs  bias-corrected 
                                     ---------------------------------------------------------------------------
                                      fta_other          -0.0430700   0.0140427    0.0530124      -0.0571128   
                                                         (0.0366114)                             (0.0530124)   
                                      fta_other_lead_2    0.0019637   0.0013385    0.0391336      0.0006252    
                                                         (0.0299590)                             (0.0391336)   
                                      fta_other_lag_4     0.0632791   -0.0182670   0.0437273      0.0815462    
                                                         (0.0325295)                             (0.0437273)*  
                                      fta_other_lag_8     0.0738747   -0.0305944   0.0675782      0.1044692    
                                                         (0.0322538)                             (0.0675782)   
                                      fta_ind             0.1546749   -0.0017082   0.0598139      0.1563831    
                                                         (0.0486236)                            (0.0598139)*** 
                                      fta_ind_lead_2      0.2705196   -0.0465463   0.2155672      0.3170659    
                                                         (0.0723929)                             (0.2155672)   
                                      fta_ind_lag_4       0.0219007   0.0101339    0.0633912      0.0117668    
                                                         (0.0363597)                             (0.0633912)   
                                      fta_ind_lag_8      -0.0358542   0.0196714    0.0972029      -0.0555256   
                                                         (0.0674295)                             (0.0972029)   
                                     ---------------------------------------------------------------------------
                                        Standard errors clustered by pair, using a local de-biasing adjustment
                                      to account for estimation noise in the exp-year and imp-year fixed effects.
                                                           * p<0.10; ** p<0.05; *** p<0.01
          
          
          . 
          . lincom fta_other + fta_other_lead_2 + fta_other_lag_4 + fta_other_lag_8 
          
           ( 1)  fta_other + fta_other_lead_2 + fta_other_lag_4 + fta_other_lag_8 = 0
          
          ------------------------------------------------------------------------------
                 trade | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   (1) |   .1295277   .1135525     1.14   0.254     -.093031    .3520865
          ------------------------------------------------------------------------------
          
          . lincom fta_ind + fta_ind_lead_2 + fta_ind_lag_4 + fta_ind_lag_8
          
           ( 1)  fta_ind + fta_ind_lead_2 + fta_ind_lag_4 + fta_ind_lag_8 = 0
          
          ------------------------------------------------------------------------------
                 trade | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   (1) |   .4296903   .3628808     1.18   0.236    -.2815431    1.140924
          ------------------------------------------------------------------------------

          • #6
            Dear Tariq Masood,

            The results look fine to me. I also did a test with my own data and did not find any problem.

            Best wishes,

            Joao

            • #7
              Hello @Joao Santos Silva

               Probably you can also assist me here. I'm looking at the drivers of foreign banks entering a host country; the dependent variable is the number of banks moving to the host country year on year, using the command:

               Code:
               ppmlhdfe NoofBanks log_FDI FID_IX Boone Lerner rescaled_TRA rescaled_INF FIE_IX log_GDP IQ_Index, vce(robust) absorb(id)

               My issue is the number of observations dropped as singletons or separated by a fixed effect, which is more than 50% of the sample, as shown below. In this case only two variables are significant.
               (dropped 143 observations that are either singletons or separated by a fixed effect)

               However, when I use the cumulative count of banks year on year as the dependent variable, no observations are lost and at least six variables are significant.

               My question is: when examining the drivers of foreign banks entering the host country, is it okay to use the cumulative count? I thought the year-on-year number would be the appropriate dependent variable for this objective. Or is there a different Poisson model that would be more appropriate? I've read several threads here discouraging zero-inflated models, so I'm not sure what would be best.
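
               For concreteness, the cumulative count I refer to above is built roughly like this (a sketch; id and year are my panel and time identifiers):

               Code:
               * running sum of yearly entries within each panel unit
               bysort id (year): gen cum_NoofBanks = sum(NoofBanks)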

              Kindly guide me.

              • #8
                Dear Priscah Kyalo,

                 Modelling the stock and modelling the flow are two very different things, and only you can know which one you want. For the flow, I do not see a reason to assume that the conditional expected value is positive, so PPML may not be suitable. It is suitable for the stock, which is a non-negative variable.

                Anyway, when you use PPML, you should not worry about the number of observations that are dropped.

                Best wishes,

                Joao

                • #9
                   Thank you @Joao Santos Silva for your response; just some clarification. I have assumed that the conditional expected value is positive because I'm concerned with a one-sided move, that is, the number of banks moving from country A to country B and not vice versa, so the count can only be positive or zero. I also settled for PPML because my data are overdispersed: with xtpoisson all the variables are insignificant and the model as a whole is insignificant, and negative binomial did not converge, which is consistent with Jeff Wooldridge's comments in various threads on why we should stay away from negative binomial. The only command that worked was PPML.

                   You mention that I should not worry about the number of observations dropped, but I'm concerned that the dropped observations may affect the final results: the sample shrinks by a huge margin (almost 50%), which might affect the robustness and generalizability of my results and possibly introduce bias. Your thoughts on my views and fears?

                  • #10
                    Dear Priscah Kyalo,

                     The observations that are dropped are not informative about the parameters you are trying to estimate, so dropping them makes no difference. If you are only looking at one-sided moves, then PPML sounds reasonable with or without overdispersion. However, ppmlhdfe absorbing id should give you exactly the same results as xtpoisson with the fe option.
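
                     For instance, the two commands below should give matching slope estimates (a sketch using the variable names from #7 and assuming the panel is declared with id and a year variable; the standard errors can differ depending on the vce() choices):

                     Code:
                     * conditional FE Poisson and Poisson with absorbed id dummies coincide in the slopes
                     xtset id year
                     xtpoisson NoofBanks log_FDI FID_IX Boone Lerner rescaled_TRA rescaled_INF FIE_IX log_GDP IQ_Index, fe vce(robust)
                     ppmlhdfe  NoofBanks log_FDI FID_IX Boone Lerner rescaled_TRA rescaled_INF FIE_IX log_GDP IQ_Index, absorb(id) vce(robust)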

                    Best wishes,

                    Joao

                    • #11
                       Many thanks @Joao Santos Silva for taking the time to review my query and offer guidance. I will proceed with the flow option, ignore the dropped observations, and probably estimate a second model as a robustness check. Again, thank you.
