  • e(sample) function

    Hello everyone,
    I used the e(sample) function to check which observations of my panel data set can be used for regression. Can anyone explain to me why generate sample=e(sample) returns a "0" for the observation with "PERMNO" = 90215 and "YearEffective" = 2003? From my point of view there is no reason to exclude it from the regression.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(PERMNO YearEffective) int patents float(xrdintensity ln_emp) int AcquirorPrimarySICCode str4 curcd byte Numberofmergers float sample
    90215 2002 0  .11715776         . 7372 "USD" 0 0
    90215 2003 0  .07955571  .4173937 7372 "USD" 0 0
    90215 2004 0 .035016168  .5692832 7372 "USD" 0 1
    90215 2005 0  .05366315  .8346468 7372 "USD" 0 1
    90215 2006 0  .06710567 1.1216775 7372 "USD" 1 1
    90215 2007 1  .05856499  1.282599 7372 "USD" 0 1
    90215 2008 0  .06725809 1.5186375 7372 "USD" 0 1
    90215 2009 5  .05361228 1.6032186 7372 "USD" 0 1
    90215 2010 9  .06078194 1.8415016 7372 "USD" 0 1
    end
    Having used the "fillin" command before running generate sample=e(sample), a "1" is returned for this observation, indicating it can be used for regression.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(PERMNO YearEffective) int patents float(xrdintensity ln_emp) int AcquirorPrimarySICCode str4 curcd byte Numberofmergers float sample
    90215 2000 0          .         . 7372 ""    0 0
    90215 2001 0          .         . 7372 ""    0 0
    90215 2002 0  .11715776         . 7372 "USD" 0 0
    90215 2003 0  .07955571  .4173937 7372 "USD" 0 1
    90215 2004 0 .035016168  .5692832 7372 "USD" 0 1
    90215 2005 0  .05366315  .8346468 7372 "USD" 0 1
    90215 2006 0  .06710567 1.1216775 7372 "USD" 1 1
    90215 2007 1  .05856499  1.282599 7372 "USD" 0 1
    90215 2008 0  .06725809 1.5186375 7372 "USD" 0 1
    90215 2009 5  .05361228 1.6032186 7372 "USD" 0 1
    90215 2010 9  .06078194 1.8415016 7372 "USD" 0 1
    end
    Am I missing something, or is this some sort of bug?

    Thank you in advance for your help



    Chris
    Last edited by Christopher Weber; 06 Dec 2021, 11:52.

  • #2
    e(sample) looks back to the last model estimated -- in a sense I shall expand on.

    You seem to want to use it to look forwards. That will work in the way you want if and only if the observations to be used in your next model happen to be the same as those used in your last model, but even so, that's coincidence, not prescience. e(sample) does not check your data for missing values on the fly.

    The function is a bit of an odd duck because it never returns missing even when you might think it should.

    Code:
    . clear 
    
    . sysuse auto
    (1978 automobile data)
    
    . gen check = e(sample)
    
    . tab check
    
          check |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         74      100.00      100.00
    ------------+-----------------------------------
          Total |         74      100.00
    My reading: This is as much linguistic as logical in that we are all encouraged to be idiomatic Statawise and write

    Code:
    ... if e(sample)
    -- or on occasion to use its negation --

    which would be the source of puzzling if not horrible bugs if e(sample) ever returned missing. (Recall that non-zero arguments -- hence numeric missing too -- are regarded as logically true.)

    Thus e(sample) being 0 in the example above is agnostic as well as factual, and means "there is no record in memory of these observations being used in a model fit".

    In a nutshell, e(sample) returns 1 if and only if an observation was used in the last model fit, which is almost always the result you want.
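
    For completeness, the idiomatic pattern is to tag or condition on the estimation sample immediately after fitting the model, before any further estimation command overwrites it. A minimal sketch with the auto data (the variable name used here is made up):

    Code:
    sysuse auto, clear
    regress price mpg weight
    * tag the estimation sample right away, before fitting anything else
    generate byte used = e(sample)
    * or condition follow-up commands on it directly
    summarize mpg weight if e(sample)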



    • #3
      If you run the following code immediately after the estimation command, then you can be sure that the observation was (not) included in the sample.

      Code:
      generate sample=e(sample)
      What we cannot tell you is why it was (not) included, because we cannot see the precise command used to fit your model. Please show us exactly the model you tried to fit; showing the output of that model wouldn't hurt either.



      • #4
        Thank you for your answers.
        The background of using e(sample) was the following:

        I ran one regression without the "fillin" command, and it reported 881 observations.
        After that I ran the same regression with "fillin" applied in advance, and it reported 882 observations.
        To find which observation was dropped in the first case, I used e(sample) after each of the regressions above, looking for the one differing observation. It turned out to be the observation mentioned above. Even if e(sample) is a "looking back" function, there should not be any difference between the two approaches, if I understood correctly.

        I am trying to conduct a poisson regression:
        Code:
        xtpoisson patents c.xrdintensity##cl.Numberofmergers c.xrdintensity##cl2.Numberofmergers ln_emp i.YearEffective i.AcquirorPrimarySICCode
        The resulting tables are the following:

        Without "fillin" command:


        Code:
        Fitting Poisson model:
        
        Iteration 0:   log likelihood = -1723819.7  (not concave)
        Iteration 1:   log likelihood = -1448008.7  
        Iteration 2:   log likelihood = -1308862.9  (backed up)
        Iteration 3:   log likelihood = -813581.19  (backed up)
        Iteration 4:   log likelihood = -797918.29  (backed up)
        Iteration 5:   log likelihood = -754367.22  (backed up)
        Iteration 6:   log likelihood = -737182.98  (backed up)
        Iteration 7:   log likelihood = -531296.37  
        Iteration 8:   log likelihood =  -48261.64  
        Iteration 9:   log likelihood = -27180.276  
        Iteration 10:  log likelihood = -19897.037  
        Iteration 11:  log likelihood = -19708.139  
        Iteration 12:  log likelihood = -19706.767  
        Iteration 13:  log likelihood = -19706.767  
        
        Fitting full model:
        
        Iteration 0:   log likelihood =  -6963.042  
        Iteration 1:   log likelihood = -6389.6288  
        Iteration 2:   log likelihood = -6235.5232  
        Iteration 3:   log likelihood = -6159.9047  
        Iteration 4:   log likelihood = -6158.7126  
        Iteration 5:   log likelihood = -6158.7104  
        Iteration 6:   log likelihood = -6158.7104  
        
        Random-effects Poisson regression                   Number of obs    =     881
        Group variable: PERMNO                              Number of groups =     124
        
        Random effects u_i ~ Gamma                          Obs per group:
                                                                         min =       1
                                                                         avg =     7.1
                                                                         max =       9
        
                                                            Wald chi2(38)    = 8532.18
        Log likelihood = -6158.7104                         Prob > chi2      =  0.0000
        
        ----------------------------------------------------------------------------------------------------
                                   patents | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -----------------------------------+----------------------------------------------------------------
                              xrdintensity |  -1.120416   .1517942    -7.38   0.000    -1.417927   -.8229044
                                           |
                           Numberofmergers |
                                       L1. |  -.1026452   .0091916   -11.17   0.000    -.1206604   -.0846299
                                           |
         c.xrdintensity#cL.Numberofmergers |   1.120803    .093779    11.95   0.000     .9369991    1.304606
                                           |
                              xrdintensity |          0  (omitted)
                                           |
                           Numberofmergers |
                                       L2. |  -.0541916   .0071142    -7.62   0.000    -.0681351    -.040248
                                           |
        c.xrdintensity#cL2.Numberofmergers |   .5035829   .0660595     7.62   0.000     .3741086    .6330572
                                           |
                                    ln_emp |   1.277215   .0202322    63.13   0.000     1.237561     1.31687
                                           |
                             YearEffective |
                                     2003  |    .070271   .0138705     5.07   0.000     .0430854    .0974567
                                     2004  |   .0638179   .0138225     4.62   0.000     .0367263    .0909096
                                     2005  |   .0238406    .014322     1.66   0.096    -.0042299    .0519111
                                     2006  |   .2299319   .0147869    15.55   0.000     .2009501    .2589137
                                     2007  |   .0705921   .0153488     4.60   0.000     .0405091    .1006751
                                     2008  |   .1167685   .0145884     8.00   0.000     .0881758    .1453612
                                     2009  |   .2626573    .013854    18.96   0.000      .235504    .2898106
                                     2010  |   .2742128   .0140069    19.58   0.000     .2467598    .3016658
                                           |
                    AcquirorPrimarySICCode |
                                     2836  |   -.242717   .6880069    -0.35   0.724    -1.591186    1.105752
                                     3571  |   .0972441   1.070966     0.09   0.928     -2.00181    2.196298
                                     3572  |   .6372489   .8229056     0.77   0.439    -.9756164    2.250114
                                     3577  |    .379349   .5993693     0.63   0.527    -.7953932    1.554091
                                     3661  |   .6313768   .5439139     1.16   0.246    -.4346749    1.697428
                                     3663  |   .6232291   .8209094     0.76   0.448    -.9857237    2.232182
                                     3669  |    .280968   1.073564     0.26   0.794    -1.823179    2.385115
                                     3672  |   -5.51969    1.12849    -4.89   0.000     -7.73149   -3.307889
                                     3674  |   1.851899   .4993496     3.71   0.000     .8731915    2.830606
                                     3812  |  -.2202493   1.077741    -0.20   0.838    -2.332582    1.892084
                                     3826  |  -.0475303   1.088496    -0.04   0.965    -2.180943    2.085883
                                     3829  |  -1.737129    .827428    -2.10   0.036    -3.358858   -.1154001
                                     3841  |   .0801203   .6631733     0.12   0.904    -1.219675    1.379916
                                     3845  |    -1.2565   .8692749    -1.45   0.148    -2.960247    .4472476
                                     4812  |   .6747005   1.075412     0.63   0.530    -1.433068    2.782469
                                     4813  |  -.5512871   .8906515    -0.62   0.536    -2.296932    1.194358
                                     7371  |   .0041362   1.074446     0.00   0.997    -2.101739    2.110011
                                     7372  |  -.1760911   .4688733    -0.38   0.707    -1.095066    .7428837
                                     7373  |  -1.586968   .6820805    -2.33   0.020    -2.923821   -.2501151
                                     7374  |   .5435725   1.074358     0.51   0.613     -1.56213    2.649275
                                     7375  |  -1.113282   .8751881    -1.27   0.203    -2.828619    .6020556
                                     7376  |   -.257933   1.089939    -0.24   0.813    -2.394174    1.878308
                                     7379  |  -.3648681   1.074097    -0.34   0.734    -2.470059    1.740322
                                     8731  |   2.474682   1.070612     2.31   0.021     .3763207    4.573044
                                           |
                                     _cons |    .695324   .4478711     1.55   0.121    -.1824873    1.573135
        -----------------------------------+----------------------------------------------------------------
                                  /lnalpha |  -.0585005   .1282633                      -.309892    .1928909
        -----------------------------------+----------------------------------------------------------------
                                     alpha |   .9431777   .1209751                      .7335262    1.212751
        ----------------------------------------------------------------------------------------------------
        LR test of alpha=0: chibar2(01) = 2.7e+04              Prob >= chibar2 = 0.000




        With "fillin" command:

        Code:
        Fitting Poisson model:
        
        Iteration 0:   log likelihood = -1726119.1  (not concave)
        Iteration 1:   log likelihood = -1449940.2  
        Iteration 2:   log likelihood = -1306970.8  (backed up)
        Iteration 3:   log likelihood = -773189.47  (backed up)
        Iteration 4:   log likelihood = -757200.73  (backed up)
        Iteration 5:   log likelihood = -711824.79  (backed up)
        Iteration 6:   log likelihood = -682899.62  
        Iteration 7:   log likelihood = -497600.58  
        Iteration 8:   log likelihood = -51084.657  
        Iteration 9:   log likelihood = -30613.987  
        Iteration 10:  log likelihood = -19877.429  
        Iteration 11:  log likelihood = -19713.303  
        Iteration 12:  log likelihood = -19712.481  
        Iteration 13:  log likelihood = -19712.481  
        
        Fitting full model:
        
        Iteration 0:   log likelihood = -6964.4037  
        Iteration 1:   log likelihood = -6388.0701  
        Iteration 2:   log likelihood = -6233.6999  
        Iteration 3:   log likelihood = -6160.5029  
        Iteration 4:   log likelihood = -6159.2998  
        Iteration 5:   log likelihood = -6159.2972  
        Iteration 6:   log likelihood = -6159.2972  
        
        Random-effects Poisson regression                   Number of obs    =     882
        Group variable: PERMNO                              Number of groups =     124
        
        Random effects u_i ~ Gamma                          Obs per group:
                                                                         min =       1
                                                                         avg =     7.1
                                                                         max =       9
        
                                                            Wald chi2(38)    = 8534.53
        Log likelihood = -6159.2972                         Prob > chi2      =  0.0000
        
        ----------------------------------------------------------------------------------------------------
                                   patents | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -----------------------------------+----------------------------------------------------------------
                              xrdintensity |  -1.120653   .1517961    -7.38   0.000    -1.418168   -.8231378
                                           |
                           Numberofmergers |
                                       L1. |   -.102641   .0091917   -11.17   0.000    -.1206563   -.0846257
                                           |
         c.xrdintensity#cL.Numberofmergers |   1.120783   .0937794    11.95   0.000     .9369785    1.304587
                                           |
                              xrdintensity |          0  (omitted)
                                           |
                           Numberofmergers |
                                       L2. |  -.0542042   .0071142    -7.62   0.000    -.0681479   -.0402605
                                           |
        c.xrdintensity#cL2.Numberofmergers |   .5036948   .0660602     7.62   0.000     .3742193    .6331704
                                           |
                                    ln_emp |   1.277409   .0202313    63.14   0.000     1.237756    1.317062
                                           |
                             YearEffective |
                                     2003  |   .0702216   .0138705     5.06   0.000     .0430359    .0974073
                                     2004  |   .0638197   .0138225     4.62   0.000      .036728    .0909113
                                     2005  |   .0238386    .014322     1.66   0.096    -.0042319    .0519091
                                     2006  |   .2299213   .0147869    15.55   0.000     .2009395    .2589031
                                     2007  |   .0705804   .0153488     4.60   0.000     .0404974    .1006634
                                     2008  |   .1167531   .0145884     8.00   0.000     .0881604    .1453458
                                     2009  |   .2626372    .013854    18.96   0.000      .235484    .2897905
                                     2010  |   .2741835   .0140068    19.57   0.000     .2467306    .3016364
                                           |
                    AcquirorPrimarySICCode |
                                     2836  |  -.2427185   .6881567    -0.35   0.724    -1.591481    1.106044
                                     3571  |   .0966308   1.071207     0.09   0.928    -2.002897    2.196159
                                     3572  |   .6370681   .8230904     0.77   0.439    -.9761593    2.250296
                                     3577  |   .3789265   .5995018     0.63   0.527    -.7960753    1.553928
                                     3661  |   .6311128   .5440341     1.16   0.246    -.4351745      1.6974
                                     3663  |   .6227413   .8210935     0.76   0.448    -.9865723    2.232055
                                     3669  |   .2808093   1.073805     0.26   0.794     -1.82381    2.385429
                                     3672  |  -5.520446    1.12872    -4.89   0.000    -7.732695   -3.308196
                                     3674  |    1.85173   .4994594     3.71   0.000     .8728075    2.830652
                                     3812  |  -.2203885   1.077981    -0.20   0.838    -2.333192    1.892415
                                     3826  |  -.0476256   1.088734    -0.04   0.965    -2.181505    2.086254
                                     3829  |    -1.7375   .8276111    -2.10   0.036    -3.359588   -.1154126
                                     3841  |   .0799186   .6633207     0.12   0.904    -1.220166    1.380003
                                     3845  |  -1.256542   .8694582    -1.45   0.148    -2.960649    .4475648
                                     4812  |   .6746673   1.075653     0.63   0.531    -1.433573    2.782908
                                     4813  |  -.5512695   .8908221    -0.62   0.536    -2.297249     1.19471
                                     7371  |    .003051   1.074686     0.00   0.998    -2.103295    2.109397
                                     7372  |   -.176467   .4689757    -0.38   0.707    -1.095643    .7427085
                                     7373  |   -1.58726   .6822282    -2.33   0.020    -2.924403   -.2501176
                                     7374  |   .5434656   1.074599     0.51   0.613    -1.562709     2.64964
                                     7375  |  -1.113326   .8753653    -1.27   0.203     -2.82901    .6023589
                                     7376  |  -.2579278   1.090177    -0.24   0.813    -2.394635    1.878779
                                     7379  |  -.3651181   1.074338    -0.34   0.734    -2.470781    1.740545
                                     8731  |   2.474713   1.070854     2.31   0.021     .3758772    4.573548
                                           |
                                     _cons |   .6952956   .4479684     1.55   0.121    -.1827063    1.573298
        -----------------------------------+----------------------------------------------------------------
                                  /lnalpha |  -.0580438   .1282502                     -.3094096     .193322
        -----------------------------------+----------------------------------------------------------------
                                     alpha |   .9436086    .121018                      .7338801    1.213273
        ----------------------------------------------------------------------------------------------------
        LR test of alpha=0: chibar2(01) = 2.7e+04              Prob >= chibar2 = 0.000
        Best regards,

        Chris
        Last edited by Christopher Weber; 06 Dec 2021, 12:39.



        • #5
          The cause has to do with your use of lag operators for Numberofmergers. In the first case, the second lag is not defined for PERMNO 90215 in 2003, because L2.Numberofmergers there refers to the year 2001, which is not present in the data; the lag is therefore missing and the observation is dropped from estimation. -fillin- creates observations for the missing year combinations, resulting in a non-missing value of the lag for that PERMNO, and the observation is then included. You should ask yourself whether you really should have data for that PERMNO for the years added by -fillin-.
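
          A minimal sketch of the mechanism, using a made-up three-year panel:

          Code:
          clear
          input long id int year byte x
          1 2002 0
          1 2003 0
          1 2004 0
          end
          xtset id year
          generate x_l2 = L2.x
          list id year x x_l2, clean
          * x_l2 is missing in 2002 and 2003 because the years 2000 and
          * 2001 are not in the data; any model that includes L2.x will
          * therefore drop those observations.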
