  • #16
    Here is the result:
    Code:
     boottest immi_sh
    
    Wild bootstrap-t, null imposed, 999 replications, Wald test, Rademacher weights:
      immi_sh
    
                          t(1514559) =   -17.9428
                            Prob>|t| =     0.0000
    
    95% confidence set for null hypothesis expression: [−.1136, −.09001]
    [Attached image: boot.png]

    Code:
     reghdfe ln_labor_productivity immi_sh share_9 share_12 share_uni  logsize lavg_firm_age lage, absorb(sector region year) cluster
    > (sector region)
    (MWFE estimator converged in 5 iterations)
    Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
    warning: missing F statistic; dropped variables due to collinearity or too few clusters
    
    HDFE Linear regression                            Number of obs   =  1,514,590
    Absorbing 3 HDFE groups                           F(   7,      7) =          .
    Statistics robust to heteroskedasticity           Prob > F        =          .
                                                      R-squared       =     0.1485
                                                      Adj R-squared   =     0.1484
    Number of clusters (sector)  =          8         Within R-sq.    =     0.0866
    Number of clusters (region)  =          8         Root MSE        =     0.8528
    
                               (Std. Err. adjusted for 8 clusters in sector region)
    -------------------------------------------------------------------------------
                  |               Robust
    ln_labor_pr~y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
          immi_sh |  -.1050354   .0567003    -1.85   0.106    -.2391103    .0290395
          share_9 |   .2070373   .0226414     9.14   0.000     .1534988    .2605757
         share_12 |   .4663512    .053889     8.65   0.000      .338924    .5937783
        share_uni |   .8243444    .096452     8.55   0.000     .5962717    1.052417
          logsize |   .1792899   .0254396     7.05   0.000     .1191349     .239445
    lavg_firm_age |   .0631174   .0113083     5.58   0.001     .0363775    .0898574
             lage |   .1139351   .0636955     1.79   0.117    -.0366809    .2645512
            _cons |   8.407244   .2552188    32.94   0.000     7.803747    9.010741
    -------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
          sector |         8           8           0    *|
          region |         8           8           0    *|
            year |        10           1           9     |
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    Clustering on sector and region does not produce significant results.
    Last edited by Paris Rira; 09 May 2024, 20:43.


    • #17
      I keep forgetting you can't run boottest after reghdfe, and you can't use xtreg since your ID is the firm. You'll have to use areg with absorb(sector) and include i.region and i.year as regressors.

      Have you tried to collapse to sector?
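A minimal sketch of the areg route suggested above, run on simulated toy data rather than the poster's file. The variable names mirror the thread, the data-generating values are invented for illustration, and boottest is a community-contributed command that is assumed to be installed (ssc install boottest).

```stata
* Toy data standing in for the firm panel (all values simulated)
clear
set seed 12345
set obs 2000
gen sector = ceil(8 * runiform())      // 8 sectors
gen region = ceil(8 * runiform())      // 8 regions
gen year   = 2009 + ceil(10 * runiform())
gen immi_sh = runiform()
gen ln_labor_productivity = -0.1 * immi_sh + 0.05 * sector + rnormal()

* Absorb sector, enter region and year as dummies, cluster on sector
areg ln_labor_productivity immi_sh i.region i.year, absorb(sector) vce(cluster sector)

* Wild cluster bootstrap test of H0: coefficient on immi_sh = 0
boottest immi_sh, reps(999) weighttype(rademacher) nograph
```

With only 8 clusters, Rademacher weights allow at most 2^8 = 256 distinct draws, so weighttype(webb) may be worth trying as a sensitivity check.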







      • #18
        Originally posted by George Ford
        I keep forgetting you can't boottest after reghdfe.
        That's correct. I ran it after regress, not after reghdfe.

        Originally posted by George Ford
        and you can't use xtreg since your ID is firm.
        If I generate a panel ID with g id = _n in order to use xtreg: first, is that correct? Second, even if it is correct, the coefficient becomes insignificant (the nightmare of all students, as you know).
        Clustering on sector gives the same outcome: no significant result.

        My case study is a small economy (Portugal), so I believe dropping the clustering and using only "reghdfe ..., absorb(region sector year) vce(robust)" would be sufficient.











        • #19
          Say you have firms that appear repeatedly, identified by a variable called firmname.

          Why g id = _n? That is just a running count of your observations, not a firm identifier. Use this instead:

          egen id = group(firmname)
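To illustrate the difference, here is a small sketch on made-up data. The firm codes, years, and the variable y are hypothetical; firmname is the placeholder name used in the post above.

```stata
* Toy panel with a string firm identifier (values are invented)
clear
input str4 firmname int year double y
"A001" 2010 1.2
"A001" 2011 1.4
"B002" 2010 0.9
"B002" 2011 1.1
"B002" 2012 1.0
end

gen running = _n              // 1,2,3,...: unique per observation, not per firm
egen id = group(firmname)     // same value on every row of the same firm

xtset id year                 // declare the panel: firm by year
list firmname year running id, sepby(id)
```

Because running takes a new value on every row, using it as the panel variable would treat each observation as its own firm, which is why id from egen group() is the one to pass to xtset.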


          • #20
            Prof. Ford, I ran xtreg without fe, though, because with fe the variable id is omitted due to collinearity.
            Code:
             egen id = group(NPC_FIC)
            (1 missing value generated)
            
            . 
            end of do-file
            
            . do "C:\Users\CeBER\AppData\Local\Temp\STD454c_000000.tmp"
            
            . xtreg ln_labor_productivity immi_sh share_9 share_12 share_uni  logsize lavg_firm_age lage i.sector i. region i.year id,vce (robust)
            (1 missing value generated)
            
            Random-effects GLS regression                   Number of obs     =  1,514,590
            Group variable: NPC_FIC                         Number of groups  =    288,156
            
            R-sq:                                           Obs per group:
                 within  = 0.0013                                         min =          1
                 between = 0.1346                                         avg =        5.3
                 overall = 0.1198                                         max =         10
            
                                                            Wald chi2(31)     =   36073.73
            corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
            
                                       (Std. Err. adjusted for 288,156 clusters in NPC_FIC)
            -------------------------------------------------------------------------------
                          |               Robust
            ln_labor_pr~y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            --------------+----------------------------------------------------------------
                  immi_sh |  -.0650435   .0084966    -7.66   0.000    -.0816966   -.0483904
                  share_9 |    .135753    .005852    23.20   0.000     .1242832    .1472228
                 share_12 |   .2311707     .00658    35.13   0.000     .2182742    .2440673
                share_uni |   .3962222   .0079694    49.72   0.000     .3806024     .411842
                  logsize |    .058803   .0015836    37.13   0.000     .0556993    .0619068
            lavg_firm_age |   .0863299   .0022868    37.75   0.000     .0818478     .090812
                     lage |   .0510312   .0077095     6.62   0.000     .0359209    .0661414
                          |
                   sector |
                       6  |   .3647818   .0051526    70.80   0.000     .3546829    .3748807
                       7  |   .0198841    .004658     4.27   0.000     .0107545    .0290137
                       9  |  -.4893518   .0058703   -83.36   0.000    -.5008574   -.4778462
                      10  |   .0453041   .0115703     3.92   0.000     .0226266    .0679815
                      11  |   .1865925   .0111949    16.67   0.000     .1646508    .2085342
                      12  |   .1199651   .0066864    17.94   0.000     .1068599    .1330703
                      13  |   .0263971   .0087363     3.02   0.003     .0092743    .0435199
                          |
                   region |
                       2  |   .0695653   .0040251    17.28   0.000     .0616763    .0774543
                       3  |   .1442682   .0042174    34.21   0.000     .1360022    .1525342
                       4  |   .0527479   .0076499     6.90   0.000     .0377544    .0677415
                       5  |   .0438303   .0069713     6.29   0.000     .0301668    .0574938
                       6  |   .0720837   .0123662     5.83   0.000     .0478465    .0963209
                       7  |  -.0012828   .0119015    -0.11   0.914    -.0246093    .0220437
                       8  |   .9237366   .4257328     2.17   0.030     .0893157    1.758157
                          |
                     year |
                    2011  |   -.061416   .0018981   -32.36   0.000    -.0651363   -.0576957
                    2012  |  -.1022061   .0022529   -45.37   0.000    -.1066217   -.0977905
                    2013  |  -.0759926   .0024264   -31.32   0.000    -.0807482    -.071237
                    2014  |  -.0881164    .002547   -34.60   0.000    -.0931085   -.0831244
                    2015  |   -.081666    .002628   -31.08   0.000    -.0868168   -.0765153
                    2016  |  -.0781794   .0027474   -28.46   0.000    -.0835643   -.0727945
                    2017  |  -.0899256    .002883   -31.19   0.000    -.0955762   -.0842751
                    2018  |  -.1217728   .0030893   -39.42   0.000    -.1278277    -.115718
                    2019  |  -.1479554   .0033137   -44.65   0.000    -.1544501   -.1414607
                          |
                       id |   7.04e-08   7.14e-09     9.86   0.000     5.64e-08    8.44e-08
                    _cons |    8.78296   .0319378   275.00   0.000     8.720363    8.845557
            --------------+----------------------------------------------------------------
                  sigma_u |  .77669793
                  sigma_e |  .54773912
                      rho |  .66785618   (fraction of variance due to u_i)
            -------------------------------------------------------------------------------
            
            . xtreg ln_labor_productivity immi_sh share_9 share_12 share_uni  logsize lavg_firm_age lage i.sector i. region i.year i.id,fe vce (rob
            > ust)
            maxvar too small
                You have attempted to use an interaction with too many levels or attempted to fit a model with too many variables.  You need to
                increase maxvar; it is currently 5000.  Use set maxvar; see help maxvar.
            
                If you are using factor variables and included an interaction that has lots of missing cells, try set emptycells drop to reduce
                the required matrix size; see help set emptycells.
            
                If you are using factor variables, you might have accidentally treated a continuous variable as a categorical, resulting in lots
                of categories.  Use the c. operator on such variables.
            r(907);
            
            end of do-file
            
            r(907);
            Also, has a PDF manual for ivreghdfe been published? I could not find one online, and I need to cite this command in my article.


            • #21
              You have id in the regression as a regressor.

              Try this:

              egen id = group(firmname)
              reghdfe ln_labor_productivity immi_sh share_9 share_12 share_uni logsize lavg_firm_age lage , absorb(id sector region year) vce(robust)

              I suspect sector may wash out due to the inclusion of id, but it should estimate
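For readers without reghdfe, a rough built-in equivalent is xtreg with the fe option, sketched here on simulated data. The panel dimensions and coefficients are invented for illustration, and clustering on the firm is an assumption of this sketch (the command above uses vce(robust)). The point is that id goes into xtset and is never listed as a regressor, which avoids the maxvar error shown earlier.

```stata
* Simulated panel: 300 firms observed for 4 years each
clear
set seed 42
set obs 300
gen id = _n
gen sector = ceil(8 * runiform())        // fixed within firm
gen a_i = rnormal()                      // unobserved firm effect
expand 4
bysort id: gen year = 2009 + _n
gen immi_sh = 0.3 * a_i + rnormal()
gen ln_labor_productivity = -0.1 * immi_sh + a_i + rnormal()

xtset id year

* fe already sweeps out firm effects; neither id nor i.id is a regressor
xtreg ln_labor_productivity immi_sh i.sector i.year, fe vce(cluster id)
```

In this toy setup the sector dummies are dropped as collinear with the firm effects, because no firm changes sector.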


              • #22
                Just because you want to include sector fixed effects does not mean you need to cluster by sector. As George said, 8 sectors is not enough to do much of anything. The key is: at what level is your key variable, immigrant share, or the instrumental variable, varying? Is immigrant share varying at the firm level? If so, then clustering at the firm level is probably sufficient. The real issue is whether you can get away with sector fixed effects or do you need firm FEs. You can do the Mundlak regression and test whether the time averages at the firm level are significant, using a cluster-robust test.


                • #23
                  Originally posted by Jeff Wooldridge
                  Is immigrant share varying at the firm level?
                  Dear Prof., thank you for getting back to me. Yes, it varies at the firm level.
                  Whether I need firm FEs is a good question indeed. My assumption is that, since firms do not move across sectors or regions over time, sector and region fixed effects will probably capture these effects, so there is no need for firm FEs.

                  Code:
                   mundlak  ln_labor_productivity immi_sh share_9 share_12 share_uni  logsize lavg_firm_age lage sector region year
                  
                  The variable region does not vary sufficiently within groups and will not be used to create additional regressors.
                  0% of the total variance in region is within groups.
                  
                  +------------------------------------------------+
                  |             Variable |     RE     |  Mundlak   |
                  |----------------------+------------+------------|
                  |              immi_sh |     -0.075 |     -0.006 |
                  |              share_9 |      0.130 |      0.070 |
                  |             share_12 |      0.245 |      0.078 |
                  |            share_uni |      0.465 |      0.086 |
                  |              logsize |      0.051 |     -0.066 |
                  |        lavg_firm_age |      0.070 |      0.094 |
                  |                 lage |      0.097 |      0.010 |
                  |               sector |     -0.010 |     -0.005 |
                  |               region |      0.010 |      0.013 |
                  |                 year |     -0.010 |     -0.012 |
                  |        mean__immi_sh |            |     -0.148 |
                  |        mean__share_9 |            |      0.148 |
                  |       mean__share_12 |            |      0.416 |
                  |      mean__share_uni |            |      0.858 |
                  |        mean__logsize |            |      0.252 |
                  |  mean__lavg_firm_age |            |      0.014 |
                  |           mean__lage |            |      0.283 |
                  |         mean__sector |            |     -0.015 |
                  |           mean__year |            |      0.052 |
                  |                _cons |     29.020 |    -72.032 |
                  |----------------------+------------+------------|
                  |                    N |    1514590 |    1514590 |
                  |                  N_g | 288156.000 | 288156.000 |
                  |                g_min |      1.000 |      1.000 |
                  |                g_avg |      5.256 |      5.256 |
                  |                g_max |     10.000 |     10.000 |
                  |                  rho |      0.689 |      0.689 |
                  |                 rmse |      0.551 |      0.546 |
                  |                 chi2 |  14815.116 |  42487.877 |
                  |                    p |      0.000 |      0.000 |
                  |                 df_m |     10.000 |     19.000 |
                  |                sigma |      0.984 |      0.984 |
                  |              sigma_u |      0.817 |      0.817 |
                  |              sigma_e |      0.549 |      0.549 |
                  |                 r2_w |      0.000 |      0.003 |
                  |                 r2_o |      0.081 |      0.111 |
                  |                 r2_b |      0.079 |      0.116 |
                  +------------------------------------------------+


                  • #24
                    Originally posted by Jeff Wooldridge
                    You can do the Mundlak regression and test whether the time averages at the firm level are significant, using a cluster-robust test.
                    Prof., I don't understand this part.
                    What exactly should I do based on the Mundlak result above? Moreover, in theory, firm FEs should be in the model to control for unobserved heterogeneity at the firm level. I do not include them mainly because they make the result insignificant (I know that is not a sound reason). So I am trying to find a way either to defend my choice or to add firm FEs without losing significance.


                    • #25
                      That command is not very useful if it doesn’t provide standard errors. Also, you need i.sector, i.region, and i.year. After including all controls properly, you do a test on the averages. See my 2023 paper with Papke in Empirical Economics.


                      • #26
                        Prof. Jeff, I really like your paper. "When analyzing firm-level panel data, removing unobserved heterogeneity at a higher level of aggregation might suffice. In situations where firms are nested within sectors, addressing sector-level heterogeneity could adequately ensure the exogeneity of explanatory variables (Papke and Wooldridge, 2023)."

                        I believe that by citing this paper I can justify omitting firm fixed effects, since fixed effects at the aggregated (sector) level do the job.


                        • #27
                          It is intended for these situations, but to push the idea that aggregated FEs are enough, you really should do the test. It's not so hard. See here:

                          https://www.dropbox.com/sh/g5okcahdj...mWg5joUba?dl=0


                          • #28
                            My situation is this: I have firm-level data categorized into 8 sectors and 8 regions. At the aggregated level, sector fixed effects might absorb the heterogeneity; please correct me if I am wrong.

                            Thanks for sharing the files. What exactly should I do with my own data based on these files? Which one performs the test: meap94_98, simulation_20221011, or simulation_power_20221011? I am not an expert, Prof.


                            • #29
                              Could you please help me conclude this thread? Based on this result, may I conclude that sector fixed effects are sufficient? Much appreciated.

                              My analysis is at the firm level, assessing the impact of immigrants on labour productivity, but I would like to use only sector fixed effects. To this end, I ran a Mundlak regression (Mundlak, Y. (1978). On the pooling of time series and cross-section data. Econometrica, 46, 69-85). Here is the result.
                              Code:
                              bysort sector: egen mean__immi_sh = mean(immi_sh)
                              bysort sector: egen mean__share_9 = mean( share_9)
                              bysort sector: egen mean__share_12= mean(share_uni)
                              bysort sector: egen mean__share_uni = mean(logsize)
                              bysort sector: egen mean__logsize = mean(lavg_firm_age)
                              bysort sector: egen mean__lavg_firm_age= mean(lavg_firm_age)
                              bysort sector: egen mean__lage  = mean(lage)
                              bysort sector: egen  mean__year = mean(year)
                              
                              xtset sector
                              
                              qui xtreg ln_labor_productivity immi_sh share_9 share_12 share_uni  logsize lavg_firm_age lage i.year i.region mean__immi_sh mean__share_9  mean__share_12 mean__share_uni  mean__logsize    mean__lavg_firm_age mean__lage mean__year, vce(cluster sector)
                              estimates store mundlak
                               test mean__immi_sh mean__share_9  mean__share_12 mean__share_uni  mean__logsize mean__lavg_firm_age mean__lage mean__year
                              
                               ( 1)  mean__immi_sh = 0
                               ( 2)  mean__share_9 = 0
                               ( 3)  mean__share_12 = 0
                               ( 4)  mean__share_uni = 0
                               ( 5)  mean__logsize = 0
                               ( 6)  o.mean__lavg_firm_age = 0
                               ( 7)  mean__lage = 0
                               ( 8)  o.mean__year = 0
                                     Constraint 6 dropped
                                     Constraint 8 dropped
                              
                                         chi2(  6) = 1713.65
                                       Prob > chi2 =    0.0000


                              • #30
                                You haven’t implemented it properly. You include the sector and region dummies, and probably interact the sector and region. The averages are computed by firm, not sector.
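A hedged sketch of what the corrected procedure might look like, on simulated data rather than the poster's file. Only two regressors are used to keep it short, the data-generating values are invented, and pooled OLS with firm-clustered standard errors is one possible way to run the regression, not necessarily the exact specification in the paper. The remaining controls from the thread would enter the same way, each with its own firm-level average.

```stata
* Simulated panel: 500 firms x 5 years, each firm fixed in one sector and region
clear
set seed 2024
set obs 500
gen id = _n
gen sector = ceil(8 * runiform())
gen region = ceil(8 * runiform())
gen a_i = rnormal()                       // unobserved firm effect
expand 5
bysort id: gen year = 2009 + _n
gen immi_sh = 0.3 * a_i + rnormal()       // regressor correlated with a_i
gen logsize = rnormal()
gen ln_labor_productivity = -0.1 * immi_sh + 0.2 * logsize + a_i + rnormal()

* Time averages are taken within firm (id), not within sector
foreach v of varlist immi_sh logsize {
    bysort id: egen mean__`v' = mean(`v')
}

* Sector-by-region and year dummies plus the firm averages, clustered by firm
regress ln_labor_productivity immi_sh logsize mean__immi_sh mean__logsize ///
    i.sector##i.region i.year, vce(cluster id)

* Cluster-robust Wald test: are the firm averages jointly zero?
test mean__immi_sh mean__logsize
```

If the averages are jointly significant, the sector and region effects are not absorbing the firm-level heterogeneity, which points toward firm FEs. If they are not, that supports the argument that the aggregated FEs suffice.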
