  • #31
    Originally posted by Jeff Wooldridge
    Sorry for the delay. I'm attaching a link to the paper that proposed the following method. The paper is published in a book but it is behind a pay wall.

    The method is very simple, but you have to compute the proper standard errors. We did it via the panel bootstrap. Note that every variable is allowed to be correlated with the so-called fixed effect. This allows y2 to be correlated with idiosyncratic shocks, too. The z are assumed exogenous with respect to idiosyncratic shocks.

    Code:
    xtreg y2 z1 ... zJ zJp1 ... zM i.year, fe
    predict double v2h_fe, e
    xtpoisson y1 y2 v2h_fe z1 ... zJ i.year, fe vce(robust)
    The t statistic on v2h_fe is a valid test of the null that y2 is exogenous. The first FE estimation is the first stage, or reduced form, for the endogenous variable y2. The second is Poisson FE with a control function, v2h_fe.

    Incidentally, the test is always valid when you make it robust. But for the correction to be justified, the y2 variable should be roughly continuous, probably. If you decide to include the control function, you need to adjust the standard errors.

    Once people like you start to use this, it will really catch on. ;-)

    link_to_paper
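
A minimal sketch of the two steps quoted above, run on simulated data so that it is self-contained. Every variable name and parameter value in the simulation (id, year, c, z1, z2, u and the coefficients) is an assumption made for illustration, not something taken from the paper or the thread. The standard errors reported in the second step ignore the first-stage estimation, so they serve the exogeneity test but not inference once the control function is kept.

```stata
* Simulated panel; all variable names and parameter values are illustrative
clear all
set seed 12345
set obs 500
generate id = _n
generate c = rnormal()                 // unit heterogeneity (the "fixed effect")
expand 5
bysort id: generate year = _n
xtset id year

generate z1 = rnormal() + 0.5*c        // exogenous regressor kept in both stages
generate z2 = rnormal() + 0.5*c        // instrument excluded from the second stage
generate u  = rnormal()                // idiosyncratic shock that makes y2 endogenous
generate y2 = 0.5*z1 + z2 + c + u
generate y1 = rpoisson(exp(-1 + 0.3*y2 + 0.2*z1 + 0.3*c + 0.3*u))

* Step 1: linear FE reduced form for y2; keep the idiosyncratic residuals
xtreg y2 z1 z2 i.year, fe
predict double v2h_fe, e

* Step 2: FE Poisson with the control function v2h_fe added
xtpoisson y1 y2 v2h_fe z1 i.year, fe vce(robust)

* Robust test of H0: y2 is exogenous with respect to idiosyncratic shocks
test v2h_fe
```

Because u enters both y2 and y1 in this simulation, the test would be expected to reject; setting the coefficient on u in y1 to zero gives the exogenous case.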

    Dear Jeff Wooldridge

    Thanks for sharing those sound tips. Only one short question:
    How should I interpret the sign (+/-) and the (non)significance of the residuals (v2h_fe, in your example) in the second regression?

    Thanks in advance,

    Kind regards,

    Álvaro Zarzoso.


    • #32
      Álvaro: What it means is that allowing for spending to be correlated with heterogeneity appears to be enough in this application. In other words, there seems to be no remaining endogeneity (with respect to the idiosyncratic errors) once the Chamberlain-Mundlak device is used for the lavgexp variable.

      JW


      • #33
        Jeff: can this panel Poisson approach be used if y2 is binary (but time-varying) and is interacted with another, exogenous variable (X2)? If not, do you have suggestions for correcting the endogeneity of the y2 × X2 interaction?

        I found one of your older posts along the same lines: https://www.stata.com/statalist/arch.../msg00188.html

        Thank you!


        • #34
          Originally posted by Jeff Wooldridge
          Álvaro: What it means is that allowing for spending to be correlated with heterogeneity appears to be enough in this application. In other words, there seems to be no remaining endogeneity (with respect to the idiosyncratic errors) once the Chamberlain-Mundlak device is used for the lavgexp variable.

          JW
          Dear Prof. Jeff,

          I have a question similar to the one in this thread, though my situation might be harder. I have firm-year panel data. The dependent variable is a count, and the endogenous explanatory variable is also a count. The IV that I am considering is time-invariant. Is it appropriate to apply a control function with firm and year fixed effects? If not, do you have any recommendations? Thanks a lot!


          • #35
            Dear Jeff Wooldridge

            How would your #12 above, applying the Lin/Wooldridge approach with xtreg, fe, work when the data are indexed by, say, states and municipalities in each time period?

            Thank you!
            Best,

            Miguel


            • #36
              Dear All,
              I am trying to follow Prof. Jeff Wooldridge's advice to estimate a two-step fixed effects Poisson IV model. I am having some difficulty programming the bootstrap procedure to get the correct standard errors. I would be very grateful for help in correcting the following code:

              Code:
              .         capture program drop myboot
              
              .         program define myboot, rclass
                1.                 preserve
                2.                         bsample 100, cluster(id)
                3.                         * first stage
              .                         xtset id year
                4.                         xtreg LgrcxtotC i.year if (numbersibs>1), fe /*cluster(id)*/
                5.                         predict double LgrcxtotChat_fe, e
                6.                         * second stage                  
              .                         xtpoisson LgAneedc      Lgm      LgrcxtotChat_fe i.year if (numbersibs>1), fe vce(robust
              > ) 
                7.                         
              .                         return scalar bLgAneedc = _b[LgAneedc]
                8.                         return scalar bLgm = _b[Lgm]
                9.                         return scalar bLgrcxtotChat_fe = _b[LgrcxtotChat_fe]
               10.                         
              .                         return scalar seLgAneedc = _se[LgAneedc]
               11.                         return scalar seLgm = _se[Lgm]
               12.                         return scalar seLgrcxtotChat_fe = _se[LgrcxtotChat_fe]
               13.                         
              .                 restore
               14.         end
              
              . 
              .         bootstrap r(bLgAneedc) r(bLgm) r(bLgrcxtotChat_fe) r(seLgAneedc) r(seLgm) r(seLgrcxtotChat_fe), reps(500
              > ) seed(123) cluster(id) idcluster(newid): myboot
              (running myboot on estimation sample)
              repeated time values within panel
              an error occurred when bootstrap executed myboot
              r(451);
              I want a panel bootstrap, so in line 2 above I specify bsample 100, cluster(id). Yet I get an error stating "repeated time values within panel". I am presumably making some basic error without realizing it, and I would be very grateful for help with a solution.

              Many thanks in advance for any help you may be able to offer.
              Sincerely,
              Sumedha.
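
One possible way around the r(451) error above, sketched under stated assumptions. The error arises because a resampled panel can enter the data more than once under the same id, so xtset id year sees repeated years. A common pattern, which to my knowledge follows Stata's FAQ on bootstrapping with panel data, is to let the bootstrap prefix do the resampling through cluster() and idcluster(), and to xtset on the new identifier inside the program. The data below are simulated, and all names (cfboot, newid, y1, y2, z1, z2) are hypothetical; reps(50) is only there to keep the run short.

```stata
* Simulated panel; all names and values are illustrative, not from the post
clear all
set seed 12345
set obs 200
generate id = _n
generate c = rnormal()
expand 5
bysort id: generate year = _n
generate z1 = rnormal() + 0.5*c
generate z2 = rnormal() + 0.5*c
generate u  = rnormal()
generate y2 = 0.5*z1 + z2 + c + u
generate y1 = rpoisson(exp(-1 + 0.3*y2 + 0.2*z1 + 0.3*c + 0.3*u))

capture program drop cfboot
program define cfboot, rclass
    * No bsample, preserve, or restore here: the bootstrap prefix resamples.
    * newid gives each drawn copy of a panel its own identifier.
    xtset newid year
    capture drop v2h_fe
    xtreg y2 z1 z2 i.year, fe
    predict double v2h_fe, e
    xtpoisson y1 y2 v2h_fe z1 i.year, fe
    return scalar b_y2  = _b[y2]
    return scalar b_v2h = _b[v2h_fe]
end

* newid must already exist for the initial run on the original sample
generate newid = id
xtset newid year

bootstrap b_y2=r(b_y2) b_v2h=r(b_v2h), reps(50) seed(123) ///
    cluster(id) idcluster(newid): cfboot
```

The bootstrap standard errors are the standard deviations of b_y2 and b_v2h across replications, so the program does not need to return the second-stage standard errors.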



              • #37
                Dear Jeff Wooldridge,

                Would you have any suggestions as to how to tweak this approach when there is an interaction term present?
                My model is as follows:

                xtreg y x1 x2 x1#x2 controls i.time, fe

                where y is a count variable, and x1, x2 are both continuous.
                My instruments are z1 for my endogenous regressor x1, and z1#x2 for the interaction term "x1#x2"

                Would the following approach work?
                gen z_int = x1*x2
                gen z_iv = z1*x2

                xtreg x1 z1 x2 controls i.time, fe vce(robust)
                predict double v2h1_fe, e
                xtreg z_int z_iv x2 controls i.time, fe vce(robust)
                predict double v2h2_fe, e
                xtpoisson y x1 z_int v2h1_fe v2h2_fe x2 controls i.time, fe vce(robust)

                If not, what would be the best way to address this concern?

                Thanks!


                • #38
                  Hi Jeff Wooldridge,

                  I have the same question as Reeju. I have an interaction term in my model (the potentially endogenous variable is interacted with an independent variable). How can I deal with this?

                  My basic model is as follows:

                  Code:
                  ppmlhdfe Y X1 X1#X2 controls, a(importers HSsection) vce(cluster distance)
                  where Y is a continuous variable, and X1 is the potentially endogenous variable.

                  When I try to apply the test, I am not sure whether the interaction term should be included when testing the exogeneity of X1.

                  Do you have any solution for this?


                  • #39
                    Does someone have the paper mentioned by Prof. Wooldridge? The link is not working.


                    • #40
                      Mauricio Carvalho, I believe this is the paper Prof. Wooldridge was referring to. It is a 2019 paper by Lin and Wooldridge, included as a chapter in a book, that addresses nonlinear models with endogeneity.

                      I have a question that may be elementary, but it is crucial to my current project and relevant to this thread. Would it be problematic to have a binary (dummy) instrumental variable? The dummy IV I have passes the first stage, but it is then omitted due to collinearity in the Poisson FE with the control function.


                      • #41
                        Thank you very much for posting the link, Andrew!


                        I also have a question on this topic of PPML-IV with FE. I have been trying to implement PPML FE IV with the control function, and I am running into the "variance matrix is nonsymmetric or highly singular" problem. I don't know whether this is also elementary, but it occurs only in the second-stage ppmlhdfe, where the "v2hat" from the first stage is plugged in. That is, each estimation on its own works just fine, without the warning; I get the warning only when I combine them. If I understand older Statalist posts correctly, the cause might be fixed-effect categories (municipalities and/or time#sectors) that contain only one observation. However, I could not find any such categories.

                        My code is something like

                        Y = dependent variable (count)
                        X2 = EEV
                        Z = instrument
                        X1, X3 and X4 = control variables (all of them are continuous)

                        * First-stage
                        reghdfe X2 Z X1 X3 X4, absorb(municipality i.sector#i.year) res
                        predict double v2hat, r
                        * Second-stage
                        ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(municipality i.sector#i.year)

                        If I run, for instance,

                        * First-stage
                        reghdfe X2 Z X1 X3 X4, absorb(municipality) res
                        predict double v2hat, r
                        * Second-stage
                        ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(municipality)

                        or

                        * First-stage
                        reghdfe X2 Z X1 X3 X4, absorb(i.sector#i.year) res
                        predict double v2hat, r
                        * Second-stage
                        ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(i.sector#i.year)


                        It works fine. I get the warning only when I absorb both sets of FE.

                        Thank you all in advance!
                        Last edited by Mauricio Carvalho; 28 Feb 2023, 17:10.


                        • #42
                          Originally posted by Mauricio Carvalho
                          Thank you very much for posting the link, Andrew!


                          I also have a question following this topic of PPML-IV with FE. I have been trying to implement the PPML FE IV with the control function and I am having the "variance matrix is nonsymmetric or highly singular" problem. I don't know if it is also elementary, but it only occurs in the second-stage ppmlhdfe where the "v2hat" from the first stage is plugged in.
                          I wonder, does this same issue occur if you were to run the PPML estimation without the v2hat regressor? If so, then the issue may have something to do with the fixed effects you are absorbing and the other explanatory variables you are using. I.e. perhaps X3 is already being taken into account when you absorb both municipality and the interaction between sector and year. I would imagine this would just lead to X3 being omitted due to collinearity, however.

                          Those are my two cents, but hopefully someone with more expertise can help out.

                          P.S. When using the absorb option of the reghdfe and ppmlhdfe packages, I don't believe it is necessary to create dummies (using i.year, for example). It should work just as well with the variables themselves.


                          • #43
                            Hi, Andrew

                            Once more, thank you very much for your answer!

                            No, it does not occur when I run the PPML estimation without the v2hat regressor. It happens only when v2hat is plugged in. Anyway, I think you have a good point.
                            While looking for an answer, I have come to think that the explanation might be a theoretical one.

                            From the -reghdfe- help file:
                            Warning: in a FE panel regression, using robust will lead to inconsistent standard errors if, for every fixed effect, the other dimension is fixed. For instance, in a standard panel with individual and time fixed effects, we require both the number of individuals and periods to grow asymptotically. If that is not the case, an alternative may be to use clustered errors, which as discussed below will still have their own asymptotic requirements. For a discussion, see Stock and Watson, "Heteroskedasticity-robust standard errors for fixed-effects panel-data regression," Econometrica 76 (2008): 155-174.
                            I have only T=3. I am not sure, but I believe the same argument is valid when using -ppmlhdfe-.

                            P.S: Thank you!


                            • #44
                              Dear Jeff Wooldridge

                              I am very happy to have read your discussion, which has been very helpful for understanding my case.
                              I have a problem similar to Dante Donati's, but my dependent variable is a discrete count and my independent variable is endogenous.

                              Is there any proposed solution to the matter, Jeff Wooldridge?

                              Thank you so much for your consideration.

                              Best regards,
                              Tivea VORN


                              • #45
                                Thank you Jeff, this was extremely helpful to read. (Also thanks for your amazing research and textbooks, but let me get to the point...)

                                I have two questions that an actual econometrician is sure to find pretty ignorant, but here goes.

                                1. In the control function regression (xtpoisson), do we definitely still need to include the instrument? I've seen it done without. For example, these Cameron notes seem to omit it in their example, although it's included in the equation on page 17. https://cameron.econ.ucdavis.edu/nhh...ount_part2.pdf

                                2. What if y2 and v2h_fe are pretty highly correlated here, especially once we control for, say, year and firm fixed effects in a panel setting? In my case, the coefficient on y2 increases 4x (from .007 to .028), and the coefficient on v2h_fe is almost as large and negative (-.024). The correlation between y2 and v2h_fe is .79; when I run an OLS regression with y2 as the dependent variable, the R2 is .98; and the variance inflation factor (VIF), computed after the Poisson regression using "vif, uncentered" in Stata, is 296.

                                Isn't this problematic? And, what can/should be done?

                                Huge thanks in advance.
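
On question 2, one hedged diagnostic is to check how strongly the excluded instrument enters the first stage. Since v2h_fe is y2 minus its first-stage fit, the two will tend to move together when the instrument explains little of the within variation in y2. The sketch below uses simulated data; all names and values are assumptions for illustration, and it does not settle whether the collinearity described in the post is a problem.

```stata
* Simulated panel; all variable names and parameter values are illustrative
clear all
set seed 12345
set obs 500
generate id = _n
generate c = rnormal()
expand 5
bysort id: generate year = _n
xtset id year
generate z1 = rnormal() + 0.5*c
generate z2 = rnormal() + 0.5*c        // excluded instrument
generate y2 = 0.5*z1 + z2 + c + rnormal()

* First stage with cluster-robust standard errors
xtreg y2 z1 z2 i.year, fe vce(cluster id)
test z2                                // strength of the excluded instrument
predict double v2h_fe, e

* Raw correlation between the endogenous regressor and the control function
correlate y2 v2h_fe
```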
