  • Partial-out variables with respect to multiple levels of fixed-effects for Poisson regression (Frisch-Waugh-Lovell for PPML)

    Dear community,

    I need to produce a jackknife variance estimator for a Poisson regression that implements a double-difference (difference-in-differences) design.

    This is the command I run:

    glm ///
    Naloxone /// Outcome of interest
    HN /// Treatment dummy
    i.t c.t##i.mt /// Double difference with flexible trends
    LogPoliceRate /// Controls
    trpf physicianexam /// Controls
    pharmacistverification requireid /// Controls
    T_GS_HasLaw pdmp doctorshopping painclinic /// Controls
    , ///
    family(poisson) ///
    vce(jackknife, cluster(st) idcluster(ST)) //

    The jackknife fails to compute, i.e., every replication is marked with a red cross. I suspect that this is a computational issue driven by the large number of fixed effects. Therefore, I'd like to partial out the fixed effects (i.t c.t##i.mt) and run PPML without them. Section 3.4 of "Cluster-robust inference: A guide to empirical practice" by James G. MacKinnon and co-authors suggests doing so for the jackknife variance in the context of linear regression.

    For linear models, Stata's package HDFE trivialises this task
    http://scorreia.com/demo/hdfe.html
    However, for Poisson regression, the implementation is unclear.

    Stata's PPMLHDFE
    http://scorreia.com/help/ppmlhdfe.html
    as explained here
    http://arxiv.org/abs/1903.01690
    talks about Frisch-Waugh-Lovell for PPML, but I do not know how to implement it exactly.

    For concreteness, consider the following:

    Code:
    sysuse auto, clear
    * Benchmark | FWL Theorem for linear model
    reghdfe ///
    price ///
    weight ///
    length ///
    , ///
    a(turn trunk)
     
    * Demean variables
    hdfe ///
    price ///
    weight ///
    length ///
    , ///
    a(turn trunk) ///
     gen(RESID_)
     
    * Same point estimates as in reghdfe without fixed effects | very nice
    reg ///
    RESID_price ///
    RESID_weight ///
    RESID_length ///
    , ///
    nocons
     
    * Then the question is how to get residuals to estimate the following
    glm ///
    RESID_price ///
    RESID_weight ///
    RESID_length ///
    , ///
    family(poisson) //
     
    * so that it gives the same point estimates as this
    glm ///
    price ///
    weight ///
    length ///
    i.turn i.trunk ///
    , ///
    family(poisson) //
    Thank you, and I hope everyone is having a fantastic day.

    Warm regards,
    Sergey Alexeev
    https://alexeev.pw/
    Last edited by Sergey Alexeev; 28 Oct 2023, 06:04. Reason: Added tags
    Kind regards,
    Sergey Alexeev | ​The University of Sydney
    https://alexeev.pw/

  • #2
    Hi Sergey
    The bottom line is that you can't partial out variables in nonlinear models; that only applies to linear regression.
    With ppmlhdfe, adding fixed effects is possible because it won't suffer from the incidental parameters problem.
    However, you can't pre-process the data.
    Perhaps a better question is: why do you need the jackknife?
    Perhaps you can do that using the score of the coefficients?

    • #3
      Originally posted by FernandoRios:
      Hi Sergey
      The bottom line is that you can't partial out variables in nonlinear models; that only applies to linear regression.
      With ppmlhdfe, adding fixed effects is possible because it won't suffer from the incidental parameters problem.
      However, you can't pre-process the data.
      Perhaps a better question is: why do you need the jackknife?
      Perhaps you can do that using the score of the coefficients?

      Dear Fernando,

      Thank you for finding time to reply. Let me clarify.

      1) The jackknife (CV3) is likely the best choice for cluster-robust inference with PPML.

      CRVEs (cluster-robust variance estimators) are well understood for linear models. The consensus is that, for linear models, unbalanced clusters can be handled well using the wild cluster restricted (WCR) bootstrap. More on the consensus:
      https://doi.org/10.1016/j.jeconom.2022.04.001
      https://doi.org/10.3368/jhr.50.2.317

      In my setting, I have a limited dependent variable and need PPML. I talked to Professor James G. MacKinnon about CRVE with PPML.

      He is currently developing a paper that compares CRVE methods for probit/logit, which is not yet available for circulation (it was recently presented at the Canadian Economics Association Annual Meeting). His preliminary conclusion is that CV3 might be as good as WCR. He also shows that a common adaptation of WCR to nonlinear models, based on scores, performs poorly. The score-based WCR:
      https://doi.org/10.1515/2156-6674.1006

      The poor performance of score-based WCR for nonlinear models is also foreshadowed and discussed in these:
      https://doi.org/10.1177/1536867X19830877
      https://doi.org/10.1080/00220388.2013.858122

      The bottom line is that I need CV3 for PPML. This is where the literature is going and will eventually arrive.
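
      For readers unfamiliar with the label, the cluster jackknife variance usually called CV3 can be sketched as follows. The notation is shorthand introduced here, not quoted from the papers linked above: G is the number of clusters, \hat\beta is the full-sample estimate, and \hat\beta^{(g)} is the estimate obtained after deleting cluster g.

```latex
\widehat{V}_{\mathrm{CV3}}
  = \frac{G-1}{G} \sum_{g=1}^{G}
    \bigl(\hat\beta^{(g)} - \hat\beta\bigr)
    \bigl(\hat\beta^{(g)} - \hat\beta\bigr)^{\top}
```

      As far as I can tell, this centring on the full-sample estimate is what the mse option of Stata's jackknife prefix requests when combined with cluster(); without mse, the deviations are taken from the mean of the delete-one estimates instead.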

      2) The Frisch-Waugh-Lovell theorem for nonlinear models exists

      There are at least two different ways to apply FWL to nonlinear models.
      https://arxiv.org/abs/1903.01690
      https://arxiv.org/abs/1707.01815

      The latter explains that REGHDFE and PPMLHDFE use similar routines. I don’t want to rewrite the stuff from the paper here, but you could read a paragraph that starts with “The fact that reghdfe…” or the whole paper, which is very well written. Perhaps you know all of it, so please accept my apologies for my mansplaining.

      Perhaps, however, as you said, FWL cannot be applied manually to PPML to avoid the fixed effects directly without some other modifications. I don't really know.
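
      One way to see what does and does not carry over is a small check on the auto data from the original post. This is only a sketch under stated assumptions: it needs the user-written reghdfe, and the names mu_hat, xb_hat, z, b_pois and b_wls are made up for illustration. It fits the Poisson model with dummies, builds the IRLS working response from the converged fit, and then runs a weighted within regression with the fitted means as weights.

```stata
* Sketch only: requires the user-written reghdfe (ssc install reghdfe).
sysuse auto, clear

* 1) Benchmark: Poisson with the fixed effects entered as dummies
glm price weight length i.turn i.trunk, family(poisson)
scalar b_pois = _b[weight]

* 2) Quantities taken from the converged fit
predict double mu_hat, mu                                  // fitted mean
predict double xb_hat, xb                                  // linear predictor
generate double z = xb_hat + (price - mu_hat) / mu_hat     // IRLS working response

* 3) Weighted within regression: absorb the fixed effects, weights = mu_hat
reghdfe z weight length [aweight = mu_hat], absorb(turn trunk)
scalar b_wls = _b[weight]

display "Poisson: " b_pois "   weighted FWL at convergence: " b_wls
```

      The two slopes should agree up to convergence tolerance, because the Poisson estimate is a fixed point of this weighted least-squares step. The catch is that z and the weights come from the converged fit, so they cannot be constructed before estimation; my understanding of the ppmlhdfe paper is that the partialling out is redone with updated weights inside each IRLS iteration, which is why a one-off hdfe call cannot replace it.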

      3) I don’t know much about the score of the coefficients. Could you suggest modifications to my code in the original post so I could try them?

      Is it something like this?
      https://twitter.com/instrumenthull/s...69316010389516

      I suspect the title of the topic has become misleading. My apologies to the community.

      Thank you and warm regards
      Sergey

      PS. I followed you on Twitter. Looking forward to liking your tweets.
      Kind regards,
      Sergey Alexeev | ​The University of Sydney
      https://alexeev.pw/

      • #4
        Hey there,

        This is how to do it for the Poisson model:

        Code:
        jackknife coef=_b[HN], cluster(ST) mse: ///
        ppmlhdfe ///
        Naloxone /// Outcome of interest
        HN /// Treatment dummy
        LogPoliceRate /// Controls
        trpf physicianexam /// Controls
        pharmacistverification requireid /// Controls
        T_GS_HasLaw pdmp doctorshopping painclinic /// Controls
        , ///
        absorb(i.t c.t##i.mt) ///
        eform

        For the linear model, use reghdfe instead of ppmlhdfe
        Kind regards,
        Sergey Alexeev | ​The University of Sydney
        https://alexeev.pw/
