Partial-out variables with respect to multiple levels of fixed-effects for Poisson regression (Frisch-Waugh-Lovell for PPML)

Sergey Alexeev

Join Date: Oct 2016

Posts: 30
#1

28 Oct 2023, 05:45

Dear community,

I need to produce a jackknife variance estimator for a Poisson regression that implements a double-difference (difference-in-differences) design.

This is the command I run:

glm ///
Naloxone /// Outcome of interest
HN /// Treatment dummy
i.t c.t##i.mt /// Double difference with flexible trends
LogPoliceRate /// Controls
trpf physicianexam /// Controls
pharmacistverification requireid /// Controls
T_GS_HasLaw pdmp doctorshopping painclinic /// Controls
, ///
family(poisson) ///
vce(jackknife, cluster(st) idcluster(ST)) //

The jackknife fails to compute, i.e., every replication is reported as failed (a red x). I suspect that this is a computational issue driven by the large number of fixed effects. Therefore, I'd like to partial out the fixed effects (i.t c.t##i.mt) and run PPML without them. Section 3.4 of "Cluster-robust inference: A guide to empirical practice" by James G. MacKinnon and others suggests doing so for the jackknife variance in the context of linear regression.

For linear models, the Stata package hdfe trivialises this task:
http://scorreia.com/demo/hdfe.html
However, for Poisson regression, the implementation is unclear.

Stata's PPMLHDFE
http://scorreia.com/help/ppmlhdfe.html
as explained here
http://arxiv.org/abs/1903.01690
talks about Frisch-Waugh-Lovell for PPML, but I do not know exactly how to implement it.

For concreteness, consider the following:

Code:

sysuse auto, clear

* Benchmark | FWL Theorem for linear model
reghdfe ///
price ///
weight ///
length ///
, ///
a(turn trunk)

* Demean variables
hdfe ///
price ///
weight ///
length ///
, ///
a(turn trunk) ///
gen(RESID_)

* Same point estimates as in reghdfe without fixed effects | very nice
reg ///
RESID_price ///
RESID_weight ///
RESID_length ///
, ///
nocons

* Then the question is how to get residuals to estimate the following
glm ///
RESID_price ///
RESID_weight ///
RESID_length ///
, ///
family(poisson) //

* so that it gives the same point estimates as this
glm ///
price ///
weight ///
length ///
i.turn i.trunk ///
, ///
family(poisson) //

Thank you, and I hope everyone is having a fantastic day.

Warm regards,
Sergey Alexeev
https://alexeev.pw/

Last edited by Sergey Alexeev; 28 Oct 2023, 06:04. Reason: Added tags

Kind regards,
Sergey Alexeev | The University of Sydney
https://alexeev.pw/
Tags: Frisch-Waugh-Lovell, HDFE, Poisson regression, ppmlhdfe, residual
FernandoRios

Join Date: Apr 2014

Posts: 2408
#2

28 Oct 2023, 07:07

Hi Sergey,
The bottom line is that you can't partial out variables in nonlinear models; that only applies to linear regression.
With ppmlhdfe, adding fixed effects is possible because it won't suffer from the incidental parameters problem.
However, you can't pre-process the data.
Perhaps a better question is: why do you need the jackknife?
Perhaps you could do that using the score of the coefficients?
Sergey Alexeev

Join Date: Oct 2016

Posts: 30
#3

29 Oct 2023, 18:51

Dear Fernando,

Thank you for finding time to reply. Let me clarify.

1) The jackknife (CV3) is likely the best choice for cluster-robust inference with PPML.

CRVEs (cluster-robust variance estimators) are well understood for linear models. The consensus is that, for linear models, unbalanced clusters can be accommodated well using the wild cluster restricted (WCR) bootstrap. More on this consensus:
https://doi.org/10.1016/j.jeconom.2022.04.001
https://doi.org/10.3368/jhr.50.2.317

In my setting, I have a limited dependent variable and need PPML. I talked to Professor James G. MacKinnon about CRVE with PPML.

He is currently developing a paper that compares CRVE methods for probit/logit; it is not yet available for circulation (it was presented recently at the Canadian Economics Association Annual Meeting). His preliminary conclusion is that CV3 might be as good as WCR. He also shows that the common score-based adaptation of WCR to nonlinear models performs poorly. The score-based WCR:
https://doi.org/10.1515/2156-6674.1006

The poor performance of score-based WCR for nonlinear models is also foreshadowed and discussed in these:
https://doi.org/10.1177/1536867X19830877
https://doi.org/10.1080/00220388.2013.858122

The bottom line is that I need CV3 for PPML. This is where the literature is going and will eventually arrive.
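
For reference, a sketch of what CV3 computes, in my own notation (G clusters, \hat\beta the full-sample estimate, \hat\beta^{(g)} the estimate with cluster g omitted). This is the cluster jackknife centred at the full-sample estimate, which is how I understand the CV3 definition in the guide linked above:

```latex
\widehat{V}_{\mathrm{CV3}}(\hat\beta)
  = \frac{G-1}{G} \sum_{g=1}^{G}
    \bigl(\hat\beta^{(g)} - \hat\beta\bigr)
    \bigl(\hat\beta^{(g)} - \hat\beta\bigr)^{\top}
```

Nothing in this formula relies on linearity: it only needs the model re-estimated G times, once per omitted cluster, which is why it carries over to PPML as long as each re-estimation converges.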

2) The Frisch-Waugh-Lovell theorem for nonlinear models exists

There are at least two different ways to apply FWL to nonlinear models.
https://arxiv.org/abs/1903.01690
https://arxiv.org/abs/1707.01815

The first paper (the ppmlhdfe one) explains that REGHDFE and PPMLHDFE use similar routines. I don’t want to rewrite the material from the paper here, but you could read the paragraph that starts with “The fact that reghdfe…” or the whole paper, which is very well written. Perhaps you know all of it already, so please accept my apologies for my mansplaining.

Perhaps, however, as you said, FWL cannot be applied manually to PPML to remove the fixed effects in a single pre-processing step without some other modifications. I don't really know.
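
A sketch of how I read the FWL argument in the ppmlhdfe paper, in my own notation (X holds the regressors of interest, D the fixed-effect dummies with coefficients \alpha, r indexes iterations, and the division is element-wise). Poisson can be estimated by iteratively reweighted least squares (IRLS), and FWL applies to the weighted least-squares problem solved at each iteration:

```latex
\begin{aligned}
\eta^{(r)} &= X\beta^{(r)} + D\alpha^{(r)}, \qquad
\mu^{(r)} = \exp\bigl(\eta^{(r)}\bigr), \\
z^{(r)} &= \eta^{(r)} + \frac{y - \mu^{(r)}}{\mu^{(r)}}, \qquad
W^{(r)} = \operatorname{diag}\bigl(\mu^{(r)}\bigr), \\
\beta^{(r+1)} &= \bigl(\tilde X^{\top} W^{(r)} \tilde X\bigr)^{-1}
                 \tilde X^{\top} W^{(r)} \tilde z^{(r)}, \\
\eta^{(r+1)} &= z^{(r)} - \bigl(\tilde z^{(r)} - \tilde X \beta^{(r+1)}\bigr),
\end{aligned}
```

Here \tilde X and \tilde z^{(r)} are the residuals from W^{(r)}-weighted projections of X and z^{(r)} on D. Because the weights change at every iteration, the partialling out has to be redone inside each iteration. So there is no one-off RESID_ transformation of the data that a plain glm could then use, which seems consistent with what you said about pre-processing.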

3) I don’t know much about the score of a coefficient. Could you suggest modifications to my code in the original post so I could try them?

Is it something like this?
https://twitter.com/instrumenthull/s...69316010389516

I suspect the title of the topic has become misleading. My apologies to the community.

Thank you and warm regards
Sergey

PS. I followed you on Twitter. Looking forward to liking your tweets.


Sergey Alexeev

Join Date: Oct 2016
Posts: 30

10 Mar 2024, 16:37

Hey there,

This is how you do it for the Poisson model:

Code:

jackknife coef=_b[HN], cluster(ST) mse: ///
ppmlhdfe ///
Naloxone /// Outcome of interest
HN /// Treatment dummy
LogPoliceRate /// Controls
trpf physicianexam /// Controls
pharmacistverification requireid /// Controls
T_GS_HasLaw pdmp doctorshopping painclinic /// Controls
, ///
absorb(i.t c.t##i.mt) ///
eform

For the linear model, use reghdfe instead of ppmlhdfe.
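
For the toy example from the original post, a minimal sketch of the same idea on the auto data. It assumes ppmlhdfe and its dependencies (ftools, reghdfe) are installed; the scalar names b_glm and b_hdfe are mine, purely for illustration. Rather than residualising the data once, the fixed effects are absorbed inside the Poisson estimator, and the slope should match the dummy-variable glm up to convergence tolerance:

```stata
* Assumes: ssc install ftools / reghdfe / ppmlhdfe have been run beforehand
sysuse auto, clear

* Benchmark: Poisson with the fixed effects entered as explicit dummies
glm price weight length i.turn i.trunk, family(poisson)
scalar b_glm = _b[weight]

* Same model with the fixed effects absorbed instead of estimated as dummies
ppmlhdfe price weight length, absorb(turn trunk)
scalar b_hdfe = _b[weight]

* The two slopes on weight should agree up to numerical tolerance
display "glm: " b_glm "   ppmlhdfe: " b_hdfe
```

ppmlhdfe may report fewer observations than glm because it drops singleton groups by default; as far as I understand, those observations are fitted perfectly by their own fixed effect and so do not change the slope estimates.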
