xtpoisson with fixed effects

Venkat Jamuni

Join Date: Oct 2019

Posts: 3
#1

xtpoisson with fixed effects

25 Oct 2019, 20:29

Hello all,

I have a panel dataset of 3 million observations, with each observation detailing the annual number of prescriptions and patients for a physician for a given year for certain classes of medications. The years of available data are 2013-2017. There is also some physician demographic information including gender, specialty, years in practice, state, and number of group practice members.
For example, an observation may say "Dr. John Smith, year 2013, M, orthopedic surgery, 100 opioid prescriptions, 200 patients". The next row may say "Dr. John Smith, year 2014, M, orthopedic surgery, 80 opioid prescriptions, 170 patients". The code I used to define the panel data is

Code:

xtset npi year

(where npi is the provider id)

I am interested in using the xtpoisson command to look at the within provider change in opioid prescribing over time, with number of patients (bene_count) seen per provider per year as the exposure. The code that I have used is:

Code:

xtpoisson opioid_claim_count i.year i.provider_sex i.provider_state i.specialty i.years_experience i.group_members, exposure(bene_count) fe vce(bootstrap, reps(200)) irr

Code:

opioid_claim_count | IRR Std. Err. z P>|z| [95% Conf. Interval] -------------------+---------------------------------------------------------------- year | 2014 | .9545114 .0002171 -204.70 0.000 .954086 .954937 2015 | .8897576 .0002035 -510.75 0.000 .8893589 .8901565 2016 | .8511887 .0001957 -700.96 0.000 .8508053 .8515723 2017 | .7981606 .0001862 -966.50 0.000 .7977958 .7985256 ln(bene_count) | 1 (exposure) ------------------------------------------------------------------------------------

For the purpose of brevity, I did not copy and paste the results from the categorical variables, although all were omitted from the regression (except for provider_state) as they are time invariant (i.e. gender, specialty, etc).

I just wanted to confirm a few things:

1) Is it OK to use year as the main independent variable with total patients per year seen (bene_count) as the exposure to look at within provider change in opioid prescribing (dependent variable) each year?

2) If so, is the correct interpretation to say something like "opioid prescriptions relative to total number of patients decreased by ~ 5% each year per provider"

3) Is there a good way to use the margins command to get an accurate number of opioid prescriptions per provider? I've tried using the following code after the xtpoisson:

Code:

margins year, atmeans ------------------------------------------------------------------------------ | Delta-method | Margin Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- year | 2013 | 4.801408 .0134106 358.03 0.000 4.775124 4.827692 2014 | 4.754852 .013413 354.50 0.000 4.728563 4.781141 2015 | 4.684602 .013413 349.26 0.000 4.658313 4.710891 2016 | 4.640287 .013413 345.95 0.000 4.613998 4.666576 2017 | 4.575963 .0134132 341.15 0.000 4.549673 4.602252 ------------------------------------------------------------------------------

But I'm not sure what to do with these margin numbers nor how to interpret them. Are they ln transformed? I've read some other threads on this forum that using margins after fixed effects to calculate a dependent count variable can be somewhat inaccurate. I'm happy to just use the IRR if this is the case.

Thank you!

Venkat
Tags: None
Joao Santos Silva

Join Date: Apr 2014

Posts: 2963
#2

26 Oct 2019, 02:39

Dear Venkat Jamuni,

Starting from the bottom; margins after xtpoisson with FE produces meaningless results. I have been promised that Stata would disable margins after xtpoisson and xtlogit with FE but so far that has not been done.

I never work with IIR, so I will not comment on your second question.

Finally, I would use log of bene_count as a regressor rather than imposing that it has a coefficient of 1.

Best wishes,

Joao
PS: you should use robust standard errors, not bootstraped.

Last edited by Joao Santos Silva; 26 Oct 2019, 02:40. Reason: Added ps.
Comment

Venkat Jamuni

Join Date: Oct 2019
Posts: 3

26 Oct 2019, 06:31

Thank you very much Joao for your prompt and helpful response. I change up my code as per your suggestion and decided to use coefficients instead of IRR, which now reads

Code:

xtpoisson opioid_claim_count i.year i.provider_sex i.provider_state i.specialty i.years_experience i.group_members ln_bene_count, fe vce(robust)

with the result being

Code:

------------------------------------------------------------------------------------
                   |               Robust
opioid_claim_count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
              year |
             2014  |  -.0535191   .0038214   -14.01   0.000    -.0610089   -.0460294
             2015  |  -.1220029   .0049468   -24.66   0.000    -.1316986   -.1123073
             2016  |  -.1692856    .005982   -28.30   0.000      -.18101   -.1575612
             2017  |  -.2395579   .0064522   -37.13   0.000     -.252204   -.2269118

Would the correct interpretation of this result be "in 2014, there was a decrease in opioid prescriptions of 5.3 per 100 patients PER PROVIDER compared to 2013. In 2015 there was a decrease in opioid prescriptions of 12.2 per 100 patients PER PROVIDER compared to 2013." and so on?

And just for my education, what is the advantage of using ln_bene_count as a regressor rather than as an exposure?

Last edited by Venkat Jamuni; 26 Oct 2019, 06:33.

Comment

Joao Santos Silva

Join Date: Apr 2014

Posts: 2963
#4

26 Oct 2019, 09:26

Dear Venkat Jamuni,

Using ln_bene_count as a regressor rather than as an exposure does not force its coefficient to be 1 and therefore is more flexible.

I am not sure if I fully understand your set-up, but I would say that the interpretation is that, keeping everything else constant, from 2013 to 2014 there was a change of 100(exp(-.0535191)-1)% = -5.211216% in the mean number of opioid prescriptions. I believe that the reference to "per 100 patients" is not needed and I do not understand why you say that the change is per provider, but I may be missing something.

Best wishes,

Joao
Comment
Venkat Jamuni

Join Date: Oct 2019

Posts: 3
#5

26 Oct 2019, 15:01

Thank you Joao that makes a lot of sense. The only reason why I included the terminology per provider is because everything I have been reading about fixed effects models is that they allow you to examine the change within entities in panel data as opposed to across entities. In this case the entity being physician.

Furthermore, the reason why I thought the interpretation should be relative to number of patients is because when you remove ln_bene_count as a regressor, the results are very different.

Code:

------------------------------------------------------------------------------------ | Robust opioid_claim_count | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------------+---------------------------------------------------------------- year | 2014 | .0228594 .0014565 15.70 0.000 .0200047 .025714 2015 | .0130477 .0023081 5.65 0.000 .008524 .0175714 2016 | .0093051 .002801 3.32 0.001 .0038152 .014795 2017 | -.025449 .003365 -7.56 0.000 -.0320442 -.0188538

The other option I was entertaining was merely summing the total number of opioid prescriptions / total number of patients across all providers each year to get the overall prescribing rate. This would be another way to examine the change in opioid prescribing over time, but I was more interested in looking at average yearly change in opioid prescriptions per provider, which was my thinking behind the fixed effects model. Is this rationale correct?

Last edited by Venkat Jamuni; 26 Oct 2019, 15:10.
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2963
#6

27 Oct 2019, 08:45

Dear Venkat Jamuni,

Yes, if you remove the number of patients the results would be different; my sentence "keeping everything else constant" implies that you keep fixed the number of patients, the provider time-invariant characteristics, and any other regressor in your model. For clarity, you should perhaps be more explicit about what is being controlled for than I was.

Because your regressor of interest only varies over time, and most other characteristics do not vary over time, your second approach may lead to results that are not very different form what you get with fixed effects. It is worth trying it and maybe you can share your results in this forum.

Best wishes,

Joao
Comment

Announcement