Introducing mlsynth

Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#1

16 Jan 2025, 13:22

Hey everyone. For those of you who use Python and are interested in causal inference, you may be interested in my new Python library, mlsynth. mlsynth implements 15 individual estimators, only two of which are available in Stata at present (Forward DID, also written by me!!!, and generic synthetic control methods).

Lots of the methods in mlsynth use machine learning methods, such as k-means, matrix factorization, and penalized regression, to augment traditional causal inference estimators. They are suited for the high-dimensional setup where we have very many control units relative to our number of pre-treatment periods. At some point, I may write this as a Python-integrated Stata command and send it to SJ for publication, but the simplicity of use is such that this may not be necessary... Anyways, I linked to the documentation, which itself links to the GitHub repo, where you may install it from, should you like.

1 like
Daniel Schaefer

Join Date: Mar 2020

Posts: 814
#2

16 Jan 2025, 13:50

Thanks Jared, I'm going to forward this to a friend of mine who does some work at the intersection of causal inference and machine learning. Could you also recommend some materials that I can use to learn more about these methods? Even some kind of introduction to synthetic control would be useful.
1 like
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#3

18 Jan 2025, 20:16

Daniel Schaefer, let's see... Abadie's 2021 paper in JEL is always a great resource for SCM, as is his paper with Jaume Vives (his former RA), "Synthetic Controls in Action". Those two are the most foundational resources for SCM. For Forward DID, read Kathy's paper! I had to read it like 4 times to understand what the heck was going on (I even went through some of the proofs).

For the panel data approach class, read HCW and the fsPDA papers (I link to both in the documentation). HCW's paper, I think, is just so foundational because it provides justification for the broader class of artificial counterfactual methods (especially their emphasis on WHY we use the outcomes of our control units in the first place, Lemma 6 I think).

I don't mean to just give the answer of "Read the literature", but honestly that's sorta how I did it. I just had to take time to really read and try to understand some of the more basic "explainer" papers.

One resource that I actually think is fantastic, aside from the published academic literature, is Causal Inference for the Brave and True. The amount of material in the book is INSANE. It's written in Python, but everything is explained very clearly (especially the DID/SCM chapter).

For me, the moment I realized "Hey, lots of this stuff" (putting it politely) "is just sexed up OLS with constraints or decision criteria", it opened up the proverbial map, so to speak, and let me see lots of these methods as very interconnected, even though they are distinct. I don't know if that helps at all 😂 but I'm more than happy to go into more detail should you wish.
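
To make the "OLS with constraints" point a bit more concrete, here is a minimal sketch of the classic synthetic control weight problem: ordinary least squares of the treated unit's pre-treatment outcomes on the control units' outcomes, except the weights must be non-negative and sum to one. This is NOT mlsynth's API. The data are simulated, and the names `project_to_simplex`, `simplex_least_squares`, `Y0`, and `y1` are made up for illustration. The solver is plain projected gradient descent in NumPy, chosen only to keep the example dependency-light.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index meeting the condition
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def simplex_least_squares(Y0, y1, n_iter=5000):
    """Minimize 0.5 * ||y1 - Y0 @ w||^2 subject to w >= 0 and sum(w) = 1."""
    n_controls = Y0.shape[1]
    w = np.full(n_controls, 1.0 / n_controls)    # start from equal weights
    step = 1.0 / (np.linalg.norm(Y0, 2) ** 2)    # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = Y0.T @ (Y0 @ w - y1)
        w = project_to_simplex(w - step * grad)
    return w

# Simulated pre-treatment panel (hypothetical): 30 periods, 5 control units.
rng = np.random.default_rng(0)
Y0 = rng.normal(size=(30, 5))
true_w = np.array([0.6, 0.4, 0.0, 0.0, 0.0])
y1 = Y0 @ true_w + rng.normal(scale=0.05, size=30)

w_scm = simplex_least_squares(Y0, y1)              # constrained fit
w_ols = np.linalg.lstsq(Y0, y1, rcond=None)[0]     # unconstrained OLS, for comparison

print("constrained weights:", np.round(w_scm, 3))
print("plain OLS coefs:    ", np.round(w_ols, 3))
```

The only difference between the two fits is the feasible set for the coefficients. Swap the simplex constraint for a penalty or a selection rule and you get, roughly speaking, other members of the same family.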
1 like
Daniel Schaefer

Join Date: Mar 2020

Posts: 814
#4

21 Jan 2025, 11:16

Thanks Jared! Really great to hear directly from someone working at the cutting edge of causal inference methods!