Match/merge two databases with propensity scores

Casper Schouten

Join Date: Apr 2016
Posts: 11


05 Jun 2016, 09:54

Dear Statalisters,

In my current research, I am looking into differences in REIT performance caused by advisory style.
I am having trouble matching two datasets. The first time-series dataset consists of REITs that are externally advised, and the second consists of REITs that are internally advised.

I have data on the following company characteristics:
Tobin's Q (Q)
Age (Age)
Total Assets (TA)
Market Capitalization (MCAP)
Total debt/Total capitalization (Debt)
Funds from operations/total revenue (FFO)

I would like to merge and match the datasets on the company characteristics listed above using propensity scores. This way I get "comparable" companies with different advisory styles. This allows me to run a regression of Q on an advisory-style dummy (the treatment), controlling for the company characteristics:

Q=constant+b1Advisory+Age+TA+MCAP+Debt+FFO

When matching the two datasets, I also want to identify which companies are matched together; for example, company A is comparable with companies XY and XZ. I have searched Google and YouTube extensively, but I cannot figure out the right way to do this in Stata.

Does anyone know how to approach this in Stata? I've tried psmatch2, but it didn't really do what I described above.

Thanks for your help. I am happy to provide additional information if necessary.

Casper

Some example data:

Quarter  Company name  Total Assets (TA)  Market Capitalization (MCAP)  FFO (FFO)    Debt  Age  Property Type  Company Self-advised? Yes/No
2015Q4   company x     1447808            1961,340946                   15,46336554  6,21  12   Diversified    No
2015Q3   company x     1456701            1909,717304                   18,29745597  5,11  12   Diversified    No
2015Q2   company x     1417939            2093,54036                    21,26556017  4,82  12   Diversified    No
2015Q1   company x     1432872            2331,38697                    23,20020325  4,92  12   Diversified    No
2015Q4   company y     1418392            2232,326767                   21,90280561  4,99  8    Diversified    Yes
2015Q3   company y     1465437            1909,257746                   19,15522541  4,88  8    Diversified    Yes
2015Q2   company y     1449856            1886,586236                   19,61093418  4,71  8    Diversified    Yes
2015Q1   company y     1452043            1843,285694                   20,51079545  4,4   8    Diversified    Yes
2015Q4   company y     1454478            1685,04468                    18,33333333  4,5   8    Diversified    Yes


Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

06 Jun 2016, 11:24

A conventional merge won't work because the propensity scores are not identical. The user-written program rangejoin might work. There may be some other fuzzy matching routine that could do the merge the way you want, but I don't know which one.

By trying to do this with a merge, I think you are assuming you want the data in wide format - you want the firm and match on the same observation. Many of the estimators are designed for long format including the treatment estimators. It might be easier to just append the data and then use the treatment estimators.

Alternatively, if you really need both on the same line, you could append, then create separate variables for all the internally advised firms using a sort, lag variables, and if conditions. For example, sort by date and declining propensity score; then, for each externally advised firm, create an internally advised match from the first preceding observation ([_n-1], etc.) that is not externally advised. This might give you the data layout you want. It is inelegant but might work.

Phil
Comment
ashar ata

Join Date: Nov 2014

Posts: 29
#3

06 Jun 2016, 11:45

Casper, your question is not stated very clearly.

If the companies' performance is measured by Q, then you would write your psmatch2 code as below, matching the companies on their propensity to be self-advised and controlling for all other factors:

selfadvised (yes=1, no=0)
Age (Age)
Total Assets (TA)
Market Capitalization (MCAP)
Total debt/Total capitalization (Debt)
Funds from operations/total revenue (FFO)

psmatch2 selfadvised age ta mcap debt ffo, out(Q) caliper() neighbor(1) noreplacement

You would need to test for balance in the covariates that have been adjusted for, and decide on a caliper for the nearest neighbor you want to match a company with.
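
A balance check could look roughly like the sketch below. It assumes the user-written psmatch2 package (which also provides pstest) is installed, runs on simulated stand-in data rather than the REIT data, and uses a caliper of 0.05 purely as an illustration, not as a recommendation. Only age and debt are simulated to keep it short.

```stata
* Assumes psmatch2 (which ships pstest) is installed: ssc install psmatch2
* Simulated stand-in data: names follow the thread, values are made up.
clear
set seed 2016
set obs 200
gen age  = ceil(30*runiform())
gen debt = 10*runiform()
gen byte selfadvised = runiform() < invlogit(-1 + 0.05*age)
gen Q    = 1 + 0.2*selfadvised + 0.01*age + rnormal(0, 0.1)

* Put the data in random order before matching so ties are broken randomly
gen double u = runiform()
sort u

* Nearest-neighbor matching; the caliper value is illustrative only
psmatch2 selfadvised age debt, outcome(Q) neighbor(1) caliper(0.05) common

* Compare covariate means and standardized bias before and after matching
pstest age debt, both
```

If the bias after matching is still large for some covariate, tightening the caliper or changing the propensity model are the usual things to try.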

Hope this helps.
Ashar

Last edited by ashar ata; 06 Jun 2016, 12:17.
Comment
Sebastian Geiger

Join Date: Oct 2015

Posts: 124
#4

06 Jun 2016, 13:56

I'm not quite sure if I understand you correctly: Do you want to use the internally advised firms as the control group in the matching procedure? In this case, you should make sure that the variables have the same names and then use the -append- command. By doing so, you create one dataset for both firm types. As Ashar already suggested, then, you can run psmatch2 using the advisory dummy as the treatment variable.
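
As a minimal sketch of that step: the two files are stood in for by temporary files built from the first rows of company x and company y in the example data, and the dummy name selfadvised is taken from #3. With real data, the two `use`/`append using` arguments would be the actual .dta files.

```stata
* Toy stand-ins for the two REIT files (one row each, from the example data)
clear
input str10 company age ta
"company x" 12 1447808
end
gen byte selfadvised = 0      // externally advised = control group
tempfile external
save `external'

clear
input str10 company age ta
"company y" 8 1418392
end
gen byte selfadvised = 1      // internally (self-) advised = treated group
tempfile internal
save `internal'

* Stack both files into one long dataset with a common treatment dummy
use `external', clear
append using `internal'
tabulate selfadvised
```

Creating the dummy in each file before appending avoids having to work out afterwards which rows came from which file.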
Comment
Casper Schouten

Join Date: Apr 2016

Posts: 11
#5

06 Jun 2016, 14:21

Dear Phil and Ashar,

Thank you for your replies. I am sorry my question was not stated clearly. I will try to clarify.

As discussed, I would like to show whether advisory style (Advisory) is a significant explanatory variable for Tobin's Q (Q).
To do this I want to compare externally and internally advised companies. To "compare apples with apples", I would like to find comparable companies (based on company characteristics, using propensity scores) in the externally and internally advised company datasets.
In other words: based on the propensity score, internally advised company X is comparable to externally advised company Y.

I will put only the comparable companies in a new dataset, on which I will run a regression. The regression will be of Q on an advisory-style dummy (the treatment), controlling for the company characteristics:

Q=constant+b1Advisory+Age+TA+MCAP+Debt+FFO

My question is therefore: how do I use Stata to do a propensity score match of the external and internal companies based on the company characteristics mentioned above? Does Stata also show which companies are matched? In other words, can Stata tell me which internally advised company is comparable with which externally advised company?

No help is necessary on running the regression.

Thanks again for taking the time to help me.

Warm regards,

Casper
Comment
ashar ata

Join Date: Nov 2014

Posts: 29
#6

06 Jun 2016, 14:44

Originally posted by ashar ata

psmatch2 selfadvised age ta mcap debt ffo, out(Q) caliper() neighbor(1) noreplacement

The above code would calculate the propensity score and match each self-advised company to a similar externally advised company based on a regression (Stata will run the regression as part of this command). The resulting data will have new variables that show which company in one group was matched to which company in the other group.
In the output, Stata will then compare the mean Q for the matched samples.
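
To make the "which company was matched to which" part concrete, here is a hedged sketch on simulated data. It relies on the variables psmatch2 creates (_id, _n1, _treated, _pscore); the identifier id and all values are made up, and options such as the caliper and noreplacement are left out to keep it short.

```stata
* Assumes psmatch2 is installed: ssc install psmatch2
* Simulated stand-in data: names follow the thread, values are made up.
clear
set seed 2016
set obs 200
gen id   = _n                       // hypothetical company identifier
gen age  = ceil(30*runiform())
gen debt = 10*runiform()
gen byte selfadvised = runiform() < invlogit(-1 + 0.05*age)
gen Q    = 1 + 0.2*selfadvised + 0.01*age + rnormal(0, 0.1)

* Put the data in random order before matching so ties are broken randomly
gen double u = runiform()
sort u

* 1-to-1 nearest-neighbor matching on the propensity score
psmatch2 selfadvised age debt, outcome(Q) neighbor(1)

* _n1 holds the _id of the matched control, so after sorting on _id
* the match can be looked up by observation number
sort _id
gen matched_id = id[_n1] if _treated == 1 & !missing(_n1)
list id matched_id _pscore if !missing(matched_id), noobs
```

With real data, a company name variable can be looked up the same way as id here.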

Following is a very good article about propensity scores by Peter Austin
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144483/
Another good article with stata commands is below: http://www.bristol.ac.uk/cmm/softwar...rop-scores.pdf
Comment
Sebastian Geiger

Join Date: Oct 2015

Posts: 124
#7

06 Jun 2016, 14:49

Casper,

Considering your explanation in #5, I think my advice in #4 is correct. First, you need to create one dataset. If the variables are named identically, you can use the -append- command, which is pretty straightforward. Simply tell Stata:

Code:

append using "<name_of_the_other_dataset>.dta"

Then you can use psmatch2 to perform matching on the propensity score. If you use nearest-neighbor matching, psmatch2 will create a variable that shows which observations are matched. If you'd like to match within strata (e.g. match only observations from the same year), you need to use a loop, which may look similar to the example given in the help file of psmatch2:

Code:

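* att will hold the ATT estimated within each stratum
g att = .
* one group id per combination of the stratifying variables
egen g = group(groupvars)
levelsof g, local(gr)
qui foreach j of local gr {
    psmatch2 treatvar varlist if g==`j', out(outvar)
    replace att = r(att) if g==`j'
}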
Comment
