sum of specific observations in panel data

Andrew

Join Date: Apr 2014

Posts: 5
#1

07 Apr 2014, 14:22

Dear statalists,

I have data on bilateral trade between the reporter and partner. In particular, I have the exports from the reporter to the partner for the years 2000-2005. I have uploaded a snapshot that illustrates what my data look like. I want to find the total trade flows between each pair of countries. In other words, I want to sum the exports from one pair (e.g. albania argentina) and the exports from the reversed pair (argentina albania). I know that for a specific year we can use:
. generate first = cond(reporter < partner, reporter, partner)
. generate second = cond(reporter < partner, partner, reporter)
. egen total = total(trade), by(first second)
However, now I have many years and I do not know how to do this.

Regards,
Andrew
Attached Files

statalist.xlsx (486.3 KB, 1 view)
Carlos Avellaneda

Join Date: Mar 2014

Posts: 23
#2

07 Apr 2014, 14:49

You almost got it. Once you have created the two variables you just described (omitting the egen part), you just have to collapse your data, adding year to the by() list so that each year is summed separately. Do remember that collapse completely changes your dataset, so you might want to save before you use it. Here's the code:

Code:

generate first = cond(reporter < partner, reporter, partner)
generate second = cond(reporter < partner, partner, reporter)
collapse (sum) lnexports, by(first second year)
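To see what the collapse does, here is a minimal sketch on made-up data. The values and the variable name trade (exports in levels, not logs) are assumptions for illustration only, not taken from the attached file.

```stata
* Toy data: two directions of one country pair, two years (values are invented)
clear
input str10 reporter str10 partner int year double trade
"albania"   "argentina" 2000 10
"argentina" "albania"   2000  4
"albania"   "argentina" 2001 12
"argentina" "albania"   2001  5
end

* Order-independent pair identifiers: strings compare alphabetically
generate first  = cond(reporter < partner, reporter, partner)
generate second = cond(reporter < partner, partner, reporter)

* One row per unordered pair and year; trade becomes the two-way total
collapse (sum) trade, by(first second year)
list, noobs
```

After the collapse there is one observation per pair and year, and the reporter and partner variables are gone, which is why saving the original data first matters.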

Hope this helps.
Carlos
Andrew

Join Date: Apr 2014

Posts: 5
#3

07 Apr 2014, 15:42

Thanks a lot...it works indeed. However, I was wondering whether I could use egen total = total(trade), by(first second year) instead. I tried it and it works as well. So my question is whether you agree or disagree with this command. I know that egen will repeat the total for both directions of each pair. However, in my model I have some variables, such as government effectiveness, that enter as characteristics of the importer. Hence, the regression will have to run over all the pairs. Does this make any sense?

Andrew
Carlos Avellaneda

Join Date: Mar 2014

Posts: 23
#4

07 Apr 2014, 16:55

Just realized something. You want to sum the exports on pairs but your export variable is transformed into a logarithm. So you are doing something like this (because of logarithms' properties):
log(x_i) + log(y_i) = log(x_i*y_i)
I don't think that is your true intention, is it? You'll have to do the calculation with the untransformed variable if you still want to keep that methodology. As for your second question, to me it doesn't make sense...you'll be running a regression in which every value of the new variable measuring the trade flow between pairs appears twice.
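A quick check of the logarithm point, using two invented export values (x and y are hypothetical names, as are the numbers):

```stata
* Two made-up export flows for one country pair
clear
input double(x y)
100 50
end

generate sum_of_logs = ln(x) + ln(y)   // what summing the logged variable gives
generate log_of_prod = ln(x*y)         // identical to sum_of_logs
generate log_of_sum  = ln(x + y)       // log of the total two-way trade

list sum_of_logs log_of_prod log_of_sum, noobs
```

The first two columns agree and the third differs, so a total built from the logged variable is not the log of total trade.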

Hope this helps.
Carlos
Andrew

Join Date: Apr 2014

Posts: 5
#5

07 Apr 2014, 17:07

I realised that as well, so I did the calculation without the log transformation. As for the second question, thanks a lot for the advice. What you suggest does make sense.

Andrew
Andrew

Join Date: Apr 2014

Posts: 5
#6

07 Apr 2014, 17:17

Could I also ask if there is any way to get rid of the duplicate observations in the current dataset? I have another dataset with the reporter, the partner, the total trade between them, and many other variables for a certain year. As I have already constructed the dataset with the egen total = total(trade), by(first second) command, I would like to avoid a series of merges of different data. Therefore, could I somehow drop the duplicates, or do I have to start from scratch?

Andrew
Maarten Buis

Join Date: Mar 2014

Posts: 3404
#7

08 Apr 2014, 03:43

The exact code critically depends on which observations you want to keep. For example, bys first second (year) : keep if _n == 1 will keep the first year of each country pair. If you want to keep another year you will need to change the code accordingly.
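As one possible variation, here is a sketch that keeps a single observation per pair and year rather than per pair. The data are invented, and the choice to keep the row whose reporter comes first alphabetically is only an assumption; any reporter-specific or importer-specific variables on the dropped row are lost.

```stata
* Toy data with both directions of one pair in two years (values are invented)
clear
input str10 reporter str10 partner int year double trade
"albania"   "argentina" 2000 10
"argentina" "albania"   2000  4
"albania"   "argentina" 2001 12
"argentina" "albania"   2001  5
end

generate first  = cond(reporter < partner, reporter, partner)
generate second = cond(reporter < partner, partner, reporter)
egen total = total(trade), by(first second year)

* Sort within each pair-year by reporter so the kept row is well defined,
* then keep only the first row of each group
bysort first second year (reporter): keep if _n == 1
list first second year total, noobs
```

Putting reporter in parentheses only controls the sort order within each group; changing it changes which direction of the pair survives.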

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Nick Cox

Join Date: Mar 2014

Posts: 35207
#8

08 Apr 2014, 04:51

It's a while back in the thread, but it's puzzling that you post an Excel spreadsheet to a Stata forum. I will just say that it's inaccurate to guess that all Stata users are able and willing to use MS Excel to look at their data. A directly readable Stata dataset is the natural standard here. (I know about import excel, but that's not the point.)