  • sum of specific observations in panel data

    Dear statalists,

    I have data on bilateral trade between a reporter and a partner country. In particular, I have the exports from the reporter to the partner for the years 2000-2005. I have uploaded a snapshot that illustrates what my data look like. I want to find the total trade flow between each pair of countries. In other words, I want to sum the exports of one pair (e.g. albania argentina) and the exports of the reversed pair (argentina albania). I know that for a single year we can use:
    . generate first = cond(reporter < partner, reporter, partner)
    . generate second = cond(reporter < partner, partner, reporter)
    . egen total = total(trade), by(first second)

    However, now I have many years and I do not know how to do this.

    Regards,
    Andrew

  • #2
    You almost got it. Once you have created the two variables you just described (omitting the egen step), you only have to collapse your data. Remember that collapse replaces the dataset in memory, so you might want to save before you use it. Here's the code:
    Code:
    generate first = cond(reporter < partner, reporter, partner)
    generate second = cond(reporter < partner, partner, reporter)
    collapse (sum) lnexports, by(first second year)
    Hope this helps.
    Carlos
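The steps in this reply can be tried on a small made-up dataset. The following is only a sketch: the country names and values are invented, and it assumes that reporter and partner are string variables and that trade holds exports in levels rather than logs. preserve and restore are used here as one way of keeping the original data, since collapse overwrites what is in memory.

```stata
* Toy data, invented for illustration only
clear
input str10 reporter str10 partner int year double trade
"albania"   "argentina" 2000 10
"argentina" "albania"   2000  5
"albania"   "argentina" 2001 20
"argentina" "albania"   2001  7
end

* Direction-free pair key: the alphabetically smaller name goes first
generate first  = cond(reporter < partner, reporter, partner)
generate second = cond(reporter < partner, partner, reporter)

* collapse replaces the data in memory, so keep a copy to return to
preserve
collapse (sum) total = trade, by(first second year)
list, noobs
restore
```

With these toy values the collapsed data hold one row per pair and year, with totals of 15 for 2000 and 27 for 2001.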


    • #3
      Thanks a lot... it does work. However, I was wondering whether I could use egen total = total(trade), by(first second year) instead. I tried it and it works as well. So my question is whether you agree or disagree with this command. I know that egen will repeat the total for both directions of each pair. However, my model has some variables, such as government effectiveness, that enter as a characteristic of the importer. Hence, the regression will have to run over all the pairs. Does this make any sense?

      Andrew
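For comparison, here is a sketch of the egen variant asked about here, on the same kind of made-up data (names and values are invented; trade is assumed to be in levels). Unlike collapse, egen keeps every original row and writes the pair-year total onto both directions of the pair.

```stata
* Toy data, invented for illustration only
clear
input str10 reporter str10 partner int year double trade
"albania"   "argentina" 2000 10
"argentina" "albania"   2000  5
"albania"   "argentina" 2001 20
"argentina" "albania"   2001  7
end

generate first  = cond(reporter < partner, reporter, partner)
generate second = cond(reporter < partner, partner, reporter)

* The total is repeated on both rows of each pair within a year
egen total = total(trade), by(first second year)
list reporter partner year trade total, sepby(year) noobs
```

All four rows remain, so importer-specific variables stay attached to their own rows, but each total appears twice.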


      • #4
        Just realized something. You want to sum the exports on pairs but your export variable is transformed into a logarithm. So you are doing something like this (because of logarithms' properties):
        log(xi) + log(yi) = log(xi*yi)
        I don't think that is your true intention, is it? You'll have to do the calculation with the untransformed variable if you want to keep that methodology. As for your second question, to me it doesn't make sense: you'll be running a regression in which every value of the new variable measuring the trade flow between pairs is duplicated.

        Hope this helps.
        Carlos
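A quick numerical check of the point about logarithms, using two invented export values (x and y are hypothetical names, not variables from the thread):

```stata
* Two made-up export values for one pair of countries
clear
input double(x y)
10 5
end

* Summing the logged values gives ln(x*y), not the log of total trade
generate double sum_of_logs = ln(x) + ln(y)
generate double log_of_sum  = ln(x + y)

display "sum of logs = " sum_of_logs[1] "   log of sum = " log_of_sum[1]
```

Here the sum of logs is ln(50), while the log of total trade is ln(15), so the totals should be built from levels first and logged afterwards if a log is needed.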


        • #5
          I realised that as well, so I did the calculation without the log transformation. As for the second question, thanks a lot for the advice. What you suggest does make sense.

          Andrew


          • #6
            Could I also ask whether there is any way to get rid of the duplicate observations in the current dataset? Again, I have a dataset with the reporter, the partner, the total trade between them and many other variables for a certain year. As I have already constructed the dataset with the egen total = total(trade), by(first second) command, I would like to avoid a series of merges of different data. Therefore, could I somehow drop the duplicates, or do I have to start from scratch?

            Andrew


            • #7
              The exact code critically depends on which observations you want to keep. For example, bys first second (year) : keep if _n == 1 will keep the first year of each country pair. If you want to keep another year you will need to change the code accordingly.
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------
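For the single-year case described in #6, one possible alternative to the by-group approach is egen's tag() function, which marks exactly one observation per distinct pair. This is only a sketch on invented data; which of the two directions is kept depends on the order of the data, so any reporter-specific or partner-specific variables on the surviving row refer to that direction only.

```stata
* Toy single-year data, invented for illustration only
clear
input str10 reporter str10 partner double trade
"albania"   "argentina" 10
"argentina" "albania"    5
"albania"   "brazil"     3
"brazil"    "albania"    4
end

generate first  = cond(reporter < partner, reporter, partner)
generate second = cond(reporter < partner, partner, reporter)
egen total = total(trade), by(first second)

* tag() flags one observation in each (first, second) group
egen byte tag = tag(first second)
keep if tag
drop tag
list first second total, noobs
```

After this, first and second uniquely identify the remaining observations, which can be confirmed with isid first second.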


              • #8
                It's a while back in the thread, but it's puzzling that you post an Excel spreadsheet to a Stata forum. I will just say that it's inaccurate to guess that all Stata users are able and willing to use MS Excel to look at their data. A directly readable Stata dataset is the natural standard here. (I know about import excel, but that's not the point.)
