Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Difficulty merging panel datasets

    I have two datasets that I want to merge. One dataset contains the variables: state, year1, year2, year3, etc. The second data set has the following variables: year, categorical variable, state1, state2, state3, state4, etc. So the first dataset has one variable that describes the state for each observation and the other dataset has a variable for each state. What is the most effiecient way for me to create a state variable for the second dataset so that I can merge the two datasets by state?

  • #2
    I'm going to assume that in the first data set the variables are not really named year1, year2, etc., but something like u2000, u2001, u2002... Similarly I'll assume that in the second data set you have not variables named state1, state2, etc. but ALABAMA, ARKANSAS, CONNECTICUT, etc. and that these variables contain information about those states in the observation's year that might warrant having an informative name. For now I'll use the name v for that. And what you want to end up with is a state-year panel data set.

    So dataset1 looks like this:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str1 state float(u2000 u2001 u2002 u2003 u2004 u2005 u2006 u2007 u2008 u2009 u2010)
    "A"  .9472316   .9484983  .20113812   .7800556  .2258395    .912768    .587661   .40200615  .5508492  .4575412  .5950673
    "B" .05222337  .11216265   .9874877 .015391795  .7590426   .7012697  .58383894    .2779583  .4874784  .4056266  .8800637
    "C"  .9743183   .4809064   .2985383    .574749  .9605365 .017440306   .6502236    .2021227  .7211111 .10792727  .9440667
    "D"  .9457484   .9763448   .8969765   .6551434  .8826448   .7283185    .421985    .3003853  .6848539  .7559011  .3617445
    "E" .18564783  .12549753  .08119465  .53569984 .32789275   .1590071   .7049119    .9302015  .3321862  .9501313    .29709
    "F"  .9487334   .7655026    .660214    .399902  .3513183  .24487182 .071344234    .3215277  .6343603  .1730719 .17099117
    "G"  .8825376 .035859343   .6283849    .356696  .3247132   .3063067   .9762383  .006623633  .8705296 .15291964  .8399968
    "H"  .9440776  .07023593 .019561933   .7333733 .05225784  .10421273    .470753    .6270677  .4043443  .9149532  .3256108
    "I" .08942585  .21017867  .20679154   .5795121  .7021965  .20672686   .9566074 .0089331865 .12353604  .6809552  .6692844
    "J"  .7505445   .6616006   .8097933   .8314545  .8886811   .8505134    .058898    .3961149  .4398758  .3250927  .2747885
    end

    And dataset2 looks like this:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(year A B C D E F G H I J)
    2000  .42323855  .7526946  .5482864   .458869 .11324778  .4706297 .09931891     .8159919  .6922931  .9379712
    2001  .14427297   .627162  .8825152  .9604624  .4035936 .02750407 .11855792 .00028821192  .8599574  .8324136
    2002   .7595631     .6639  .3109555  .4984015  .7056643  .4280561  .7919037     .3804336 .13414234 .03983431
    2003   .5321585 .18422593 .25469995  .8726584  .5731504  .6700398    .06659    .12547621  .4723527  .3430923
    2004  .20126833 .28977874  .6928065 .04213694 .09911993  .4825758  .8575507     .8984238  .4374269  .4476398
    2005   .9941382 .54863846  .1083876  .8171654   .992569 .25556746  .8894401     .6826897  .8328871  .7155131
    2006   .4353405 .58717006  .3790731  .3534296  .3708564  .4384699  .5142699   .003448891 .10340178 .06749756
    2007 .021285385 .24046357  .3164629  .7544023  .6126292 .31582925  .3910225     .8822002  .4531727  .1154926
    2008   .7446451  .2239969  .8641324   .766519  .8929192  .0935955   .372983     .5584789 .19332026  .2044689
    2009   .0918318  .5621675 .14159387  .4196481 .56029755 .02711774  .6078497     .7529196  .5459082  .5137122
    end
    Then to make them compatible for -merge-ing and combine them into a panel data set, you would do this:
    Code:
    use dataset2, clear
    rename (A-J) v=
    reshape long v, i(year) j(state) string
    tempfile holding
    save `holding'
    
    use dataset1, clear
    reshape long u, i(state) j(year)
    merge 1:1 state year using `holding'
    Now, if this is not what your data sets actually look like, the code may fail. If that happens, please post back, and instead of trying to describe your data sets (almost always a difficult, if not hopeless, task) show example data of both using the -dataex- command, as I have done here. -dataex- is the most effective way to show example data on Statalist.

    If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.


    Comment

    Working...
    X