
  • Suggested resources on implementing and checking crosswalks?

    Hi forum users,

    Is anyone aware of useful pedagogical resources with which I could practice implementing a crosswalk for a large panel in different ways, as well as QA-ing the codes? (E.g., Indonesian district proliferation: re-basing to earlier boundaries by collapsing, adding, reweighting duplicates, etc.)

    I'm struggling to find any comprehensive resources or examples, including in others' papers and public data. In papers, the crosswalk step seems to be buried before the analysis starts, and online and textbook guides tend to treat each command separately rather than explain how it all fits together (though it is probably not too difficult to figure out).

    Grateful for any advice, and apologies for the potentially rudimentary question,

    Kind regards,

    Ryan


  • #2
    Not a rudimentary question; rather, a really important one for anyone who uses survey or other data that have been used by others before them. But won't answers to questions of the sort you raise inevitably be data-source specific? Moreover, a "cross-walk" implies aligning "from" something "to" something, so one would need to know the nature of the benchmarks (target and source) in order to comment: changing the benchmark changes the nature of the cross-walk. All this suggests that you need to be more explicit about the nature of your data source (Indonesian Family Life panel survey -- I'm guessing!!) and the nature of the cross-walk required.



    • #3
      Thanks for the prompt reply, Stephen.

      My base data set is the relatively new public one from the World Bank, DAPOER, which is available together with the crosswalk (http://data.worldbank.org/data-catal...nomic-research). I just want to be able to re-base comfortably to earlier years. As I understand it, this is just a merge followed by careful handling of the duplicates (e.g., they are 'denominated' differently).

      I'm merging it with information from other, less tidy data (e.g., Susenas, Podes, prices, geological variables; all aggregated up to the district level, though). So, to start with, I was just after some resources that might help me identify and practice the commands I'd need to master in order to problem-solve my way around the many different district codes more efficiently than I currently do. Hence the lack of specifics above, sorry.

      Cheers
      Ryan
      Last edited by Ryan B. Edwards; 08 Dec 2014, 16:17.



      • #4
        The Missouri Census Data Center has a tool called MABLE/Geocorr for linking different types of geographic identifiers for different time periods and different levels of data aggregation. They may have some general resources for working with geographic identifiers: http://mcdc.missouri.edu/websas/geocorr12.html

        Also, you could try the geography page of the US Census Bureau: http://www.census.gov/geo/index.html

        Mike



        • #5
          Hi all,

          In case anyone comes across similar issues, I thought it might be worthwhile closing this thread off. It turns out it is quite easy, at least in my case.

          To rebase districts (or other units in 'changing panels') to their original boundaries, it is just a matter of having 'parent' identifiers, or old codes, for every unit, then using the collapse command appropriately. For level variables such as land area and population, you would 'sum'. For climate variables recorded as district means, you might wish to 'mean' collapse, weighting by area. And for socioeconomic characteristics such as poverty rates or mean household expenditure, you may wish to 'mean' collapse, weighting by population. If you do these collapses separately, the results can be put back together by merging on parent id and time period.
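For anyone who wants to see the arithmetic outside Stata, here is a minimal Python sketch of the three kinds of collapse described above (sum for levels, area-weighted mean for climate, population-weighted mean for socioeconomic rates). The district rows, variable names, and the `collapse_to_parents` helper are all made up for illustration; they are not taken from DAPOER or any real crosswalk.

```python
from collections import defaultdict

# Hypothetical child-district rows (illustrative values only):
# (parent_id, year, area_km2, population, poverty_rate, rainfall_mm)
rows = [
    ("A", 2005, 100.0, 1000, 0.20, 2000.0),
    ("A", 2005, 300.0, 3000, 0.40, 1000.0),
    ("B", 2005, 50.0, 500, 0.10, 1500.0),
]

def collapse_to_parents(rows):
    """Aggregate child districts to (parent_id, year) cells."""
    acc = defaultdict(lambda: {"area": 0.0, "pop": 0,
                               "pov_x_pop": 0.0, "rain_x_area": 0.0})
    for parent, year, area, pop, pov, rain in rows:
        cell = acc[(parent, year)]
        cell["area"] += area                 # level variable: sum
        cell["pop"] += pop                   # level variable: sum
        cell["pov_x_pop"] += pov * pop       # numerator of population-weighted mean
        cell["rain_x_area"] += rain * area   # numerator of area-weighted mean
    out = {}
    for key, c in acc.items():
        # Assumes every cell has positive total population and area.
        out[key] = {
            "area_km2": c["area"],
            "population": c["pop"],
            "poverty_rate": c["pov_x_pop"] / c["pop"],
            "rainfall_mm": c["rain_x_area"] / c["area"],
        }
    return out

collapsed = collapse_to_parents(rows)
```

A useful QA check after any such collapse is that the totals of the summed variables (here area and population) are unchanged between the child-level and parent-level data.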

          To generate unique identifiers, all the parents need to be recoded into babies at the time of the splits (such that, for example, the "USSR" turns into "Russia" in an international panel). Fortunately, someone had already done this, which saved me from my convoluted approach of generating a split dummy for every split, then recoding all units with different ids and dummies as new ones. There are many ways to do this if you have detailed data on the splits and boundary changes, but it's beyond the scope of my original question, so I'll leave it there.
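If the 'parent' identifiers are not supplied and you only have a list of splits, one possible way to derive them is to follow each child code back through its chain of splits to the code that existed in the base year. The sketch below assumes each new district is carved out of exactly one parent (districts formed from parts of several parents would need weights instead); the codes, years, and the `build_crosswalk` helper are hypothetical.

```python
# Hypothetical split records (illustrative codes and years only):
# (child_code, parent_code, split_year)
splits = [
    ("1102", "1101", 2002),
    ("1103", "1102", 2007),  # a child of a child
    ("1202", "1201", 2003),
]

def build_crosswalk(splits, base_year):
    """Map every code created after base_year to its base-year ancestor."""
    # Only splits after the base year need to be undone.
    parent_of = {child: parent
                 for child, parent, year in splits if year > base_year}

    def root(code):
        seen = set()
        while code in parent_of:
            if code in seen:
                raise ValueError(f"cycle in split records at {code}")
            seen.add(code)
            code = parent_of[code]
        return code

    return {child: root(child) for child in parent_of}

crosswalk = build_crosswalk(splits, base_year=2000)

def to_base(code):
    # Codes that already existed in the base year map to themselves.
    return crosswalk.get(code, code)
```

Merging `to_base(code)` onto the panel would then give the parent id to collapse on.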

          Cheers
          Ryan

