Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Assigning a code to individuals by relating one of their variables to an outside source

    Hello,

    I'm working with ACS data in Stata 13, but I have access to Stata 14 if necessary.

    ACS provides the variable MIGPUMA1 (where the respondent lived last year), which is one or many PUMAs combined. I also have the PUMA variable (where they live now). MIGPUMA1 and PUMA sometimes have the same codes, but usually their codes are different. I want to compare these areas using the larger unit of analysis, MIGPUMA1. The only "crosswalk" provided is an Excel sheet that has the PUMA code in column E and the MIGPUMA1 code in column F.

    Is there a way to create a variable, let's say MIGPUMA0 (the larger unit of analysis which would specify where the person lives now), using this Excel sheet and my existing individual level data without having to write 2081 if statements? (There are 2081 PUMA codes).

    So far, all I can think of would be something like the following, but I know the coding is wrong. This is just the best way I can think to illustrate what I want:
    Code:
    gen migpuma0=(columnF) if puma==(columnE)
    But I don't know what the code would actually be or how I would direct Stata to look at the Excel sheet (or the sheet as a secondary data file in Stata).

    Thank you for any assistance you can offer.

  • #2
    Please read the forum FAQ for excellent advice about how to post questions in ways that are likely to draw a timely and helpfl response. Among the things you will learn there:

    1. This is a multidisciplinary forum. Many, perhaps most members, will not know what the ACS is. Use no jargon; speak only in terms that would be understood by any well educated adult who knows statistics.

    2. When asking for help with code it is essential to show example data. And the best way to do that is to use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Those issues aside, read -help import excel- to learn how to get Stata to import data from Excel.

    Comment


    • #3
      Welcome to Statalist.

      Let me start with some general advice.

      Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

      The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

      Now, to your question. I can tell you that you are not going to direct Stata to look in the Excel sheet. That's the hard way. You are going to use the import excel command (or interactively use the Import item on Stata's File menu) to import the data from the Excel worksheet into Stata and then save it to your hard drive for your use.

      Once you've done that, you can post a sample of the dataset here to make it clear. I'll be honest, I found a MIGPUMS to PUMA crosswalk with Google, but the codes were not in columns E and F, and were more complicated than your pseudo-code suggest. But maybe your crosswalk was prepared by someone else and is better. Please be sure to use the dataex command to show your example data. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex to read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      When asking for help with code, always show example data. When showing example data, always use dataex. And in your reading of the FAQ will have explained why attaching the Excel workbook to a post will not be helpful.

      You should also post a sample of observations of your ACS data - just whatever identifiers it contains, including the PUMA - so we can understand how that data is stored. If the ACS data has the PUMA as a number and the import from Excel creates a string variable, we need to know that.

      Ultimately, you will want to use a merge command to join the crosswalk data onto your ACS data. You will join observations matching the PUMA in your ACS data to the PUMA in the crosswalk, and that will give you the MIGPUMA1 variable.

      Now, I'll close with one more piece of advice. Importing data from Excel and merging data are fairly basic Stata features, and because they were not in what you wrote in your post, I think it's possible you're relatively new to Stata. If so, I'm sympathetic to you as a new user of Stata - it's a lot to absorb. And even worse if perhaps you are under pressure to produce some output quickly. Nevertheless, I'd like to encourage you to take a step back from your immediate tasks.

      When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

      All of these manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu. The objective in doing the reading was not so much to master Stata as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

      Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.
      Added in edit: it's embarrassing when I post an unattributed exact quote from Clyde and it crosses with him saying it himeself. What can I say - it's so good it immediately went into my FQA (Frequent Question Answers) file.
      Last edited by William Lisowski; 27 Jun 2018, 15:58.

      Comment

      Working...
      X