
  • obtaining correct absolute values

    Hello, I am using IPUMS US Census microdata. According to IPUMS:
    “In 1980, responses to questions about migration were coded for only half the persons included in the IPUMS. These cases provide accurate proportional distributions but not correct absolute numbers for the general population. For correct absolute numbers, users should select cases coded as 2 in MIGSAMP and multiply by 2 as well as by PERWT.”

    I wish to convert the variable as suggested by IPUMS because all other years are 1 in 100 samples for MIGPLAC5 and I want consistency throughout the dataset. It seems straightforward enough, but I’ve run into a hiccup. I have saved a version of my dataset with only the 1980 cases that were coded as 2 in MIGSAMP (sample below). All PERWT weights for 1980 equal 100. But I can’t figure out the code that produces the correct results. I have tried:
    gen migplac80= migplac5*2*perwt
    The frequencies appeared to be the same as the original variable, which is not what I expected.
    gen migplac80=migsamp2*2*perwt
    All cases equaled 400, which also doesn’t seem correct.

    I’m guessing gen won’t work in this case because the variable holds observations and the function is likely canceling the values out or something. But I’m not sure how else to achieve the goal.
    Also, at this point the unit of analysis is the individual level. I would greatly appreciate any suggestions for how to tackle this problem. Thank you!

    Jennifer Hearst


    . dataex year sample serial statefip perwt migplac5 migsamp

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year long(sample serial) byte statefip int(perwt migplac5) byte migsamp
    1980 198001 3298260 39 100 1 2
    1980 198001 4156660 48 100 1 2
    1980 198001 1693095 21 100 1 2
    1980 198001   63565  1 100 1 2
    1980 198001   57075  1 100 1 2
    1980 198001   19160  1 100 1 2
    1980 198001   48395  1 100 1 2
    1980 198001   69120  1 100 1 2
    1980 198001   61150  1 100 1 2
    1980 198001   83945  1 100 1 2
    1980 198001   63890  1 100 1 2
    1980 198001   32050  1 100 1 2
    1980 198001   74695  1 100 1 2
    1980 198001   68880  1 100 1 2
    1980 198001  147400  4 100 4 2
    1980 198001  188990  4 100 4 2
    1980 198001  160310  4 100 4 2
    1980 198001  130210  5 100 5 2
    1980 198001  132510  5 100 5 2
    1980 198001  130225  5 100 5 2
    1980 198001  110825  5 100 5 2
    1980 198001  286845  6 100 5 2
    1980 198001  103350  5 100 5 2
    1980 198001  130025  5 100 5 2
    1980 198001  113275  5 100 5 2
    1980 198001 4218565 48 100 5 2
    1980 198001  113275  5 100 5 2
    1980 198001  115270  5 100 5 2
    1980 198001  130515  5 100 5 2
    1980 198001  646375  6 100 6 2
    1980 198001 4278495 48 100 6 2
    1980 198001  492865  6 100 6 2
    1980 198001  403035  6 100 6 2
    1980 198001 1948950 24 100 6 2
    1980 198001  558945  6 100 6 2
    1980 198001  285060  6 100 6 2
    1980 198001 2536970 37 100 6 2
    1980 198001 4112420 48 100 6 2
    1980 198001 1079010 13 100 6 2
    1980 198001  626955  6 100 6 2
    1980 198001  471855  6 100 6 2
    1980 198001  218440  6 100 6 2
    1980 198001  555695  6 100 6 2
    1980 198001 4270785 48 100 6 2
    1980 198001  609210  6 100 6 2
    1980 198001  651290  6 100 6 2
    1980 198001  915790 12 100 6 2
    1980 198001  294795  6 100 6 2
    1980 198001  269175  6 100 6 2
    1980 198001  486775  6 100 6 2
    1980 198001  437695  6 100 6 2
    1980 198001 3136975 36 100 6 2
    1980 198001  464750  6 100 6 2
    1980 198001 3063825 36 100 6 2
    1980 198001  641885  6 100 6 2
    1980 198001  525455  6 100 6 2
    1980 198001  417345  6 100 6 2
    1980 198001  364740  6 100 6 2
    1980 198001  503960  6 100 6 2
    1980 198001  478040  6 100 6 2
    1980 198001  280430  6 100 6 2
    1980 198001  313490  6 100 6 2
    1980 198001  622725  6 100 6 2
    1980 198001  320060  6 100 6 2
    1980 198001  536060  6 100 6 2
    1980 198001  262425  6 100 6 2
    1980 198001  560500  6 100 6 2
    1980 198001  403735  6 100 6 2
    1980 198001  534740  6 100 6 2
    1980 198001  458420  6 100 6 2
    1980 198001  482070  6 100 6 2
    1980 198001  233955  6 100 6 2
    1980 198001  469460  6 100 6 2
    1980 198001  277175  6 100 6 2
    1980 198001  222360  6 100 6 2
    1980 198001  651830  6 100 6 2
    1980 198001  257500  6 100 6 2
    1980 198001  479135  6 100 6 2
    1980 198001 4638620 55 100 6 2
    1980 198001  302475  6 100 6 2
    1980 198001  285170  6 100 6 2
    1980 198001  435205  6 100 6 2
    1980 198001  454460  6 100 6 2
    1980 198001  218245  6 100 6 2
    1980 198001  472965  6 100 6 2
    1980 198001  474895  6 100 6 2
    1980 198001  453865  6 100 6 2
    1980 198001  518795  6 100 6 2
    1980 198001  568390  6 100 6 2
    1980 198001  472865  6 100 6 2
    1980 198001  402915  6 100 6 2
    1980 198001  536060  6 100 6 2
    1980 198001  551895  6 100 6 2
    1980 198001 4052345 48 100 8 2
    1980 198001  739260  8 100 8 2
    1980 198001  767960  9 100 9 2
    1980 198001  764335  9 100 9 2
    1980 198001 4450500 51 100 9 2
    1980 198001 1078995 13 100 9 2
    1980 198001 3313465 39 100 9 2
    end
    label values year YEAR
    label def YEAR 1980 "1980", modify
    label values sample SAMPLE
    label def SAMPLE 198001 "1980 5%", modify
    label values statefip STATEFIP
    label def STATEFIP 1 "alabama", modify
    label def STATEFIP 4 "arizona", modify
    label def STATEFIP 5 "arkansas", modify
    label def STATEFIP 6 "california", modify
    label def STATEFIP 8 "colorado", modify
    label def STATEFIP 9 "connecticut", modify
    label def STATEFIP 12 "florida", modify
    label def STATEFIP 13 "georgia", modify
    label def STATEFIP 21 "kentucky", modify
    label def STATEFIP 24 "maryland", modify
    label def STATEFIP 36 "new york", modify
    label def STATEFIP 37 "north carolina", modify
    label def STATEFIP 39 "ohio", modify
    label def STATEFIP 48 "texas", modify
    label def STATEFIP 51 "virginia", modify
    label def STATEFIP 55 "wisconsin", modify
    label values migplac5 MIGPLAC5
    label def MIGPLAC5 1 "alabama", modify
    label def MIGPLAC5 4 "arizona", modify
    label def MIGPLAC5 5 "arkansas", modify
    label def MIGPLAC5 6 "california", modify
    label def MIGPLAC5 8 "colorado", modify
    label def MIGPLAC5 9 "connecticut", modify
    label values migsamp MIGSAMP
    label def MIGSAMP 2 "person is in migration sample", modify
    ------------------ copy up to and including the previous line ------------------

    Listed 100 out of 1971 observations
    Use the count() option to list more

  • #2
    gen migplac80= migplac5*2*perwt
    The frequencies appeared to be the same as the original variable, which is not what I expected.
    gen migplac80=migsamp2*2*perwt
    migplac5 is the state identifier and migsamp2 is the sample identifier, as far as I can see. It makes no sense to transform these variables. If "perwt" identifies absolute numbers in the general population (may be an abbreviation for person weight), then the instruction may be as simple as to multiply this variable by 2 in all observations for which migsamp=2. I am not familiar with this dataset, so you should consult others who are to verify whether my assertion is correct.
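
    To illustrate why the two attempts behaved as reported: with PERWT equal to 100 in every 1980 case, multiplying migplac5 by 2*perwt only rescales each state code by 200, so the number of observations per category is unchanged. Likewise, migsamp is 2 in every retained case, so migsamp*2*perwt is 2*2*100 = 400 throughout. A minimal sketch, assuming the 1980 extract from the dataex example is in memory; the variable name scaled is hypothetical:

    ```stata
    * Rescaling an identifier relabels its categories; it does not change the counts
    gen long scaled = migplac5*2*perwt

    * Same frequencies in both tables, only the category values differ
    tabulate migplac5
    tabulate scaled
    ```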



    • #3
      I agree with Andrew that something is odd here. I wonder if the instructions from IPUMS actually were intended to be a modification of a *weight* variable, not a substantive variable? IPUMS is very reputable, but perhaps what was quoted from their documentation is not literally correct. I'm presuming PERWT means "person weight," and multiplying that might make sense in the situation described. I'd contact IPUMS and ask them. I once emailed them about a technical matter and got a quick answer. They also might be happy to know they need to reword their documentation, if my guess about imprecise wording is correct.



      • #4
        Hello Andrew,
        Thank you for your response. As you said, "If 'perwt' identifies absolute numbers in the general population (may be an abbreviation for person weight), then the instruction may be as simple as to multiply this variable by 2 in all observations for which migsamp=2." I believe that is the case, and I was trying to multiply all observations for which migsamp=2 by 2 with my code:
        gen migplac80= migplac5*2*perwt
        (in this case I removed all observations for which migsamp did not equal 2)
        gen migplac80=migsamp*2*perwt

        Is that the correct code to use to multiply the variable by 2? That's the only way I can think to do it, but there is much I still need to learn how to do.

        Thank you!



        • #5
          Hello Mike,

          Thank you for your response. I did contact IPUMS to ask about migplac5 variable before posting here initially. I was told, "As noted on the MIGPLAC5 universe tab, MIGPLAC5 is only available for 50% of cases in 1980. Migration, place of work, and travel time to work items were only coded on half of the questionnaires (see page 18 of this 1980 Census report) as these were hand coded from free text responses. Guidance on using the migration/place of work/travel time variables in 1980 is available on the MIGPLAC5 comparability tab as well as the MIGSAMP description tab." From there I found the information stating that I needed to double the variable and multiply by the person weight. I'm just not sure that I'm using the correct code to do that, so I'm trying to determine if the problem is my code, the way the variable is constructed, or something else. But I will certainly contact IPUMS again. They are very helpful as you said. Thank you for your response.

          Best,

          Jennifer Hearst



          • #6
            Originally posted by Jennifer Hearst
            Hello Andrew,
            Thank you for your response. As you said, "If 'perwt' identifies absolute numbers in the general population (may be an abbreviation for person weight), then the instruction may be as simple as to multiply this variable by 2 in all observations for which migsamp=2." I believe that is the case, and I was trying to multiply all observations for which migsamp=2 by 2 with my code:
            gen migplac80= migplac5*2*perwt
            (in this case I removed all observations for which migsamp did not equal 2)
            gen migplac80=migsamp*2*perwt

            Is that the correct code to use to multiply the variable by 2? That's the only way I can think to do it, but there is much I still need to learn how to do.

            Thank you!

            If it is simply an adjustment to the weight variable, as Mike and I suggest, then the code is

            Code:
            replace perwt = 2*perwt if migsamp==2
            So I would suggest that, when you contact the data providers, you ask them directly whether the statement

            "users should select cases coded as 2 in MIGSAMP and multiply by 2 as well as by PERWT"
            implies anything more than multiplying the weight variable by 2. If it does, ask them for one example of an adjustment using an existing variable in the dataset. I am sure that they won't advise you to transform identifiers.
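
            If the weight interpretation is right, the adjusted weight could then be used to tabulate population counts. The following is a minimal sketch under that assumption, not a confirmed IPUMS procedure: it assumes the 1980 extract from the original post is in memory, and the variable name migwt is hypothetical. It creates a separate weight rather than overwriting perwt, so the original person weight stays available for variables that were coded for the full sample.

            ```stata
            * Hypothetical adjusted weight: twice the person weight for migration-sample cases
            * (cases with migsamp != 2 get a missing weight and drop out of the weighted table)
            gen long migwt = 2*perwt if migsamp == 2

            * Weighted frequencies of migplac5; each observation counts migwt times
            tabulate migplac5 [fweight = migwt]
            ```

            With PERWT equal to 100 for every 1980 case, each retained observation then counts as 200 persons.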

