  • Reshaping data

    Dear all,
    I am trying to reshape the data below so that each hhid is represented by a single observation.
    So far, I have tried creating a new "j" variable and using hhid as my "i", but in vain. What can I do?
    hhid mem Gender Marital_stat Age b06
    1 4 female Never married 13 10
    1 2 female Married 32 8
    1 3 female Never married 21 6
    1 5 male Never married 18 4
    1 4 male Married 73 2
    1 6 male Never married 8 6
    1 7 female Never married 4 10
    2 3 female Never married 21 4
    2 6 male Never married 15 1
    2 4 male Never married 18 2
    2 5 female Never married 11 6
    2 7 male Never married 9 9
    2 2 female Married 40 1
    2 8 male Never married 7 10
    2 1 male Married 50 0
    3 2 female Married 28 4
    3 5 male Never married 1 6
    3 4 female Never married 5 2
    3 3 male Never married 12 5
    3 1 male Married 39 2
    4 4 male Never married 2 1
    4 2 female Married 28 2
    4 1 male Married 30 4
    4 3 male Never married 4 9
    5 2 female Married 22 9
    5 3 female Never married 1 11

  • #2
    Co Ar:

    Is that your full real name? http://www.statalist.org/forums/help#realnames

    Please provide a data example using dataex (SSC) and also the reshape command you tried.
    http://www.statalist.org/forums/help#stata

    Note that you have a problem with duplicates. You need to fix that before you can even think about reshape.

    Code:
    1 4 female Never married 13 10  
    1 4 male Married 73 2

    Comment


    • #3
      Nick Cox Yes, that's my name. Each household has several members, and each member is represented by one row; for example, hhid 1 has 7 members. Maybe I should force-drop the extra rows so that the duplicates don't appear.

      Comment


      • #4
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str2 mem byte(Gender Marital_stat Age b06) int hhid
        "5" 1 7 18  4  1
        "4" 0 7 13 10  1
        "2" 0 1 32  8  1
        "3" 0 7 21  6  1
        "6" 1 7  8  6  1
        "1" 1 1 73  2  1
        "7" 0 7  4 10  1
        "7" 1 7  9  9  2
        "3" 0 7 21  4  2
        "1" 1 1 50  0  2
        "4" 1 7 18  2  2
        "8" 1 7  7 10  2
        "2" 0 1 40  1  2
        "6" 1 7 15  1  2
        "5" 0 7 11  6  2
        "5" 1 7  1  6  3
        "2" 0 1 28  4  3
        "4" 0 7  5  2  3
        "1" 1 1 39  2  3
        "3" 1 7 12  5  3
        "2" 0 1 28  2  4
        "1" 1 1 30  4  4
        "3" 1 7  4  9  4
        "4" 1 7  2  1  4
        "3" 0 7  1 11  5
        "2" 0 1 22  9  5
        "1" 1 1 27  3  5
        "2" 0 1 42  3  6
        "3" 1 7  4  6  6
        "1" 1 1 47  4  6
        "3" 1 7 99 10  7
        "2" 0 1 26 10  7
        "1" 1 1 33  3  7
        "1" 1 1 25  0  8
        "4" 1 7  1  1  8
        "2" 0 1 20  7  8
        "3" 1 7  3  6  8
        "4" 0 7  3  5  9
        "2" 1 1 32  5  9
        "3" 1 7 10  6  9
        "1" 0 1 27  6  9
        "2" 1 1 29  7 10
        "1" 0 1 29  6 10
        "3" 1 7  1  5 10
        "1" 0 1 29 11 11
        "5" 0 7  9  6 11
        "7" 1 7  1  5 11
        "2" 1 1 39  0 11
        "4" 1 7 12  1 11
        "3" 1 7 13  5 11
        "6" 1 7  8  2 11
        "3" 1 7  5  4 12
        "1" 0 1 25  3 12
        "2" 1 1 28  7 12
        "4" 0 7 99  6 12
        "4" 1 7  4  8 13
        "2" 1 1 29  4 13
        "5" 1 7  1  1 13
        "3" 0 7 17  2 13
        "1" 0 1 26  1 13
        "4" 1 7  3  4 14
        "1" 0 5 24  8 14
        "2" 0 5 52  1 14
        "3" 0 7 17  3 14
        "5" 0 7  2  2 14
        "3" 0 7  8  0 15
        "4" 1 7  3  5 15
        "1" 1 1 37  5 15
        "5" 1 7  3  5 15
        "2" 0 1 34  3 15
        "2" 0 1 23  4 16
        "1" 1 1 24  9 16
        "3" 0 7  7  0 16
        "4" 0 7  1  1 16
        "4" 1 7  2  4 17
        "2" 0 1 23  6 17
        "1" 1 1 31  8 17
        "3" 1 7 30  0 17
        "3" 1 7  4  3 18
        "2" 0 1 24  1 18
        "1" 1 1 28  2 18
        "3" 0 7  2  9 19
        "1" 1 1 23  0 19
        "2" 0 1 20  3 19
        "2" 0 1 66  6 20
        "3" 1 7 17  3 20
        "1" 0 4 40  0 20
        "4" 0 7  2  6 20
        "6" 0 7 13  1 21
        "5" 0 7 12  0 21
        "7" 1 7 10  0 21
        "3" 0 7  2  6 21
        "1" 0 4 56  0 21
        "4" 1 7  4  9 21
        "2" 0 4 31  1 21
        "4" 0 7  1  7 22
        "3" 1 7  4  0 22
        "1" 0 1 21 11 22
        "2" 1 1 26  0 22
        "3" 1 7 12  1 23
        end
        label values Gender b02
        label def b02 0 "female", modify
        label def b02 1 "male", modify
        label values Marital_stat b04
        label def b04 1 "Married – monogamous", modify
        label def b04 4 "Divorced", modify
        label def b04 5 "Separated", modify
        label def b04 7 "Never married", modify

        I generated a new variable newID = _n and used it as my j:
        Code:
        reshape wide Gender, i(hhid) j(newID)
        Last edited by Co Ar; 01 Sep 2016, 07:36.

        Comment


        • #5
          Dropping duplicates may or may not be the answer. The two people labelled 1 4 are clearly different people. Perhaps you can just relabel people within the same household. It depends on whether other variables refer to the identifiers.
          Last edited by Nick Cox; 01 Sep 2016, 08:23.
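
          A minimal sketch of that relabelling idea (the variable choices are my assumption, and it presumes nothing else in the data keys on the original mem values) is to renumber members within each household so that the pair (hhid, mem) becomes unique:

          Code:
          * sketch only: renumber members within each household; assumes no
          * other variables refer to the original mem values
          bysort hhid (mem Age): replace mem = string(_n)
          isid hhid mem  // errors out if (hhid, mem) is still not unique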

          Comment


          • #6
            OK. However, I would really like guidance on how to do this; I am a beginner in Stata.

            Comment


            • #7
              The total sample size from that population should be 354; when computing descriptive statistics, for example, the mean age will be exaggerated. This is what I want to avoid.

              Comment


              • #8
                I don't see what extra guidance we can give unless you tell us more. See the duplicates command for various tools for dealing with duplicates.
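
                For instance (a sketch only; the choice of hhid and mem as the key variables is my assumption), the duplicates command offers:

                Code:
                duplicates report hhid mem         // count surplus (hhid, mem) pairs
                duplicates list hhid mem           // list the offending observations
                duplicates tag hhid mem, gen(dup)  // flag them for closer inspection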

                Comment


                • #9
                  Ok, thanks Nick. I will follow the advice.

                  Comment


                  • #10
                    Besides the correct advice that you should have a look at the duplicates in your data: -reshape- works perfectly on the example data you provided in post #4. I guess (and I have to guess, because you do not tell us what really happens; that is Nick's point in "tell us more"!) that your command
                    Code:
                    reshape wide Gender, i(hhid) j(newID)
                    issues an error like
                    Code:
                    variable Age not constant within hhid
                    variable b06 not constant within hhid
                        Your data are currently long.  You are performing a reshape wide.  You typed something like
                    
                            . reshape wide a b, i(hhid) j(mem)
                    
                        There are variables other than a, b, hhid, mem in your data.  They must be constant within hhid because that is the only way they can fit into wide data without loss of
                        information.
                    
                        The variable or variables listed above are not constant within hhid.  Perhaps the values are in error.  Type reshape error for a list of the problem observations.
                    
                        Either that, or the values vary because they should vary, in which case you must either add the variables to the list of xij variables to be reshaped, or drop them.
                    In this error message, Stata tells you exactly what it is puzzled about: there is variation in the other three variables (Marital_stat, Age, and b06) within each hhid. You have to either drop these variables before reshaping, or reshape them as well:
                    Code:
                    reshape wide Gender Marital_stat Age b06 , i(hhid) j(mem) string
                    Note that you do not even need to create a variable newID, because mem already contains all you need, at least in the example data provided. As mem is a string variable, the string option has to be added to the reshape command. Please read the help file for reshape to understand the problem.

                    Regards
                    Bela
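
                    As a further sketch (my suggestion, not part of Bela's post): if you prefer a numeric j variable, you can destring mem first and then omit the string option:

                    Code:
                    * sketch only: assumes mem holds purely numeric strings
                    destring mem, replace
                    reshape wide Gender Marital_stat Age b06, i(hhid) j(mem)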

                    Comment


                    • #11
                      Thanks, @Daniel Bela. The info is really useful.

                      Comment
