Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Stacking 20 columns to 1 column in stata

    I am using DHS data for my analysis. Since questionnaires were used to collect the data, i have 20 columns representing age of different children in households but I want to only have one column with only the 3-24 age bracket. How do I go about it?

    column 1 (age) Column 2(age) Column 3(age).......... column 20(age)


    i need to stack column 2, 3...20 under column 1....i'll appreciate any assistance
    Last edited by Laura Anne; 24 Jun 2016, 12:14.

  • #2
    Without knowing what the rest of your data looks like, it's hard to advise, but this is probably a job for -reshape-. Let's assume you have a household id variable, call it hhid. And let's assume the variables with the 20 different ages are called age1 through age20. Then it's

    Code:
    reshape long age, i(hhid) j(pid)
    The new variable pid will run from 1 through 20 and will identify which age variable the value in that observation originally came from.

    Comment


    • #3
      Let me explain the data better:
      The top column contains id (hv105_01 hv105_02.......hv105_20) and its the age of household member.
      Each column has 50 rows containing the age in years.
      The age ranges from 1 to 54 years.
      However, I only want the age range 3-24 years
      And since all the columns represent age, I want to have all the columns stacked to one column only

      Comment


      • #4
        hv105_01 hv105_02 hv105_03 hv105_04 hv105_05
        50 . 5 2 10
        53 25 . 23 .
        27 1 . 50 8
        29 3 16 1 .

        Comment


        • #5
          That's a small sample of how the data is...I want only one column with all the variables

          Comment


          • #6
            Well, it's not much different from what I suggested earlier. Either way, you still need an identifier for the household, corresponding to the observations in the original data. If you don't have it, you'll just have to create one. Then you can reshape, and then you can drop the ages outside of your range of interest:

            Code:
            gen hhid = _n // CREATE IDENTIFIER OF HOUSEHOLDS
            reshape long hv105_, i(hhid) j(pid)
            rename hv105_ age
            drop if !inrange(age, 3, 24)
            drop pid // NOT NEEDED, I TAKE IT

            Comment


            • #7
              Dear Clyde,
              Pardon my simple questions, its because I'm new to stata, I have used it for less than 2 months.
              After reshaping, when i try to rename, i get an error message saying ambiguous abbreviation:

              rename hv105_ age
              hv105_ ambiguous abbreviation




              Comment


              • #8
                I can't reproduce that. Here's your example and Clyde's code:

                Code:
                clear 
                input hv105_01    hv105_02    hv105_03    hv105_04    hv105_05
                50    .    5    2    10
                53    25    .    23    .
                27    1    .    50    8
                29    3    16    1    .
                end 
                gen hhid = _n 
                reshape long hv105_, i(hhid) j(pid)
                rename hv105_ age
                It works fine.

                You need to show us the results of describe before the reshape and your exact reshape code.

                Comment


                • #9
                  describe hv105_01

                  storage display value
                  variable name type format label variable label
                  ----------------------------------------------------------------------------------------------------------
                  hv105_01 byte %8.0g HV105_01 age of household members




                  reshape long hv105_ , i (hhid) j (pid)



                  I didn't generate hhid, as i already have an identifier:
                  describe hhid

                  storage display value
                  variable name type format label variable label
                  ----------------------------------------------------------------------------------------------------------
                  hhid str12 %12s case identification




                  Comment


                  • #10
                    when I browse the hv105_ , it has so many missing variables and only about 5 observations but the 'wide dataset' didn't have so many missing variables

                    Comment


                    • #11
                      I said describe because we need to see the names of all your variables. Show us the results of

                      Code:
                      describe
                      It seems possible that you have a single hv105_ and various other hv105_*. How does that happen, if so?

                      Comment


                      • #12

                        . describe

                        Contains data from C:\Users\Naliaka\Desktop\DHS 2014\KEHR70FL.DTA
                        obs: 36,430
                        vars: 1,037 21 Jun 2016 07:51
                        size: 40,255,150
                        ----------------------------------------------------------------------------------------------------------
                        storage display value
                        variable name type format label variable label
                        ----------------------------------------------------------------------------------------------------------
                        hhid str12 %12s case identification
                        hv000 str3 %3s country code and phase
                        hv001 int %8.0g cluster number
                        hv002 int %8.0g household number
                        hv003 byte %8.0g HV003 respondent's line number (answering household questionnaire)
                        hv004 int %8.0g ultimate area unit
                        hv005 long %12.0g household sample weight (6 decimals)
                        hv006 byte %8.0g month of interview
                        hv007 int %8.0g year of interview
                        hv008 int %8.0g date of interview (cmc)

                        hv104_01 byte %8.0g HV104_01 sex of household member
                        hv104_02 byte %8.0g HV104_02 sex of household member
                        hv104_03 byte %8.0g HV104_03 sex of household member
                        hv104_04 byte %8.0g HV104_04 sex of household member
                        hv104_05 byte %8.0g HV104_05 sex of household member
                        hv104_06 byte %8.0g HV104_06 sex of household member
                        hv104_07 byte %8.0g HV104_07 sex of household member
                        hv104_08 byte %8.0g HV104_08 sex of household member
                        hv104_09 byte %8.0g HV104_09 sex of household member
                        hv104_10 byte %8.0g HV104_10 sex of household member
                        hv104_11 byte %8.0g HV104_11 sex of household member
                        hv104_12 byte %8.0g HV104_12 sex of household member
                        hv104_13 byte %8.0g HV104_13 sex of household member
                        hv104_14 byte %8.0g HV104_14 sex of household member
                        hv104_15 byte %8.0g HV104_15 sex of household member
                        hv104_16 byte %8.0g HV104_16 sex of household member
                        hv104_17 byte %8.0g HV104_17 sex of household member
                        hv104_18 byte %8.0g HV104_18 sex of household member
                        hv104_19 byte %8.0g HV104_19 sex of household member
                        hv104_20 byte %8.0g HV104_20 sex of household member
                        hv104_21 byte %8.0g HV104_21 sex of household member
                        hv104_22 byte %8.0g HV104_22 sex of household member
                        hv104_23 byte %8.0g HV104_23 sex of household member
                        hv105_01 byte %8.0g HV105_01 age of household members
                        hv105_02 byte %8.0g HV105_02 age of household members
                        hv105_03 byte %8.0g HV105_03 age of household members
                        hv105_04 byte %8.0g HV105_04 age of household members
                        hv105_05 byte %8.0g HV105_05 age of household members
                        hv105_06 byte %8.0g HV105_06 age of household members
                        hv105_07 byte %8.0g HV105_07 age of household members
                        hv105_08 byte %8.0g HV105_08 age of household members
                        hv105_09 byte %8.0g HV105_09 age of household members
                        hv105_10 byte %8.0g HV105_10 age of household members
                        hv105_11 byte %8.0g HV105_11 age of household members
                        hv105_12 byte %8.0g HV105_12 age of household members
                        hv105_13 byte %8.0g HV105_13 age of household members
                        hv105_14 byte %8.0g HV105_14 age of household members
                        hv105_15 byte %8.0g HV105_15 age of household members
                        hv105_16 byte %8.0g HV105_16 age of household members
                        hv105_17 byte %8.0g HV105_17 age of household members
                        hv105_18 byte %8.0g HV105_18 age of household members
                        hv105_19 byte %8.0g HV105_19 age of household members
                        hv105_20 byte %8.0g HV105_20 age of household members
                        hv105_21 byte %8.0g HV105_21 age of household members
                        hv105_22 byte %8.0g HV105_22 age of household members
                        hv105_23 byte %8.0g HV105_23 age of household members
                        hv121_01 byte %8.0g HV121_01 member attended school during current school year
                        hv121_02 byte %8.0g HV121_02 member attended school during current school year
                        hv121_03 byte %8.0g HV121_03 member attended school during current school year
                        hv121_04 byte %8.0g HV121_04 member attended school during current school year
                        hv121_05 byte %8.0g HV121_05 member attended school during current school year
                        hv121_06 byte %8.0g HV121_06 member attended school during current school year
                        hv121_07 byte %8.0g HV121_07 member attended school during current school year
                        hv121_08 byte %8.0g HV121_08 member attended school during current school year
                        hv121_09 byte %8.0g HV121_09 member attended school during current school year
                        hv121_10 byte %8.0g HV121_10 member attended school during current school year
                        hv121_11 byte %8.0g HV121_11 member attended school during current school year
                        hv121_12 byte %8.0g HV121_12 member attended school during current school year
                        hv121_13 byte %8.0g HV121_13 member attended school during current school year
                        hv121_14 byte %8.0g HV121_14 member attended school during current school year
                        hv121_15 byte %8.0g HV121_15 member attended school during current school year
                        hv121_16 byte %8.0g HV121_16 member attended school during current school year
                        hv121_17 byte %8.0g HV121_17 member attended school during current school year
                        hv121_18 byte %8.0g HV121_18 member attended school during current school year
                        hv121_19 byte %8.0g HV121_19 member attended school during current school year
                        hv121_20 byte %8.0g HV121_20 member attended school during current school year
                        hv121_21 byte %8.0g HV121_21 member attended school during current school year
                        hv121_22 byte %8.0g HV121_22 member attended school during current school year
                        hv121_23 byte %8.0g HV121_23 member attended school during current school year
                        idxh4_01 byte %8.0g index to household schedule
                        idxh4_02 byte %8.0g index to household schedule
                        idxh4_03 byte %8.0g index to household schedule
                        idxh4_04 byte %8.0g index to household schedule
                        idxh4_05 byte %8.0g index to household schedule
                        idxh4_06 byte %8.0g index to household schedule
                        idxh4_07 byte %8.0g index to household schedule
                        idxh4_08 byte %8.0g index to household schedule
                        idxh4_09 byte %8.0g index to household schedule
                        idxh4_10 byte %8.0g index to household schedule
                        idxh4_11 byte %8.0g index to household schedule
                        idxh4_12 byte %8.0g index to household schedule
                        idxh4_13 byte %8.0g index to household schedule
                        idxh4_14 byte %8.0g index to household schedule
                        idxh4_15 byte %8.0g index to household schedule
                        idxh4_16 byte %8.0g index to household schedule
                        idxh4_17 byte %8.0g index to household schedule
                        idxh4_18 byte %8.0g index to household schedule
                        idxh4_19 byte %8.0g index to household schedule
                        idxh4_20 byte %8.0g index to household schedule
                        idxh4_21 byte %8.0g index to household schedule
                        idxh4_22 byte %8.0g index to household schedule
                        idxh4_23 byte %8.0g index to household schedule
                        hc0_1 byte %8.0g index to household schedule
                        hc0_2 byte %8.0g index to household schedule
                        hc0_3 byte %8.0g index to household schedule
                        hc0_4 byte %8.0g index to household schedule
                        hc0_5 byte %8.0g index to household schedule
                        hc0_6 byte %8.0g index to household schedule
                        hc0_7 byte %8.0g index to household schedule
                        hc16_1 byte %8.0g HC16_1 day of birth
                        hc16_2 byte %8.0g HC16_2 day of birth
                        hc16_3 byte %8.0g HC16_3 day of birth
                        hc16_4 byte %8.0g HC16_4 day of birth
                        hc16_5 byte %8.0g HC16_5 day of birth
                        hc16_6 byte %8.0g HC16_6 day of birth
                        hc16_7 byte %8.0g HC16_7 day of birth
                        hc27_1 byte %8.0g HC27_1 sex
                        hc27_2 byte %8.0g HC27_2 sex
                        hc27_3 byte %8.0g HC27_3 sex
                        hc27_4 byte %8.0g HC27_4 sex
                        hc27_5 byte %8.0g HC27_5 sex
                        hc27_6 byte %8.0g HC27_6 sex
                        hc27_7 byte %8.0g HC27_7 sex
                        hc30_1 byte %8.0g HC30_1 month of birth
                        hc30_2 byte %8.0g HC30_2 month of birth
                        hc30_3 byte %8.0g HC30_3 month of birth
                        hc30_4 byte %8.0g HC30_4 month of birth
                        hc30_5 byte %8.0g HC30_5 month of birth
                        hc30_6 byte %8.0g HC30_6 month of birth
                        hc30_7 byte %8.0g HC30_7 month of birth
                        hc31_1 int %8.0g HC31_1 year of birth
                        hc31_2 int %8.0g HC31_2 year of birth
                        hc31_3 int %8.0g HC31_3 year of birth
                        hc31_4 int %8.0g HC31_4 year of birth
                        hc31_5 int %8.0g HC31_5 year of birth
                        hc31_6 int %8.0g HC31_6 year of birth
                        hc31_7 byte %8.0g HC31_7 year of birth
                        hc32_1 int %8.0g date of birth (cmc)
                        hc32_2 int %8.0g date of birth (cmc)
                        hc32_3 int %8.0g date of birth (cmc)
                        hc32_4 int %8.0g date of birth (cmc)
                        hc32_5 int %8.0g date of birth (cmc)
                        hc32_6 int %8.0g date of birth (cmc)
                        hc32_7 byte %8.0g date of birth (cmc)
                        hmhidx_01 byte %8.0g index to household schedule
                        hmhidx_02 byte %8.0g index to household schedule
                        hmhidx_03 byte %8.0g index to household schedule
                        hmhidx_04 byte %8.0g index to household schedule
                        hmhidx_05 byte %8.0g index to household schedule
                        hmhidx_06 byte %8.0g index to household schedule
                        hmhidx_07 byte %8.0g index to household schedule
                        hmhidx_08 byte %8.0g index to household schedule
                        hmhidx_09 byte %8.0g index to household schedule
                        hmhidx_10 byte %8.0g index to household schedule
                        hmhidx_11 byte %8.0g index to household schedule
                        hmhidx_12 byte %8.0g index to household schedule
                        hmhidx_13 byte %8.0g index to household schedule
                        hmhidx_14 byte %8.0g index to household schedule
                        hmhidx_15 byte %8.0g index to household schedule
                        hmhidx_16 byte %8.0g index to household schedule
                        hmhidx_17 byte %8.0g index to household schedule
                        hmhidx_18 byte %8.0g index to household schedule
                        hmhidx_19 byte %8.0g index to household schedule
                        hmhidx_20 byte %8.0g index to household schedule
                        hmhidx_21 byte %8.0g index to household schedule
                        hmhidx_22 byte %8.0g index to household schedule
                        hmhidx_23 byte %8.0g index to household schedule
                        hml13_12 byte %8.0g HML13_12 net designation number (hmlidx) for 1st net person slept
                        under last night
                        hml13_13 byte %8.0g HML13_13 net designation number (hmlidx) for 1st net person slept
                        under last night
                        hml16_01 byte %8.0g HML16_01 corrected age from individual file
                        hml16_02 byte %8.0g HML16_02 corrected age from individual file
                        hml16_03 byte %8.0g HML16_03 corrected age from individual file
                        hml16_04 byte %8.0g HML16_04 corrected age from individual file
                        hml16_05 byte %8.0g HML16_05 corrected age from individual file
                        hml16_06 byte %8.0g HML16_06 corrected age from individual file
                        hml16_07 byte %8.0g HML16_07 corrected age from individual file
                        hml16_08 byte %8.0g HML16_08 corrected age from individual file
                        hml16_09 byte %8.0g HML16_09 corrected age from individual file
                        hml16_10 byte %8.0g HML16_10 corrected age from individual file
                        hml16_11 byte %8.0g HML16_11 corrected age from individual file
                        hml16_12 byte %8.0g HML16_12 corrected age from individual file
                        hml16_13 byte %8.0g HML16_13 corrected age from individual file
                        hml16_14 byte %8.0g HML16_14 corrected age from individual file
                        hml16_15 byte %8.0g HML16_15 corrected age from individual file
                        hml16_16 byte %8.0g HML16_16 corrected age from individual file
                        hml16_17 byte %8.0g HML16_17 corrected age from individual file
                        hml16_18 byte %8.0g HML16_18 corrected age from individual file
                        hml16_19 byte %8.0g HML16_19 corrected age from individual file
                        hml16_20 byte %8.0g HML16_20 corrected age from individual file
                        hml16_21 byte %8.0g HML16_21 corrected age from individual file
                        hml16_22 byte %8.0g HML16_22 corrected age from individual file
                        hml16_23 byte %8.0g HML16_23 corrected age from individual file
                        hml16a_01 byte %8.0g age in months (for children)
                        hml16a_02 byte %8.0g age in months (for children)
                        hml16a_03 byte %8.0g age in months (for children)
                        hml16a_04 byte %8.0g age in months (for children)
                        hml16a_05 byte %8.0g age in months (for children)
                        hml16a_06 byte %8.0g age in months (for children)
                        hml16a_07 byte %8.0g age in months (for children)
                        hml16a_08 byte %8.0g age in months (for children)
                        hml16a_09 byte %8.0g age in months (for children)
                        hml16a_10 byte %8.0g age in months (for children)
                        hml16a_11 byte %8.0g age in months (for children)
                        hml16a_12 byte %8.0g age in months (for children)
                        hml16a_13 byte %8.0g age in months (for children)
                        hml16a_14 byte %8.0g age in months (for children)
                        hml16a_15 byte %8.0g age in months (for children)
                        hml16a_16 byte %8.0g age in months (for children)
                        hml16a_17 byte %8.0g age in months (for children)
                        hml16a_18 byte %8.0g age in months (for children)
                        hml16a_19 byte %8.0g age in months (for children)
                        hml16a_20 byte %8.0g age in months (for children)
                        hml16a_21 byte %8.0g age in months (for children)
                        hml16a_22 byte %8.0g age in months (for children)
                        hml16a_23 byte %8.0g age in months (for children)

                        * indicated variables have notes

                        Comment


                        • #13
                          Here is my guess. The reshape command gave an error message and left the data unchanged, and hence the 23 hv105_ variables were left unchanged, and hv105_ was ambiguous. You need to run your reshape command and show us the command and the output it produces.

                          There is much else to critique in what you have shown us of your work, but I lack the time at the moment. Let me just say that you should include each of the sets of 23 individual-level variables in your reshape command, thus
                          Code:
                          reshape long hv104_ hv105_ ... hml16a_ , i(hhid) j(pid)
                          Last edited by William Lisowski; 25 Jun 2016, 08:13.

                          Comment


                          • #14
                            Without having said as much, the comment in #10 suggests that the user did succeed in reshaping and found many missing values (not missing variables) in the created variable hv105_.

                            This is to be expected. While there may not have been many missing values for hv105_1 or hv105_2, only households with 23 members will have a nonmissing value for hv105_23. So out of each group of 23 observations for one household, you will expect to see a few nonmissing values corresponding to hv105_1, hv105_2, etc., following by observations with missing values corresponding to the missing values for hv105_21, hv105_22, hv105_23.

                            Comment


                            • #15
                              I realized that I had so many missing values as stata was not recognizing hv105_01....hv105_09. I renamed these variables in the format hv105_1......hv105_9, then ran the command
                              reshape long hv104_ hv105_ hv121_ hv124_ sh18_ sh19l_ sh20a_ hc61_ hc62_, i(hhid) j(pid)
                              and got the output

                              Data wide -> long
                              -----------------------------------------------------------------------------
                              Number of obs. 36430 -> 837890
                              Number of variables 263 -> 98
                              j variable (23 values) -> pid
                              xij variables:
                              hv104_1 hv104_2 ... hv104_23 -> hv104_
                              hv105_1 hv105_2 ... hv105_23 -> hv105_
                              hv121_1 hv121_2 ... hv121_23 -> hv121_
                              hv124_1 hv124_2 ... hv124_23 -> hv124_
                              sh18_1 sh18_2 ... sh18_23 -> sh18_
                              sh19l_1 sh19l_2 ... sh19l_23 -> sh19l_
                              sh20a_1 sh20a_2 ... sh20a_23 -> sh20a_
                              hc61_1 hc61_2 ... hc61_23 -> hc61_
                              hc62_1 hc62_2 ... hc62_23 -> hc62_

                              And when I browsed, it had the results I was expecting.

                              Thank you so much for the assistance.

                              Comment

                              Working...
                              X