Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to reshape the panel data rows to columns

    Dear Clyde Schechter and other Stata experts,

    I have the raw data as follows:

    Code:
    input str10 A str17(IM IP)
    "Code"       "DISA.SI(RI)~U$"    "SUNR.SI(RI)~U$"  
    "  1/1/1998" "20.29"             "47.43"          
    "  1/8/1998" "17.06"             "28.04"          
    " 1/15/1998" "15.81"             "42.54"          
    " 1/22/1998" "15.82"             "42.56"          
    " 1/29/1998" "16.15"             "43.44"          
    "  2/5/1998" "24.41"             "45.14"          
    " 2/12/1998" "23.77"             "63.54"          
    " 2/19/1998" "22.6"              "64.55"          
    " 2/26/1998" "26.52"             "64.63"          
    "  3/5/1998" "21.5"              "63.61"          
    " 3/12/1998" "21.09"             "64.70999999999999"
    " 3/19/1998" "22.07"             "65.3"            
    " 3/26/1998" "23.01"             "62.59"          
    "  4/2/1998" "18.72"             "61.53"          
    "  4/9/1998" "18.3"              "39.24"          
    " 4/16/1998" "16.59"             "65.44"          
    " 4/23/1998" "16.73"             "50.3"            
    " 4/30/1998" "16.79"             "50.47"          
    "  5/7/1998" "14.8"              "38.41"          
    " 5/14/1998" "13.76"             "48.24"          
    " 5/21/1998" "15.49"             "48.88"          
    " 5/28/1998" "12.1"              "41.76"          
    "  6/4/1998" "11.26"             "42.94"          
    " 6/11/1998" "10.18"             "35.84"          
    " 6/18/1998" "11.38"             "37.41"          
    " 6/25/1998" "10.65"             "36.75"          
    "  7/2/1998" "9.66"              "47.02"          
    "  7/9/1998" "10.32"             "43.63"          
    " 7/16/1998" "9.060000000000001" "40.04"          
    " 7/23/1998" "8.140000000000001" "42.27"          
    " 7/30/1998" "7.37"              "42.1"            
    "  8/6/1998" "8"                 "41.55"          
    " 8/13/1998" "7.25"              "37"              
    " 8/20/1998" "7.98"              "36.99"          
    " 8/27/1998" "7.88"              "36.52"          
    "  9/3/1998" "7.94"              "35.34"          
    " 9/10/1998" "8.050000000000001" "35.83"          
    " 9/17/1998" "8.82"              "36"              
    " 9/24/1998" "7.37"              "39.1"            
    " 10/1/1998" "7.5"               "39.78"          
    " 10/8/1998" "6.91"              "35.27"          
    "10/15/1998" "8.66"              "40.14"          
    "10/22/1998" "9.380000000000001" "40.67"          
    "10/29/1998" "11.73"             "37.49"          
    " 11/5/1998" "12.5"              "47.83"          
    "11/12/1998" "10.69"             "51.39"          
    "11/19/1998" "11.71"             "54.14"          
    "11/26/1998" "11.5"              "53.21"          
    " 12/3/1998" "10.78"             "56.54"          
    "12/10/1998" "11.66"             "57.1"            
    "12/17/1998" "11.58"             "44.11"          
    "12/24/1998" "10.77"             "43.94"          
    "12/31/1998" "10.74"             "43.84"          
    "  1/7/1999" "11.37"             "43.29"          
    " 1/14/1999" "13.56"             "54.56"          
    " 1/21/1999" "13.64"             "47.16"          
    " 1/28/1999" "11.98"             "45.85"          
    "  2/4/1999" "12"                "39.04"          
    " 2/11/1999" "12"                "42.86"          
    " 2/18/1999" "13.4"              "42.54"          
    " 2/25/1999" "12.47"             "37.43"          
    "  3/4/1999" "13.13"             "37.21"          
    " 3/11/1999" "13.13"             "37.2"            
    " 3/18/1999" "12.47"             "37.42"          
    " 3/25/1999" "13.93"             "41.89"          
    "  4/1/1999" "13.19"             "37.37"          
    "  4/8/1999" "14.64"             "41.82"          
    " 4/15/1999" "14.96"             "55.7"            
    " 4/22/1999" "22.97"             "55.95"          
    " 4/29/1999" "20.15"             "51.78"          
    "  5/6/1999" "23.08"             "53.94"          
    " 5/13/1999" "22.27"             "46.95"          
    " 5/20/1999" "22.81"             "52.55"          
    " 5/27/1999" "26.33"             "74.63"          
    "  6/3/1999" "30.08"             "79.34"          
    " 6/10/1999" "35.43"             "78.33"          
    " 6/17/1999" "55.08"             "83.54000000000001"
    " 6/24/1999" "62.5"              "91.84999999999999"
    "  7/1/1999" "72.31999999999999" "108.02"          
    "  7/8/1999" "55.18"             "81.41"          
    " 7/15/1999" "44.74"             "57.82"          
    " 7/22/1999" "41.01"             "56.3"            
    " 7/29/1999" "42.76"             "59.69"          
    "  8/5/1999" "39.35"             "55.59"          
    " 8/12/1999" "34.06"             "52.51"          
    " 8/19/1999" "36.99"             "55.45"          
    " 8/26/1999" "47.15"             "65.67"          
    "  9/2/1999" "42.84"             "56.74"          
    "  9/9/1999" "42.77"             "58.18"          
    " 9/16/1999" "41.89"             "58"              
    " 9/23/1999" "41.37"             "54.27"          
    " 9/30/1999" "37.23"             "53.18"          
    " 10/7/1999" "40.71"             "55.38"          
    "10/14/1999" "39.97"             "53.86"          
    "10/21/1999" "37.92"             "51.07"          
    "10/28/1999" "36.45"             "50.36"          
    " 11/4/1999" "40.27"             "55.82"          
    "11/11/1999" "46.1"              "60.92"          
    "11/18/1999" "45.49"             "59.56"          
    end
    Th
    e raw date is exported from Datastream in a form of row, the first column is Date, the sencond column is stock price of firm 1, the thirst column is for firm 2 and so on.....
    I need to format the raw data. In details, I need to transpose the panel data to the form that all firms are in one column, and then the next column is Date. But I do not know how to do. Could you please help?

    Mr @Cylde gave me the code in previous topic like this


    rename A str_date
    foreach v of varlist IM-IP {
    rename `v' price_`=strtoname(`v'[1])'
    }
    drop in 1
    destring price_*, replace
    gen date = daily(str_date, "MDY")
    assert missing(date) == missing(str_date)
    format date %td
    drop str_date

    reshape long price_, i(date) j(firm) string
    rename price_ price



    Here is my results,

    Code:
     foreach v of varlist B-ADE{
      2. rename `v' price_`=strtoname(`v'[1]0)'
      3. }
    B ambiguous abbreviation
    r(111);
    I guess that the problem is when I post the raw data, I choose the firms randomly, which is located at column IM and IP. So, Mr Clyde though I have only 2 firms. Actually, I have over 1000 fims, which is from column B to ADE.

    I look forward to hearing from you.
    Thank you very much in advance.
    Regards,
    Celine

  • #2
    It is hard for me to imagine how you are getting that particular error. Even though your data extend from columns B through ADE in some spreadsheet, B alone should not be an ambiguous name: even if there are many variables that begin with B, presumably there is only one column in the spreadsheet called B. As your example data does not reproduce the problem you are having, I don't know how to troubleshoot this. I suggest you post back with a new data example which actually reproduces this error so I can see what is going on. It would probably also be helpful if you showed the output of -describe B*-.

    One other thing: there is a 0 in the line numbered 2 that does not belong there and will also cause you to get error messages. You may as well eliminate that now, once and for all.

    Comment


    • #3
      Clyde Schechter ,

      Very kind of you. Thank you very much for so quick response.

      I post my raw data here


      Code:
      input str10 D str17 E str45 W str17 GD
      "Code"       "TISE.SI(RI)~U$" ""                                              "ADSA.SI(RI)~U$"  
      "  1/1/1998" "NA"             "$$ER: E100,INVALID CODE OR EXPRESSION ENTERED" "98.28"          
      "  1/8/1998" "NA"             ""                                              "68"              
      " 1/15/1998" "NA"             ""                                              "92.62000000000001"
      " 1/22/1998" "NA"             ""                                              "99.59"          
      " 1/29/1998" "NA"             ""                                              "108.7"          
      "  2/5/1998" "NA"             ""                                              "160.63"          
      " 2/12/1998" "NA"             ""                                              "137.17"          
      " 2/19/1998" "NA"             ""                                              "151.36"          
      " 2/26/1998" "NA"             ""                                              "167.3"          
      "  3/5/1998" "NA"             ""                                              "138.08"          
      " 3/12/1998" "NA"             ""                                              "148.72"          
      " 3/19/1998" "NA"             ""                                              "150.83"          
      " 3/26/1998" "NA"             ""                                              "165.55"          
      "  4/2/1998" "NA"             ""                                              "149.99"          
      "  4/9/1998" "NA"             ""                                              "149.21"          
      " 4/16/1998" "NA"             ""                                              "140.52"          
      " 4/23/1998" "NA"             ""                                              "140.23"          
      " 4/30/1998" "NA"             ""                                              "132.26"          
      "  5/7/1998" "NA"             ""                                              "107.11"          
      " 5/14/1998" "NA"             ""                                              "98.48"          
      " 5/21/1998" "NA"             ""                                              "97.55"          
      " 5/28/1998" "NA"             ""                                              "83.62000000000001"
      "  6/4/1998" "NA"             ""                                              "72.90000000000001"
      " 6/11/1998" "NA"             ""                                              "58.36"          
      " 6/18/1998" "NA"             ""                                              "54.35"          
      " 6/25/1998" "NA"             ""                                              "50.09"          
      "  7/2/1998" "NA"             ""                                              "50.76"          
      "  7/9/1998" "NA"             ""                                              "48.21"          
      " 7/16/1998" "NA"             ""                                              "51.52"          
      " 7/23/1998" "NA"             ""                                              "46.6"            
      " 7/30/1998" "NA"             ""                                              "44.99"          
      "  8/6/1998" "NA"             ""                                              "41.26"          
      " 8/13/1998" "NA"             ""                                              "29.99"          
      " 8/20/1998" "NA"             ""                                              "36.61"          
      " 8/27/1998" "NA"             ""                                              "33.05"          
      "  9/3/1998" "NA"             ""                                              "30.54"          
      " 9/10/1998" "NA"             ""                                              "39.4"            
      " 9/17/1998" "NA"             ""                                              "49.13"          
      " 9/24/1998" "NA"             ""                                              "51.41"          
      " 10/1/1998" "NA"             ""                                              "49"              
      " 10/8/1998" "NA"             ""                                              "49.44"          
      "10/15/1998" "NA"             ""                                              "56.06"          
      "10/22/1998" "NA"             ""                                              "62.97"          
      "10/29/1998" "NA"             ""                                              "70.67"          
      " 11/5/1998" "NA"             ""                                              "81.38"          
      "11/12/1998" "NA"             ""                                              "80.99"          
      "11/19/1998" "NA"             ""                                              "83.58"          
      "11/26/1998" "NA"             ""                                              "94.94"          
      " 12/3/1998" "NA"             ""                                              "88.48"          
      "12/10/1998" "NA"             ""                                              "87.81999999999999"
      "12/17/1998" "NA"             ""                                              "67.52"          
      "12/24/1998" "NA"             ""                                              "63.85"          
      "12/31/1998" "NA"             ""                                              "64.84"          
      "  1/7/1999" "NA"             ""                                              "79.66"          
      " 1/14/1999" "NA"             ""                                              "83.63"          
      " 1/21/1999" "NA"             ""                                              "80.41"          
      " 1/28/1999" "NA"             ""                                              "73.59"          
      "  2/4/1999" "NA"             ""                                              "75.19"          
      " 2/11/1999" "NA"             ""                                              "70.38"          
      " 2/18/1999" "NA"             ""                                              "66.2"            
      " 2/25/1999" "NA"             ""                                              "64.52"          
      "  3/4/1999" "NA"             ""                                              "64.87000000000001"
      " 3/11/1999" "NA"             ""                                              "66.64"          
      " 3/18/1999" "NA"             ""                                              "63.43"          
      " 3/25/1999" "NA"             ""                                              "64.83"          
      "  4/1/1999" "NA"             ""                                              "64.06"          
      "  4/8/1999" "NA"             ""                                              "76.95999999999999"
      " 4/15/1999" "NA"             ""                                              "77.89"          
      " 4/22/1999" "NA"             ""                                              "86.65000000000001"
      " 4/29/1999" "NA"             ""                                              "79.93000000000001"
      "  5/6/1999" "NA"             ""                                              "86.34"          
      " 5/13/1999" "NA"             ""                                              "83.87000000000001"
      " 5/20/1999" "NA"             ""                                              "84.58"          
      " 5/27/1999" "NA"             ""                                              "89.12000000000001"
      "  6/3/1999" "NA"             ""                                              "85.77"          
      " 6/10/1999" "NA"             ""                                              "94.29000000000001"
      " 6/17/1999" "NA"             ""                                              "100.2"          
      " 6/24/1999" "NA"             ""                                              "100.88"          
      "  7/1/1999" "NA"             ""                                              "150.9"          
      "  7/8/1999" "NA"             ""                                              "131.15"          
      " 7/15/1999" "NA"             ""                                              "128.95"          
      " 7/22/1999" "NA"             ""                                              "126.75"          
      " 7/29/1999" "NA"             ""                                              "134.13"          
      "  8/5/1999" "NA"             ""                                              "133.08"          
      " 8/12/1999" "NA"             ""                                              "133.85"          
      " 8/19/1999" "NA"             ""                                              "140.18"          
      " 8/26/1999" "NA"             ""                                              "143.39"          
      "  9/2/1999" "NA"             ""                                              "135.86"          
      "  9/9/1999" "NA"             ""                                              "136.38"          
      " 9/16/1999" "NA"             ""                                              "120.53"          
      " 9/23/1999" "NA"             ""                                              "120.49"          
      " 9/30/1999" "NA"             ""                                              "127.35"          
      " 10/7/1999" "NA"             ""                                              "125.22"          
      "10/14/1999" "NA"             ""                                              "114.07"          
      "10/21/1999" "NA"             ""                                              "95.22"          
      "10/28/1999" "NA"             ""                                              "89.33"          
      " 11/4/1999" "NA"             ""                                              "106.68"          
      "11/11/1999" "NA"             ""                                              "110.59"          
      "11/18/1999" "NA"             ""                                              "107.18"          
      end
      Because I cannot attach a .dta file here. So, I choose three compaies with different kinds of data. Then, I run the code you gave, I got the error as follows,

      Code:
      rename D str_date
      
      . foreach v of varlist E-ADH {
        2.     rename `v' price_`=strtoname(`v'[1])'
        3. }
      price_ already defined
      I have check data, I guess the error starts from the column W when it is missing the Code of company and the value of it is "$$ER....". In my dataset, there are some column like this.

      I try doing:

      Code:
      . foreach v of varlist E-ADH {
        2.     drop `v' if `v'==""
        3. }
      invalid syntax
      So, could you please help?

      Thank you very much in advance.

      Regards,
      Anh
      Last edited by Celine Tran; 24 Feb 2020, 22:54.

      Comment


      • #4
        The $ sign followed by text is probably being taken as a reference to a global macro that does not exist. A crude way to trap this is


        Code:
        foreach v of varlist E-ADH {    
             capture rename `v' price_`=strtoname(`v'[1])'    
             if _rc rename `v' price_`v'  
        }
        Last edited by Nick Cox; 25 Feb 2020, 02:39.

        Comment


        • #5
          Dear Nick Cox

          The code you gave me works through. However, when I reshape the the panel data following the code at #1, there is something wrong.

          I will show you what I did

          Code:
          rename A str_date
          
          foreach v of varlist E-ADH { capture rename `v' price_`=strtoname(`v'[1])' if _rc rename `v' price_`v' }
          drop in 1 destring price_*, replace price_TISE_SI_RI__U_ contains nonnumeric characters; no replace price_GPRL_SI_RI__U_ contains nonnumeric characters; no replace price_MSML_SI_RI__U_ contains nonnumeric characters; no replace price_BCIL_SI_RI__U_ contains nonnumeric characters; no replace price_SINE_SI_RI__U_ contains nonnumeric characters; no replace price_ESGL_SI_RI__U_ contains nonnumeric characters; no replace price_ASIV_SI_RI__U_ contains nonnumeric characters; no replace price_LETL_SI_RI__U_ contains nonnumeric characters; no replace price_YAMR_SI_RI__U_ contains nonnumeric characters; no replace price_MAPI_SI_RI__U_ contains nonnumeric characters; no replace price_MSEL_SI_RI__U_ contains nonnumeric characters; no replace price_ANIH_SI_RI__U_ contains nonnumeric characters; no replace price_OXHL_SI_RI__U_ contains nonnumeric characters; no replace price_NRGL_SI_RI__U_ contains nonnumeric characters; no replace price_ contains nonnumeric characters; no replace price_MEWI_SI_RI__U_ contains nonnumeric characters; no replace price_ZBRG_SI_RI__U_ contains nonnumeric characters; no replace price_XMHL_SI_RI__U_ contains nonnumeric characters; no replace price_W contains nonnumeric characters; no replace gen date = daily(str_date, "MDY") assert missing(date) == missing(str_date) format date %td drop str_date reshape long price_, i(date) j(firm) string (note: j = AA AAE AAGH_SI_RI__U_ AAIK_SI_RI__U_ AAJ AAT AAT_AX_A16_RI__U_ A > BD ABFB_SI_RI__U_ ABRH_SI_RI__U_ ABS ABTR_SI_RI__U_ ABUN_SI_RI__U_ ACA AC > CO_SI_RI__U_ ACCR_SI_RI__U_ ACEA_SI_RI__U_ ACES_SI_RI__U_ ACMA_SI_RI__U_ > ACQ ACR ACRO_SI_RI__U_ ACTA_SI_J15_RI__U_ ACV ACX ADAM_SI_F12_RI__U_ ADH > ADSA_SI_RI__U_ ADVA_SI_RI__U_ ADVE_SI_RI__U_ ADVI_SI_E17_RI__U_ ADVN_SI_R > I__U_ AEHL_SI_RI__U_ SI_RI__U_ YNAM_SI_RI__U_ YNLG_SI_RI__U_ YNMH_SI_RI__U_ YOMA_SI_RI__U_ YOR > K_SI_RI__U_ YVEN_SI_RI__U_ ZBRG_SI_RI__U_ ZE ZHON_SI_RI__U_ ZICO_SI_RI__U > _ ZR _1421_HK_RI__U_ _1570_HK_RI__U_) price_ already defined
          So, what should I do? I think I should drop column with "$$ER...." before using the foreach command that you suggest. Could you please help?

          Thank you.

          Regrads,
          Anh

          Comment


          • #6
            It's the first observation (row) not any variable (column) that looks wrong.

            Comment


            • #7
              Dear Nick Cox

              Thank you very much for your reply. Hnoestly, I donot mean your code is incorrect. Let me show you my data which is collected from Datastrem

              Code:
              input str10 str_date str17 E str45 W str17 GI str45 GM
              "Code"       "TISE.SI(RI)~U$" ""                                              "SECD.SI(RI)~U$" ""                                            
              "  1/1/1998" "NA"             "$$ER: E100,INVALID CODE OR EXPRESSION ENTERED" "18.35"          "$$ER: E100,INVALID CODE OR EXPRESSION ENTERED"
              "  1/8/1998" "NA"             ""                                              "16.72"          ""                                            
              " 1/15/1998" "NA"             ""                                              "17.56"          ""                                            
              " 1/22/1998" "NA"             ""                                              "18.22"          ""                                            
              " 1/29/1998" "NA"             ""                                              "17.26"          ""                                            
              "  2/5/1998" "NA"             ""                                              "22.08"          ""                                            
              " 2/12/1998" "NA"             ""                                              "22.89"          ""                                            
              " 2/19/1998" "NA"             ""                                              "23.26"          ""                                            
              " 2/26/1998" "NA"             ""                                              "26.82"          ""                                            
              "  3/5/1998" "NA"             ""                                              "23.62"          ""                                            
              " 3/12/1998" "NA"             ""                                              "21.2"           ""                                            
              " 3/19/1998" "NA"             ""                                              "24.24"          ""                                            
              " 3/26/1998" "NA"             ""                                              "25.84"          ""                                            
              "  4/2/1998" "NA"             ""                                              "25.4"           ""                                            
              "  4/9/1998" "NA"             ""                                              "25.91"          ""                                            
              " 4/16/1998" "NA"             ""                                              "25.72"          ""                                            
              " 4/23/1998" "NA"             ""                                              "25.95"          ""                                            
              " 4/30/1998" "NA"             ""                                              "26.04"          ""                                            
              "  5/7/1998" "NA"             ""                                              "22.55"          ""                                            
              end

              I choose randomly some colums from may data. As you can see, there are two similar columns (W and GM). Actually, in my data, a lot of columns like this. I though I should delete these columns before using the code that you suggest. I did that already, however the following code to reshape panel does not work through.

              Code:
              rename D str_date
              
              . foreach v of varlist E-ADH {   
                2.      capture rename `v' price_`=strtoname(`v'[1])'   
                3.      if _rc rename `v' price_1`v' 
                4. }
              
              drop price_1*  
              drop in 1
              destring price_*, replace
              
              reshape long price_, i(date) j(firm) string
              
              (note: j = AAGH_SI_RI__U_ AAIK_SI_RI__U_ AAT_AX_A16_RI__U_ ABFB_SI_RI__U_ A
              variable price_AAT_AX_A16_RI__U_ type mismatch with other price_ variables

              So, do you have any suggestion for me?

              Thank you.
              Last edited by Celine Tran; 26 Feb 2020, 20:04.

              Comment


              • #8
                So, you will have got a message that destring could not convert
                Code:
                  
                 price_AAT_AX_A16_RI__U_
                and the implication is that you need to look inside to find out why.
                Code:
                tab price_AAT_AX_A16_RI__U_ if missing(real(price_AAT_AX_A16_RI__U_))
                There are likely to be non-numeric characters that you specify to destring , ignore(). See also (e.g.) https://www.statalist.org/forums/for...iable-problems

                Comment


                • #9
                  Dear Nick,

                  Thank you very much for your suggestion. I figured out the issue and reshaped my data already.

                  Regards,
                  Celine

                  Comment

                  Working...
                  X