Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • import data from excel to stata-change the structure of the data

    Hi,

    I have a data file in excel which includes more than 300 firms. Information for each firm include code (for example, LCL SA - highlighed red); Company name; and three variables Dates, PX_BID; PX_ASK. I am struggle to load these data into stata which contains only FOUR mentioned variables. The second comany is appended after the first and so on.

    Any suggestion is greatly appreciated

    Kind regards

    Anh Tuan
    Credit Agricole AIG Allianz
    Start Date 1/12/2017 Start Date 24/09/2016 Start Date 23/03/2018
    End Date 1/06/2018 End Date 24/05/2017 End Date 23/12/2018
    Allegation 2/03/2018 Allegation 24/01/2017 Allegation 23/07/2018
    LCL SA CDS EUR SR 5Y D14 Corp AIG CDS USD SR 5Y D14 Corp ALVGR CDS EUR SR 5Y D14 Corp
    #NAME? #NAME? #NAME? #NAME? #NAME? #NAME?
    Dates PX_BID PX_ASK Dates PX_BID PX_ASK Dates PX_BID PX_ASK
    6/01/2018 50.03 55.13 24/05/2017 66.282 84.23 21/12/2018 30.989 32.567
    05/31/2018 55.56 105.57 23/05/2017 65.885 71.306 20/12/2018 28.597 32.073
    05/30/2018 48.9 55.29 22/05/2017 66.104 71.125 19/12/2018 27.583 30.235
    05/29/2018 47.27 54.32 19/05/2017 66.887 71.352 18/12/2018 28.749 30.768
    05/28/2018 35.46 38.81 18/05/2017 68.411 72.176 17/12/2018 28.063 29.349
    0 17/05/2017 68.196 73.556 14/12/2018 27.824 29.378
    05/25/2018 35.23 38.57 16/05/2017 67.682 73.318 13/12/2018 27.315 29.487
    05/24/2018 33.02 36.29 15/05/2017 68.227 72.711 12/12/2018 28.117 30.637
    05/23/2018 32.26 35.58 12/05/2017 68.388 73.26 11/12/2018 29.966 29.961
    05/22/2018 30.51 33.78 11/05/2017 69.319 73.537 10/12/2018 30.534 27.334
    05/21/2018 30.45 33.67 10/05/2017 71.873 74.476 7/12/2018 30.47 28.739
    0 9/05/2017 74.162 77.306 6/12/2018 30.91 28.282
    05/18/2018 29.2 32.38 8/05/2017 75.201 79.686 5/12/2018 29.826 27.471
    05/17/2018 28.29 31.69 5/05/2017 77.081 80.653 4/12/2018 30.213 27.314
    05/16/2018 27.57 30.8 4/05/2017 78.763 82.623 3/12/2018 28.607 26.311
    05/15/2018 26.68 29.87 3/05/2017 79.315 84.322 30/11/2018 30.456 25.039
    05/14/2018 26.46 29.78 2/05/2017 81.535 84.864 29/11/2018 30.998 23.33
    0 1/05/2017 81.962 86.809 28/11/2018 30.69 22.633
    5/11/2018 26.56 30.21 28/04/2017 82.029 87.225 27/11/2018 30.784 23.634
    5/10/2018 27.76 31.22 27/04/2017 82.038 87.316 26/11/2018 28.92 24.454
    5/09/2018 28.01 31.44 26/04/2017 82.14 87.298 23/11/2018 30.605 24.455

  • #2
    You need to give more information about where exactly your problems are. Saying that you struggle is not enough.
    If you present here a truthfull representation of the structure in your Excel-file, then you need to clean up this structure. It looks like that you have a lot of meta data in the first 6 rows. You should put the variable name in the first row or specify the variable names when you import the Excel-file. In this case, the data must start already in the first row.

    Comment


    • #3
      To expand on Sven-Kristjan's advice, if this is your data, then it appears to be in a mixture of long and wide form. This kind of layout is not unusual for Excel users, but is a real mess for Stata users. Stata can accept variable names in the first row in Excel, or the data can start in the first row. I'm not sure where you getting these price data from, but many standard sources of such data will output the data in a more usable form.

      If your data is really one firm after another, then this might work:
      Code:
      clear
      import excel "D:\temporary files\Book1.xlsx", sheet("Sheet1")
      
      g all=strpos(A,"Alleg")>0
      g coname=substr(B,1,7) if all[_n-1] >0 &  all[_n-1] <. 
      forvalues i=1/50  {
          if all[_n-`i'] ==1 {
          *        continue,break
          }
          replace coname=coname[_n-`i']  if coname=="" & coname[_n-`i']~="" & all[_n-1-`i'] ==1
      }
      drop if A=="" | strpos(A,"Start")>0  | strpos(A,"End")>0  | strpos(A,"Alleg")>0 ///
      | strpos(B,"PX")>0

      Comment

      Working...
      X