
  • General Questions on Panel Data Formatting on Stata

    I am pretty new to Stata and will be using it for my master's thesis in finance. I am working with panel data and have a few general questions to get myself started:

    - My data are kept in a single Excel file that holds all the variables. Is it wise to split each variable into its own Excel file, then import and merge them, or is it best to do everything in Stata and append the data?

    - In Excel, my data are laid out horizontally for each variable; the example below shows one variable. I believe that to run the analysis in Stata I need the data in long format, with the years running down the rows next to the company name and the corresponding value beside them for each year, and that I need to do this for every variable. How do I do this reshaping in Stata rather than in Excel?

    [Image: Screen Shot 2564-12-26 at 00.41.06.png]


    - I have seen a few tutorials on how Stata treats NA values. Any suggestions on this issue, as I have a handful of NA values?
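
    On the NA question, here is a minimal sketch. It assumes the NA cells arrive from Excel as the text "NA", which makes Stata import the whole column as a string. The toy data and the variable names (ticker, ni2009, ni2010) are made up for illustration.

    ```stata
    * Toy data standing in for an imported sheet where "NA" forced string columns
    clear
    input str6 ticker str8 ni2009 str8 ni2010
    "TH:AA" "NA"    "1200"
    "TH:AJ" "28805" "NA"
    end

    * Blank out the "NA" text, then convert the columns to numeric;
    * empty strings become Stata's numeric missing value (.)
    foreach v of varlist ni2009 ni2010 {
        replace `v' = "" if `v' == "NA"
    }
    destring ni2009 ni2010, replace

    * Numeric missing sorts above every number, so guard comparisons explicitly
    count if ni2009 > 1000 & !missing(ni2009)
    ```

    Estimation commands such as regress drop observations with missing values on their own, so the main risk is in hand-written conditions like the one in the last line.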

  • #2
    Farhan Hasnat Hey Farhan. Knowing nothing about your topic, I always do my data collection in Stata. So let's say I wanna study American states. I'll copy the fips information from the internet (usually a shapefile, in fact) and use the spatial ID as well as the fips codes as the unique identifiers. From there, say I wanted racial demographics data of some kind, I'll find wherever that dataset exists, copy it into my raw data directory, work with it, and merge it into my master data, the original dataset. You've gathered yours in excel, which isn't a crime or wrong, just not how I'd do it. I'd be more than happy to chat about what I mean if you wanna message me.

    Either way, what you're interested in here is the reshape command. In fact, I strongly suggest you download the user-written gtools package, whose greshape command is a faster version of reshape.

    I can help with the reshape part. Just use dataex to post a sample of this dataset here, and we'll get the data into long format.
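
    To make the reshape step concrete, here is a small sketch on made-up data. The stub name ni and the tickers are hypothetical, and the greshape line is left as a comment because its options are as I read the gtools help file; check help greshape before relying on it.

    ```stata
    * Toy wide data: one row per firm, one column per year (made-up values)
    clear
    input str5 ticker ni2009 ni2010 ni2011
    "AAA" 10 12 15
    "BBB" -3  .  4
    end

    * Built-in command: ni2009 ni2010 ni2011 become two variables, year and ni
    reshape long ni, i(ticker) j(year)
    list, sepby(ticker)

    * gtools alternative (assumed syntax, see -help greshape-):
    * ssc install gtools
    * greshape long ni, by(ticker) keys(year)
    ```

    reshape long needs a common stub followed by the year in each variable name, which is why the year columns have to be named something like ni2009 rather than a bare 2009.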



    • #3
      Jared Greathouse Hi Jared! Thank you for your input. To give you more insight, my study is about accrual earnings and relative earnings management. The data were downloaded from Datastream. To walk you through my work: the first column you see in the picture holds the ticker codes of the companies, which will be my unique identifier. I have four more variables that look exactly like this sheet.

      At a later stage, I will need to merge industry data for each of the companies into this dataset, as I plan to run my regressions by industry and by year.

      I have just installed gtools and will go through the documentation and tutorials on how to work with it. Thank you for offering to help with the reshape part; I am reading up on the dataex command now. I would have sent you a message, but I am not sure how to message in this forum.
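
      For the industry merge mentioned above, a rough sketch under some assumptions: the financial data are already in long format with one row per ticker and year, and the industry data have one row per ticker. All names and values here are made up, and the tempfile stands in for a saved industry dataset.

      ```stata
      * Toy industry lookup: one row per ticker (hypothetical values)
      clear
      input str5 ticker str10 industry
      "AAA" "Banking"
      "BBB" "Energy"
      end
      tempfile industry
      save `industry'

      * Toy long panel: one row per ticker-year
      clear
      input str5 ticker year ni
      "AAA" 2009 10
      "AAA" 2010 12
      "BBB" 2009 -3
      "BBB" 2010  4
      end

      * Many ticker-year rows match one industry row, hence m:1
      merge m:1 ticker using `industry', keep(master match)
      tabulate _merge      // check that every firm found an industry
      drop _merge
      ```

      The other financial items, once reshaped the same way, would be combined with merge 1:1 ticker year instead, since each ticker-year pair appears once in every file.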



      • #4
        Cross-posted at https://www.reddit.com/r/stata/comme...formatting_on/



        • #5
          Jared Greathouse Below is a sample of my dataset. This dataset is for net income, and I have four more datasets for other financial items. The "TEST1 [IF_LOC]" column, which is basically the ticker, is common to all the other datasets, and I will use it to merge them.

          Meanwhile, I have read other sources, including the manual, and taught myself some commands to work on my data. I believe I need to change my year columns, which are currently purely numeric (e.g. 2009), to something with a prefix (e.g. yr_2009) in order to use the reshape command. Any advice from your end would be great.

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str14 D long(E F G H I J K L M N O P Q R S T U V W X Y) byte Z
          "TEST1 [IF_LOC]"    1994    1995    1996      1997      1998     1999     2000     2001     2002     2003     2004     2005     2006     2007     2008     2009     2010     2011     2012     2013     2014 .
          "TH:2S"                .       .       .         .         .        .        .        .        .        .        .        .    34020    55390   112840    91864    71431    64980    52643    68031    41898 .
          "TH:AJ"            21495   27384   52941   -616760    121234   -81609   -51199   152305   352870   255651   240919   172836   102073   129639   160309   288052   984924   878600   190007   -97977  -247856 .
          "TH:AH"                .       .       .         .         .        .    22288    19470   132374   310157   763088   710820   380789   350070   241351  -108569   356573  -389745   917220   610706   366960 .
          "TH:ABICO"             .       .       .         .         .        .  -586866   359261  -160907  -209256 -1198139  1983907    39735    69635   -33008    57414    31641    85450   203881   118924   110024 .
          "TH:AA"          -334257  -95536 -912982  -7011903   4768554 -3734238   628910   199010  1195095  1495941  1518619  1909528  1972180   430287        .        .        .        .        .        .        . .
          "TH:ACC"               .       .       .    107182    263704   400757   564352   753127   748555   361896   244596  -373369  -355850    58031    38573   -75021    13084   102965    -8740    48015     8453 .
          end
          Last edited by Farhan Hasnat; 27 Dec 2021, 12:14.
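
          One possible way to do that renaming and reshaping, sketched against the -dataex- example above. It assumes the example has just been run, so D holds the ticker, E through Y hold the values, and the years sit in the first observation. The names ni, ticker and firm_id are made up for illustration.

          ```stata
          * Rename each value column to ni<year>, reading the year from row 1
          foreach v of varlist E-Y {
              local yr = `v'[1]
              rename `v' ni`yr'
          }

          drop in 1            // the header row has served its purpose
          drop Z               // Z is entirely missing in the example
          rename D ticker

          * Wide to long: one row per ticker-year, values in ni
          reshape long ni, i(ticker) j(year)

          * xtset needs a numeric panel identifier, so encode the ticker
          encode ticker, generate(firm_id)
          xtset firm_id year
          ```

          Running the same steps on each of the other four items with a different stub name gives files that share ticker and year as keys for the later merge.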
