  • Long data? Help. New to Stata

    Hi. I am new to Stata, and I am coming back to data analysis 20 years after my undergraduate degree. I am sure that this is a simple fix, but I have not found anything specific in the YouTube videos and manuals, unless I just haven't been able to find it yet!

    I have a data set which measures company performance over a 6-year period that I want to analyse as panel data. I have 53 firms measured over this period, with a total of 378 observations.

    It appears as:

    Year  Company  A Score  C Score  R Score  Combined Score  Q Score  ROA
    2013  Acme     5        7        9        21              1        5.6
    2014  Acme
    2015  Acme
    2016  Acme
    2013  Bacme
    2014  Bacme
    2015  Bacme

    How do I fix this structure so that Stata recognises the panel, and not 378 individual observations?

    I have been restudying reshaping etc., but no luck.

    Do I need to go back and restructure the information in the initial Excel spreadsheet, or is there a command that I could use in Stata?

    Thanks so much in advance!

  • #2
    How do I fix this structure so that Stata recognises the panel, and not 378 individual observations?
    If you mean you would like a data set with one observation per Company, and the scores in the various years spread out side by side, that is done with -reshape wide-. If this is not what you mean, please post back explaining in more detail, or better still, showing, what you want, and don't bother reading the rest of this response.

    Do I need to go back and restructure the information in the initial Excel spreadsheet, or is there a command that I could use in Stata?
    You should never manage your data in Excel. Unless you are just doing these analyses for fun, you need to have an audit trail of all the steps that led to your results. Spreadsheets do not provide that. Statistical programs do. The Stata command that does this, assuming I have properly understood what you are asking for, is -reshape wide-.

    Yes -reshape- can be a difficult command to learn to use. It takes a fair amount of practice, and false starts, to get the hang of it. And if you had posted your data in a more user-friendly format (i.e. using the -dataex- command) I would be tempted to just write the code that would do this for you. But I am glad that I am not tempted, because you probably shouldn't do this in the first place. You will be left saddled with a data set that will be extremely difficult to use for analysis. There are only a small number of commands in Stata that work well with wide data; Stata is specifically oriented and designed to work best with long data, i.e. with exactly the layout you have. So unless you can specifically identify for yourself one or more of those wide-oriented commands that you need to use, you should forget about this reorganization of the data and be thankful that your data came in the more useful long layout to begin with. (Most spreadsheet data seems to come in wide layout, and then to make it usable in Stata, the first thing you have to do after you import it is -reshape long-.)
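
    For readers unfamiliar with the two layouts, here is a minimal sketch of the round trip between them. It assumes hypothetical variable names (company, year, a_score, roa) and a made-up toy data set; only the first row echoes the example in the question, and the other values are invented purely for illustration.

    ```stata
    * Toy data in long layout: one row per company-year.
    * Variable names and all values except the first row are hypothetical.
    clear
    input str10 company int year byte a_score float roa
    "Acme"  2013 5 5.6
    "Acme"  2014 6 5.9
    "Bacme" 2013 4 3.2
    "Bacme" 2014 7 4.1
    end

    * Long -> wide: one row per company, with a_score2013, a_score2014,
    * roa2013 and roa2014 side by side.
    reshape wide a_score roa, i(company) j(year)
    list

    * Wide -> long: back to one row per company-year.
    reshape long a_score roa, i(company) j(year)
    list
    ```

    Here i() names the variable that identifies a unit in the wide layout and j() names the variable whose values become (or come from) the suffixes on the stub names.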

    If, after careful reflection, you really need to transform this data to wide layout, use -dataex- to post back with an example from your Stata data set, and I will show you how to do that. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.
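
    As a rough illustration of what a -dataex- call looks like, the sketch below uses the auto data set that ships with Stata rather than the poster's data; the choice of variables and the in 1/5 range are arbitrary assumptions.

    ```stata
    * Load a data set shipped with Stata (stand-in for your own data)
    sysuse auto, clear

    * Produce a copy-and-paste-ready listing of three variables
    * for the first five observations
    dataex make price mpg in 1/5
    ```

    The output in the Results window, including its opening and closing code delimiters, is what gets pasted into the forum post.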



    • #3
      Thank you for your reply and help, Clyde. My reference to Excel was just about going back to the initial spreadsheet where I had initially entered the data, and then importing it into Stata again. Obviously I was hoping that I didn't need to do that.

      My understanding is that my data is already in long format, but I am missing something now to ensure that it is treated as panel data. I will give some more thought to my questions and then post again.



      • #4
        Code:
        xtset company year
        will tell Stata that this is panel data. (Substitute the actual Stata variable names for company and year if they differ from that.) From there you can do panel analyses with all of the -xt- commands, and you can also avail yourself of time-series operators like lag, lead, etc, and for those estimation commands that support autoregressive structure, you can do that as well.
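
        One practical wrinkle: -xtset- needs a numeric panel variable, so if the company name was imported as a string (as "Acme" and "Bacme" in the example would be), it has to be converted first. The following is a minimal sketch under that assumption; the variable names company, year, roa, company_id and roa_lag are hypothetical, and the toy values are made up apart from the first row.

        ```stata
        * Toy long-layout data; names and most values are hypothetical
        clear
        input str10 company int year float roa
        "Acme"  2013 5.6
        "Acme"  2014 5.9
        "Acme"  2015 6.1
        "Bacme" 2013 3.2
        "Bacme" 2014 4.1
        "Bacme" 2015 3.8
        end

        * Confirm that company and year jointly identify each observation;
        * this stops with an error if any company-year pair is duplicated
        isid company year

        * Create a numeric, value-labelled id from the string company name
        encode company, generate(company_id)

        * Declare the panel structure and summarise it
        xtset company_id year
        xtdescribe

        * Time-series operators now work within each company:
        * L.roa is the previous year's ROA
        generate roa_lag = L.roa
        list company year roa roa_lag, sepby(company)
        ```

        The first year of each company has no previous observation, so roa_lag is missing there.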

        By the way, the -xt- panel data commands will only work with long data.
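
        To show the shape of a panel estimation on long data, here is a sketch that uses the nlswork example data set from the Stata documentation instead of the poster's data. It assumes an internet connection for -webuse-, and the model itself is only a placeholder, not a recommendation for the analysis discussed in this thread.

        ```stata
        * Example long-layout panel: one row per person (idcode) per year
        webuse nlswork, clear

        * Declare the panel structure
        xtset idcode year

        * Fixed-effects regression of log wage on age and job tenure
        xtreg ln_wage age tenure, fe
        ```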

        Since you are new to Stata I recommend that before you plunge into doing production work, you invest some time getting acquainted with it. Stata comes with excellent PDF documentation as part of your installation. You can access it from the Help menu. Read the Getting Started [GS] and User's Guide [U] volumes. It's a fairly long read, but it will introduce you to the most important commands that everyone needs to know about in order to use Stata productively. You won't remember every detail, but you will learn enough that in most situations you will have an idea which command you need, and you can then turn to the help files for the details of syntax. Since you are planning on doing panel data analysis, after that I suggest you read the introductory parts of the Longitudinal/Panel Data [XT] volume as well to get the lay of that land. The time you spend doing this will be amply repaid.
        Last edited by Clyde Schechter; 25 Mar 2022, 18:58.
