  • Time variable in Panel Data analysis

    Hi everyone!

    I am using a five-quarter longitudinal study (UK Labour Force Survey data) for a panel data fixed effects model. I am finding it a bit difficult to define the time variable (without which the data cannot be declared as panel data) for the dataset, as I am not sure what exactly it is. The dataset is unbalanced and there is no time variable as such, only the week in which the respondent was surveyed. Any suggestions on how to go about defining the dataset would be very helpful.

  • #2
    Welcome to Statalist.

    Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

    Even the best descriptions of data are no substitute for an actual example of the data. In order to get a helpful response, you need to show some example data.

    Be sure to use the dataex command to do this. If you are running version 15.1 or a fully updated version 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use dataex.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

    • #3
      Samyukta:
      please do act on William's helpful advice.
      That said, even without knowing anything about your data, some comments about your query can be made:
      - it is not mandatory to -xtset- your panel data with a -timevar- as well, unless you plan to use time-series operators such as lags and leads. Hence, Stata panel routines will work even after:
      Code:
      xtset panelid
      - Stata can handle both balanced and unbalanced panel datasets with no problem;
      - be sure that your data are in -long- form; otherwise use -reshape- to convert them from -wide- to -long-.
      Kind regards,
      Carlo
      (StataNow 18.5)
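
      A minimal sketch of the first point above, assuming Stata's example dataset nlswork (loaded with webuse, which needs internet access) stands in for the poster's data; its variables idcode, ln_wage, tenure and age are not from this thread. Declaring only the panel variable should be enough for a fixed-effects fit:

      ```stata
      webuse nlswork, clear          // stand-in example panel, not the poster's survey
      xtset idcode                   // panel variable only, no time variable
      xtreg ln_wage tenure age, fe   // fixed-effects (within) estimator
      * Time-series operators such as L.tenure would need -xtset idcode year- first.
      ```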

      • #4
        A quick follow-up to Carlo's post: in this instance "panelid" will be your cross-sectional identifier; for you, this will be the survey participants' unique identifier code.

        • #5
          Mike:
          welcome to this forum.
          Kind regards,
          Carlo
          (StataNow 18.5)

          • #6
            Mechanically you can proceed as Carlo and Mike suggest, and the computer will not break.

            However, understanding the structure of your data is step 0 of any statistical / econometric analysis. And I think it is a big problem if you are analysing panel data without understanding what variable, or combination of variables, defines your time index.

            I believe you should clarify this misunderstanding of your data before you take further steps.

            • #7
              Actually, I do endorse Joro's caveat about being aware of what we're doing when we do it (this recommendation probably holds for many aspects of human life!).
              As far as panel data regression is concerned, you can familiarize yourself with its building blocks by studying any decent textbook on this topic.
              Stata users dealing with econometrics usually have on their bookshelf the valuable https://www.stata.com/bookstore/micr...metrics-stata/
              Kind regards,
              Carlo
              (StataNow 18.5)

              • #8
                I interpreted post #1 as saying that while the survey was quarterly for five periods, there doesn't exist a "wave" variable taking values 1, 2, 3, 4, 5 to indicate the wave in which the observation was taken. Instead, there are up to five observations per panel, each observation with an interview date, and the question posed was, how do I map the interview date into something suitable for use in xtset? And Carlo's answer in post #3 was to point out that the claim

                "define the time variable (without which the data cannot be declared as panel data)"
                from post #1 is not correct; all that xtset requires is the definition of a panel variable.

                This is not to distract from the important point Joro makes in post #6 about understanding the structure of the data. If you're going to be using lagged values or differences, you need to understand which waves are present and which are missing, for example.

                But it is my sense that the question is, how do I translate dates into quarters, or waves, or something else suitable for use with xtset. And the point I made in post #2 still holds: we cannot answer that question without knowing more about your data.

                Or ... is it the case that your data actually have a variable identifying the wave of the survey, but you think you require a year-and-quarter date? If that's the case, the wave is what you should be using as your "time" variable for xtset.

                You might find it helpful to review the full documentation for xtset, including the Remarks and Examples, in the Stata Longitudinal-Data/Panel-Data Reference Manual PDF included with your Stata installation and accessible through Stata's Help menu.
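
                As an illustration of the kind of mapping discussed above, here is a rough sketch on made-up data. The variable names persid, intdate_str, intdate, qdate and wave are hypothetical, and the real survey's date variable may be stored quite differently, which is why example data are still needed:

                ```stata
                clear
                input long persid str10 intdate_str
                1 "12jan2015"
                1 "14apr2015"
                1 "13jul2015"
                2 "02feb2015"
                2 "04may2015"
                end
                generate intdate = daily(intdate_str, "DMY")   // string to Stata daily date
                format intdate %td
                generate qdate = qofd(intdate)                 // calendar quarter of the interview
                format qdate %tq
                * Interview order within respondent; equals the true wave only if no wave was missed
                bysort persid (intdate): generate wave = _n
                xtset persid qdate
                ```

                If two interviews of the same respondent fell in the same calendar quarter, xtset would complain about repeated time values within panel, so a genuine wave identifier, where one exists, is the safer choice.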

                • #9
                  I think you may be thinking of the 'wave' number that you need, together with the responding person identifier, in order to xtset (tsset) the panel. Is that right, Samyukta Venkatraman? Did you solve this problem?

                  • #10
                    So your suggestion is that, if there is no 'wave' variable, one should create a 'wave' variable that provides regular time breaks between observations, right, @William Lisowski? But the underlying data are irregularly spaced when we look at the dates in #1.

                    • #11
                      Nursena:
                      to wrap up the above, a panel dataset is composed of an N (cross-sectional) and a T (time-series) dimension.
                      To access the -xt- suite in Stata you should -xtset- your dataset beforehand:
                      Code:
                      xtset panelid timevar
                      Whenever you have repeated measurements on the same variables retrieved from the same sample (more or less, as attrition can play a role in that) at (theoretically) equally spaced time intervals, the whole machinery works like a charm.
                      Conversely, if you do not know whether the data waves date back to (more or less) the same time for all the panels, every inference becomes unreliable (and creating waves ex post does not improve the scenario).
                      Kind regards,
                      Carlo
                      (StataNow 18.5)
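
                      A short sketch of the two-variable declaration, again assuming the nlswork example dataset (loaded with webuse, which needs internet access) as a stand-in; idcode, year and tenure are that dataset's variables, not the poster's:

                      ```stata
                      webuse nlswork, clear            // stand-in example data, not the poster's survey
                      xtset idcode year                // panel variable first, then time variable
                      xtdescribe                       // participation patterns across time periods
                      generate tenure_lag = L.tenure   // lag operator works once a time variable is set
                      ```

                      The xtdescribe output is one way to see which periods each panel contributes before relying on lags or differences.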

                      • #12
                        Hi Carlo,

                        I know that each wave spans a couple of weeks, as data collection takes some time, so the interviews are not equally spaced across the sample. For example, individual 1 has two interviews, at week 5 (wave 1) and week 10 (wave 2), while individual 2 has the same interviews at week 3 (wave 1) and week 7 (wave 2). So if I use the week variable, the time variable would not be equally spaced. Alternatively, if I use the wave variable, the observations are equally spaced, but then I know that there is a 1-week gap. That is a problem for duration models. Is it also a problem for panel data models?

                        Best regards,
                        Nursena
                        (Stata 17.0)

                        • #13
                          Nursena:
                          whether that is a problem or not mostly depends on the average lag relative to the theoretical time of measurement in each wave. Put differently, while a two-week difference might be negligible, four weeks apart is probably too much.
                          Kind regards,
                          Carlo
                          (StataNow 18.5)
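
                          One way to look at this in practice, sketched on the toy figures from post #12; the variable names persid, wave, week, gap, wave_mean and dev are hypothetical:

                          ```stata
                          clear
                          input long persid byte wave int week
                          1 1 5
                          1 2 10
                          2 1 3
                          2 2 7
                          end
                          * Weeks elapsed since the respondent's previous interview (missing in wave 1)
                          bysort persid (wave): generate gap = week - week[_n-1]
                          * Deviation of each interview week from the average week of its wave
                          egen wave_mean = mean(week), by(wave)
                          generate dev = week - wave_mean
                          summarize gap dev
                          ```

                          Summaries like these only describe how far interview timing drifts within a wave; whether the drift matters remains the judgment call described above.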

                          • #14
                            Many thanks Carlo!
