
  • How to introduce lag time variables in panel data

    Dear members,

    I am struggling to model panel data with lagged variables in Stata, and I would like some help on how to add these effects to my model.
    As I understand it, lagged variables will be helpful because the effects of my independent variables are felt not in the same cross-section but in the following one(s). For instance:
    the 2005 value of an independent variable (first cross-section) affects the dependent variable in 2008 (the next cross-section); the 2008 independent variables affect only the 2011 dependent variable, and so on. My data are not yearly but triennial. Hence, my questions are:

    1- How can I create lagged variables? Do they have to be new variables?
    2- How do I do this in Stata (what is the command)?
    3- Panel data in Stata are usually set up as monthly, yearly and so on, but I don't know how to proceed with triennial data.

  • #2
    You don't need to create new lag variables. Stata has time-series operators which can be used in your modeling commands directly.

    With triennial data, let's say your panel variable is called panel and you have a year variable called year. The first step is to -xtset- your data:

    Code:
    xtset panel year, delta(3)
    That will tell Stata that your data are at three-year intervals, so that the "lagged" values refer to the values 3 years ago. Note that Stata will verify that your data really are panel data with three-year intervals, and if it finds data that do not fit the pattern, it will give you an error message and refuse to proceed. If that happens, you have to go back and fix your data (or change your understanding of what your data are, if you had it wrong).

    With that done you can run commands like
    Code:
    regression_command outcome L.independent_var...
    which will use the lagged value of the independent variable as a predictor to model the outcome according to the regression command.
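
    As a minimal sketch of the two steps above, the following do-file builds a small made-up triennial panel, declares it with -xtset- (the period length is given through the delta() option), and runs one possible regression. The variable names x and y, the simulated values, and the choice of -xtreg, fe- are assumptions for illustration only; substitute whatever regression command suits your problem.

    ```stata
    * Toy triennial panel: 3 units observed in 2005, 2008, 2011, 2014 (made-up data)
    clear
    set seed 12345
    set obs 12
    generate panel = ceil(_n/4)
    bysort panel: generate year = 2005 + 3*(_n - 1)
    generate x = rnormal()
    generate y = rnormal()

    * Declare the panel structure; delta(3) makes one period equal to three years
    xtset panel year, delta(3)

    * Optional: store the lag as a variable just to inspect it
    * (not needed for estimation, since L.x can be used directly)
    generate x_lag = L.x
    list panel year x x_lag, sepby(panel)

    * Illustrative choice of model: fixed-effects regression of y on x
    * from the previous wave (three years earlier)
    xtreg y L.x, fe
    ```

    In the listing, x_lag is missing for 2005 in every panel because there is no earlier wave to draw from, so the regression uses only the 2008, 2011 and 2014 observations.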

    I think before you proceed you would be well advised to invest some time in studying how Stata handles panel data. First read -help tsvarlist- to learn how lag (and lead, difference, seasonal difference) operators work. Then open up the [XT] volume of the PDF user-manuals that came with your Stata installation and read the chapter on Introduction to xt commands. Then also read the section (in that same volume) specifically on the -xtset- command. And finally, when you have chosen the particular modeling commands you plan to use, read the corresponding sections on those.



    • #3
      Clyde,
      Thanks for the support. I followed your advice about the readings. They were very helpful in clarifying my questions about lagged variables. However, I still have some basic questions:

      1- I know that there are four different operators: L, F, D and S. Given data with the characteristics I described, which of these operators should I use, and how do I declare them in Stata?

      2- Could you be more precise about the regression command for time series using lag operators?

      Thanks



      • #4
        The four time-series operators do different things. You specifically asked about lagged values, and L is the operator for that. F is the opposite of lag: it gives the forward (lead) value. D is difference, i.e. the difference between the current value and that of the previous period. S is seasonal difference; written without a number, it is also the difference between the current value and that of the previous period. But there is a distinction between D and S once a number is given. If we specify S2.x, that gives the current value of x minus the value 2 periods earlier, x_t - x_{t-2}. If we specify D2.x, we get a second-order difference, (x_t - x_{t-1}) - (x_{t-1} - x_{t-2}).

        All of these operators can be applied to any data that has been -xtset- or -tsset- with a time variable by simply writing the corresponding letter, optionally a number if a lag or difference of more than 1 step is wanted, and a period (.) before the variable to which you want to apply it.
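
        To make the difference between the operators concrete, here is a small sketch on a made-up single series, x = t^2 for t = 1 to 6 (the variable names and the data are assumptions chosen so the results are easy to check by hand). The -assert- lines encode what the definitions imply for this particular series.

        ```stata
        * Made-up series with a known pattern: x = t^2 for t = 1..6
        clear
        set obs 6
        generate t = _n
        generate x = t^2
        tsset t

        * One new variable per operator, only so the values can be listed
        generate lag1  = L.x
        generate lag2  = L2.x
        generate lead1 = F.x
        generate d1    = D.x
        generate s1    = S.x
        generate d2    = D2.x
        generate s2    = S2.x

        list, noobs

        * D.x and S.x coincide: both are x[t] - x[t-1]
        assert d1 == s1

        * D2.x is the second-order difference, which is 2 for x = t^2
        assert d2 == 2 if t >= 3

        * S2.x is x[t] - x[t-2], which is 4t - 4 for x = t^2
        assert s2 == 4*t - 4 if t >= 3
        ```

        The same operators work after -xtset-, where they are applied within each panel. In real work there is usually no need to generate these variables; writing L.x, D2.x and so on directly in the estimation command is enough.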

        I don't think anyone can be more precise with the regression command because you have not sufficiently described your problem for that purpose. Even if you had, if it is not a problem in my own discipline, epidemiology, I probably would not know how to prescribe which models are appropriate and which are not.

        I think that before you engage in any actual projects, you need to invest time into learning Stata from the beginning. Start with the Getting Started [GS] and User's Guide [U] volumes of the PDF manuals that are installed along with Stata. Those will give you an overview of the Stata approach to and philosophy of data management and analysis and will familiarize you with the commands that pretty much everybody who uses Stata needs to know. Then, given that you are specifically interested in panel data, read the -xt- section of the Longitudinal/Panel Data [XT] volume of the PDF documentation. That will give you a good overview of panel data management and analysis commands. When you have read those sections, you will usually be able to identify which commands are likely to be needed for your particular problems. You can then refer to the help files or manual sections of those commands to review the details of their syntax and exactly how they work.



        • #5
          Hi Clyde,

          Please, if the data collection is unequally spaced, how should this be done?

          Thanks
