
  • confusion with lead and lag in time series operators

    Dear Statalisters,

    I can't see where my logic goes wrong in the following example, and I would be grateful if you could point it out:

    I want to create a lagged variable and regress on that.

    The help file for the time-series operators says: "For instance, L.gnp refers to the lagged value of variable gnp."
    So what I take from this is that if I want to run a model with a lagged dependent variable, I go with "regress variable L.variable".

    However, when I create a variable with "L.variable", it seems to contain the lead of that variable rather than the lag?

    *minimal working example
    ***************************************
    sysuse uslifeexp, clear

    keep if year < 1903
    keep year le
    tsset year

    gen lead_le = F.le // operator meaning: F. lead (x_t+1)
    gen lag_le = L.le // operator meaning: L. lag (x_t-1)

    list
    **************
    => output

    +-----------------------------------------+
    | year le lead_le lag_le |
    |------------------------------------------|
    1. | 1900 47.3 49.1 . |
    2. | 1901 49.1 51.5 47.3 |
    3. | 1902 51.5 . 49.1 |
    ***************************************

    In the output above, the variable created with L.le seems to contain the values of le in t+1 [and not in t-1 as I expected]. Conversely, the variable created with F.le seems to contain the values of le in t-1 [which is the lag I want to use].
    So what I take from this is that if I want to regress a variable on its value in t-1, I would have to use "regress variable F.variable"?

    Am I going wrong here?

    Thanks and best

    Lukas







  • #2
    I think you're just misreading your output. First, let's put it all in a code block so that it's readable and properly aligned (you should do this whenever you post code or results):

    Code:
    . sysuse uslifeexp, clear
    (U.S. life expectancy, 1900-1999)
    
    . 
    . keep if year < 1903
    (97 observations deleted)
    
    . keep year le
    
    . tsset year
            time variable:  year, 1900 to 1902
                    delta:  1 unit
    
    . 
    . gen lead_le = F.le // operator meaning: F. lead (x_t+1) 
    (1 missing value generated)
    
    . gen lag_le = L.le // operator meaning: L. lag (x_t-1)
    (1 missing value generated)
    
    . 
    . list, noobs clean
    
        year     le   lead_le   lag_le  
        1900   47.3      49.1        .  
        1901   49.1      51.5     47.3  
        1902   51.5         .     49.1
    Everything is fine. In year 1900, the value for lead_le is 49.1, which is indeed the 1901 value for le, and the lag_le value is missing, as it should be since 1900 is the first year. Similarly in 1901, the value of lead_le is 51.5, which is indeed the 1902 value for le, and lag_le's value is 47.3, which was the 1900 le value. And so on. It's all correct.
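
    If you want to convince yourself without reading a listing, the check can be sketched as below. This is only a sketch, assuming the full uslifeexp data (yearly, with no gaps between years) so that the operators can be compared against explicit subscripting; each assert stops with an error if the condition fails.

    ```stata
    sysuse uslifeexp, clear
    keep year le
    tsset year                            // also sorts the data by year

    * With yearly data and no gaps, L. and F. match explicit subscripts
    assert L.le == le[_n-1] if _n > 1     // L. is the previous year's value (t-1)
    assert F.le == le[_n+1] if _n < _N    // F. is the next year's value (t+1)
    assert missing(L.le) if _n == 1       // no lag exists for the first year
    assert missing(F.le) if _n == _N      // no lead exists for the last year

    * A lagged dependent variable can be used directly in the regression
    regress le L.le
    ```

    So "regress variable L.variable" is the form to use for a regression on the value in t-1; there is no need to generate lag_le first.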
