  • Generate a variable reporting the autocorrelation between a variable and its first lag over a moving window

    Dear users of Stata,

    I am working with panel data (approx. 1.3 million observations, 18 variables) on daily stock prices for approx. 5,500 firms over a time period of one year. I have encountered the following problem:

    I want to generate a variable with one observation per day and per firm that reports the autocorrelation between the daily stock return and its first lag, over the last 20 trading days (i.e. a moving window). That is, I do not want to generate a variable reporting the autocorrelation of daily stock returns over the entire observation period.

    Unfortunately, I have not been able to find a solution with my own limited skills or in prior posts by other users. Accordingly, I would greatly appreciate any input or advice you might be able to provide.

    Thanks in advance and best,
    Martin

  • #2
    You do not provide any example data to work with, so the following code may not work with your data, but you could modify it accordingly.

    Code:
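    clear*
    webuse grunfeld, clear
    xtset company year
    
    * First lag of the variable of interest
    gen lag_var = L1.mvalue
    
    * Program that -rangerun- calls once per observation,
    * with only the observations in that observation's window in memory
    capture program drop one_correlation
    program define one_correlation
        * Compute the correlation only if the window holds at least 5 observations
        corr mvalue lag_var if _N >= 5
        gen moving_lag_corr = r(rho)
        exit
    end
    
    * Window: the 5 years preceding the current observation, within company
    rangerun one_correlation, by(company) interval(year -5 -1) verbose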
    Note: Requires the -rangerun- command, by Robert Picard, available from SSC.

    The above code assumes your moving window begins with the observation before the index observation and extends back 5 periods from there. Modify the -interval()- option to reflect your actual desires.

    Added: In the future, when asking for help with code, please show data examples, and use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



    When asking for help with code, always show example data. When showing example data, always use -dataex-.
    Last edited by Clyde Schechter; 27 Jun 2018, 15:33.


    • #3
      Dear Clyde,

      Thanks for your quick and very helpful response. The -rangerun- command fulfils all my requirements. Sorry for not showing a data example; I will do so in future posts.

      Best, Martin


      • #4
        Note that the latest version of rangestat includes a built-in function to calculate the correlation between two variables. With large datasets, this will likely be noticeably faster. The syntax would look like:

        Code:
        rangestat (corr) ret lag_ret, by(company) interval(day -20 -1)
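
        One way this might be adapted to the daily data described in #1, as a minimal sketch only: the variable names firm, date, and ret are assumptions (no example data were posted), and the window is assumed to cover the current and the 19 preceding trading days. Counting trading days within each firm keeps weekends and holidays from shortening a window that would otherwise be defined on calendar dates. The (corr) function should leave the correlation in corr_x and the number of observations used in corr_nobs; -help rangestat- has the details.

        ```stata
        * Assumed (hypothetical) variables: firm = numeric firm id,
        * date = daily date, ret = daily stock return
        * Requires -rangestat- from SSC: ssc install rangestat

        * Number the trading days within each firm: 1, 2, 3, ...
        bysort firm (date): gen long tday = _n
        xtset firm tday

        * First lag of the return, measured in trading days
        gen double lag_ret = L1.ret

        * Correlation of ret with its lag over the current and 19 previous trading days
        rangestat (corr) ret lag_ret, by(firm) interval(tday -19 0)

        * Keep the result only where the window held a full 20 observations
        gen double autocorr20 = corr_x if corr_nobs == 20
        ```

        To exclude the current day from the window, as in the line above, interval(tday -20 -1) would be the corresponding choice.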


        • #5
          Robert, thank you for reminding me of the (corr) calculation available in -rangestat-. I had forgotten about that. You are quite right that this will run faster with -rangestat- than with the code shown in my earlier response, and it also eliminates the need to write a program.
