Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Compute rolling standard deviation

    Dear Statalisters,
    I am using stata 13 and have a panel data with 1 million observation. Data structure is shown below.

    To compute rolling 60 month standard deviation of returns ,I ran this code but got an error:

    tsset permno date

    quietly rolling sd_ret = r(sd), window(60) step(1) : summarize ret

    error: eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-

    ********************
    permno date cusip ret
    10008 31-Dec-85 36547310
    10008 31-Jan-86 36547310
    10008 28-Feb-86 36547310 0.11965812
    10008 31-Mar-86 36547310 0.083969466
    10008 30-Apr-86 36547310 0.028169014
    10008 30-May-86 36547310 -0.01369863
    10008 30-Jun-86 36547310 -0.048611112
    10008 31-Jul-86 36547310 -0.153284669
    10008 29-Aug-86 36547310 -0.060344826
    10008 30-Sep-86 36547310 -0.155963302
    10008 31-Oct-86 36547310 0.086956523
    10008 28-Nov-86 36547310 0.100000001
    10008 31-Dec-86 36547310 0.090909094
    ​********************

    could you tell me what I did wrong?

    Thanks,
    Rochelle

  • #2
    This is tsset as daily data?

    The first window goes from 12/31/1985 to 2/28/1986 which has 1 observation (2/28/1986) and therefore the stand deviation is missing. The second window goes from 1/1/1986 to 3/1/1986 which again only has one observations and therefore the standard deviation is missing. And so on until the window reaches 3/31/1986.

    The following code works:

    Code:
    clear
    input permno str9 date cusip ret
    10008 31-Dec-85 36547310 .
    10008 31-Jan-86 36547310 .
    10008 28-Feb-86 36547310 0.11965812
    10008 31-Mar-86 36547310 0.083969466
    10008 30-Apr-86 36547310 0.028169014
    10008 30-May-86 36547310 -0.01369863
    10008 30-Jun-86 36547310 -0.048611112
    10008 31-Jul-86 36547310 -0.153284669
    10008 29-Aug-86 36547310 -0.060344826
    10008 30-Sep-86 36547310 -0.155963302
    10008 31-Oct-86 36547310 0.086956523
    10008 28-Nov-86 36547310 0.100000001
    10008 31-Dec-86 36547310 0.090909094
    end
    
    gen d = date(date, "DMY", 1999)
    tsset permno d
    rolling sd_ret = r(sd), clear window(60) step(1) : summarize ret


    This produces the following output. Note the number of observations has increased from 13 to 307 as all missing dates are filled in.

    Code:
    . rolling sd_ret = r(sd), clear window(60) step(1) : summarize ret
    (running summarize on estimation sample)
    
    -> permno = 10008
    
    Rolling replications (307)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
    eeeeeeeeeeeeeeeeeeeeeeeeeeeeeee...................    50
    ..........e.......................................   100
    .....................e............................   150
    .ee...............................................   200
    .............e............................eee.....   250
    ..................................................   300
    .....e.

    Comment


    • #3
      Thank you Scott !!!

      I want to do rolling 60 months.

      I edited your code
      /***
      gen mon = month(date)

      tsset permno mon
      rolling sd_ret = r(sd), clear window(60) step(1) : summarize ret
      ***/

      I do not think I did this right, my guess is I need to tsset by day, not month, but should I change window to number of days, but we could have leap year, how could I specify the days.

      can I still use window (60) for 60 month?

      Thanks,
      Rochelle

      Comment


      • #4
        Here is one way using a 2 month window:

        Code:
        clear
        input permno str9 date cusip ret
        10008 31-Dec-85 36547310 .
        10008 31-Jan-86 36547310 .
        10008 28-Feb-86 36547310 0.11965812
        10008 31-Mar-86 36547310 0.083969466
        10008 30-Apr-86 36547310 0.028169014
        10008 30-May-86 36547310 -0.01369863
        10008 30-Jun-86 36547310 -0.048611112
        10008 31-Jul-86 36547310 -0.153284669
        10008 29-Aug-86 36547310 -0.060344826
        10008 30-Sep-86 36547310 -0.155963302
        10008 31-Oct-86 36547310 0.086956523
        10008 28-Nov-86 36547310 0.100000001
        10008 31-Dec-86 36547310 0.090909094
        end
        
        gen d = date(date, "DMY", 1999)
        g dm=mofd(d)
        
        tsset permno dm
        rolling sd_ret = r(sd), clear window(2) step(1) : summarize ret
        format start end %tm
        l
        Last edited by Scott Merryman; 21 Nov 2014, 03:44.

        Comment


        • #5
          Thank you Scott for being so helpful !!! My 1 million observation data set is running .

          May I ask: if I want to have a rolling of 12 standard deviation for each year (it could be 24 or other value, but to conserve space, I show example for 12 below), that is, for year 1998, I want to use the prior 12 months from y
          ear 1987, do I still use the same code , but after it is done, I need to select one standard deviation out of 12 for that year (probably the first one ) .

          g dm=mofd(date)


          tsset permno dm

          quietly rolling sd_ret = r(sd), clear window(12) step(1) : summarize ret


          below is an example
          permno date ret rollingstdev
          10008 31-Dec-85
          10008 31-Jan-86
          10008 28-Feb-86 0.11965812
          10008 31-Mar-86 0.083969466
          10008 30-Apr-86 0.028169014
          10008 30-May-86 -0.01369863
          10008 30-Jun-86 -0.048611112
          10008 31-Jul-86 -0.153284669
          10008 29-Aug-86 -0.060344826
          10008 30-Sep-86 -0.155963302
          10008 31-Oct-86 0.086956523
          10008 28-Nov-86 0.100000001 0.101436105
          10008 31-Dec-86 0.090909094 0.102045474
          10008 30-Jan-87 0 0.100723172
          10008 27-Feb-87 0.050000001 0.093554053

          given that some firms (identified by permno) may not start its return in January for each year, I should not always take January's rollingstdev for each year, is that correct? In the above example, should I use 0.101436105 (nov 28, 1996) as my 1996 rollingstdev or december value for as my 1996 rollingstdev. I wonder how I use the same rule in coding for all firms all years.

          thanks again !
          Rochelle

          Comment


          • #6
            Perfect timing. Today I completed writing a program asrol for calculating statistics over a rolling window . It can be installed from
            Code:
            net from "https://sites.google.com/site/imspeshawar"
            OR you may directly download it from this post and copy it to C:/ado/personal
            It similar to rolling and mvsumm, however, mvsumm does not work on unbalanced panel and rolling changes structure of the data in memory ( I assume), whereas asrol does not suffer from any of these issue. The help files are yet to be written, here is the basic structure with example.

            1. For Rolling Mean


            Code:
            webuse grunfeld
            asrol invest, stata(mean) win(4) gen(mean_4) 


            This command calculates mean for the variable invest using a four years rolling window and stores the results in a new variable called mean_4


            2. For Rolling standard deviation

            Code:
            asrol invest, stat(sd) win(5) gen(sd_5)

            This command calculates standard deviation for the variable invest using a five years rolling window and stores the results in a new variable called sd_5

            3. What Else in a rolling window



            asrol can find any statistics in a rolling window that can be calculated with egen function. Specifically, the following statistics can be calculated with asrol:
            • sd for standard deviation
            • mean for mean
            • skew for skewness
            • total for sum or total
            • median
            • mdev for mean absolute deviation
            • kurt for kurtosis
            Acknowledgement:
            asrol program is motivated by the mvsumm program written by Nick Cox and Kit Baum. mvsumm is more powerful than asrol, with only limitation that mvsumm does not work on data with gaps i.e. unbalanced data
            Attached Files
            Last edited by Attaullah Shah; 21 Nov 2014, 12:23.
            Regards
            --------------------------------------------------
            Attaullah Shah, PhD.
            Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
            FinTechProfessor.com
            https://asdocx.com
            Check out my asdoc program, which sends outputs to MS Word.
            For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

            Comment


            • #7
              Thank you Attaulah !

              If you could take another look at my 12:02 post today, would it still be necessary to do a rolling if I want one standard deviation per year using past 12 month data?

              Comment


              • #8
                Well, for the November SD and December SD, the choice should come from theory. If you are matching SD with stock returns or other accounting data, then lay out a clear principle of selection. Consider Fama and French (1992) method, they match accounting and stock market variables in clearly defined way. You may read their paper for taking a view.

                Regarding the second issue of one standard deviation per year, I think you should still use a rolling window of 252 observations( stock return days are 252 as stock markets are off on Saturday and Sunday) for calculation of SD. When it comes to selection of a specific value, you may refer to the above mentioned selection criteria. Since you would be using 252 rolling window, any value that you pick will be SD for whole year.
                Regards
                --------------------------------------------------
                Attaullah Shah, PhD.
                Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
                FinTechProfessor.com
                https://asdocx.com
                Check out my asdoc program, which sends outputs to MS Word.
                For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

                Comment


                • #9
                  Dear Attulah,
                  I would like to thank you first for the STATA code, it fills my exact need for my present work,
                  I have some inquiry i hope you could resolve, if possible,
                  After installing the code the asrol code in my stat v.11.2, I tried to apply it for a panel data sample (unblalanced).
                  I have quartely data, the original date file is YMD, that i transformed into YQ (Y for year and Q for Quarter) format,
                  When I set the command to calculate the rolling standard deviation (for a window of 4 quarters), I find it works for the first id observations (3 missed sd and start with the fourth), but starting from the second and afterward id's observations, i found only one missed sd. It is like STATA worked as if it was a window of 2 quarter for the 2nd, 3rd observation for eevery new id.
                  Can you please take a look to the attached file (dta file for 200 observations, with string date and transformed numeric date) and confirm me if your code works with quarterly data? Do you think the problem lies in the way date is declared ?

                  Thank you.
                  Attached Files
                  Last edited by Geneviève Gosselin; 19 Jun 2015, 11:23.

                  Comment


                  • #10
                    You are right, the problem is in the way Stata declares the panel data. Try this solution
                    Code:
                    sort ID  quarter_01
                    
                    bys ID: gen t=_n
                    
                    tsset ID t
                    
                    asrol roa, w(4) s(sd) g(sd)
                    Regards
                    --------------------------------------------------
                    Attaullah Shah, PhD.
                    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
                    FinTechProfessor.com
                    https://asdocx.com
                    Check out my asdoc program, which sends outputs to MS Word.
                    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

                    Comment


                    • #11
                      Thank you for your answer and suggest, that works. Thank you again.

                      Comment


                      • #12
                        hello to everyone. i need some help. i try Attaullah solutions but stata gives me ...."Rolling window length should be less than total number of periods"....
                        i have a panel data for 6years (2010-2015) time series by company name, year. i try for sd with asrol code but for 2010 and 2011 sd gives me no obsv.
                        please i need help i have stuck.
                        my panel data is like this. and the number of obs is 5682. please help because i study alone with a baby and no help.
                        thanks

                        Name year JONES1
                        AAK AB (publ) 2010
                        AAK AB (publ) 2011 -1.99e+08
                        AAK AB (publ) 2012 -2.27e+08
                        AAK AB (publ) 2013 -2.72e+08
                        AAK AB (publ) 2014 -2.74e+08
                        AAK AB (publ) 2015 -2.17e+08
                        AB SKF 2010
                        AB SKF 2011 -2.14e+08
                        AB SKF 2012 -2.25e+08
                        AB SKF 2013 -2.45e+08
                        AB SKF 2014 -2.13e+08
                        AB SKF 2015 -1.95e+08
                        ACS Actividades de Construccion y Servicios SA 2010
                        ACS Actividades de Construccion y Servicios SA 2011 -1.19e+08
                        ACS Actividades de Construccion y Servicios SA 2012 -1.32e+08
                        ACS Actividades de Construccion y Servicios SA 2013 -1.76e+08
                        ACS Actividades de Construccion y Servicios SA 2014 -2.01e+08
                        ACS Actividades de Construccion y Servicios SA 2015 -1.72e+08
                        ACTIVE Biotech AB 2010
                        ACTIVE Biotech AB 2011 -2.44e+08
                        ACTIVE Biotech AB 2012 -2.91e+08
                        ACTIVE Biotech AB 2013 -2.95e+08
                        ACTIVE Biotech AB 2014 -2.97e+08
                        ACTIVE Biotech AB 2015 -2.64e+08
                        ADA SA 2010
                        ADA SA 2011 -1.17e+08
                        ADA SA 2012 -1.31e+08
                        ADA SA 2013 -1.25e+08
                        ADA SA 2014 -6.08e+07
                        ADA SA 2015 -6.30e+07
                        ADL Partner SA 2010
                        ADL Partner SA 2011 -9.93e+07
                        ADL Partner SA 2012 -1.37e+08
                        ADL Partner SA 2013 -1.13e+08
                        ADL Partner SA 2014 -9.69e+07
                        ADL Partner SA 2015 -1.22e+08

                        Comment


                        • #13
                          Athina: What you are showing us? Values for 2011 on are all negative and enormous.

                          Please read FAQ Advice #12 and show us a data example we can copy and paste and exactly what code you used.

                          Meanwhile, tsegen (SSC) is, I believe, much more versatile than asrol (SSC). Search this forum for mentions.

                          Comment


                          • #14
                            hello Nick, thanks for your answer...i am using stata 13. and the values are residuals from regression (modified jones model). tsegen works perfect. the only problem is that for 2011(panel data 2010-2015) i can get values. is that possible?for the record for all 2010 values of residuals are 0. thanks again

                            Comment


                            • #15
                              If your panel length is just 6, it is not clear that you have much scope for rolling results any way. Otherwise:

                              Please read FAQ Advice #12 and show us a data example we can copy and paste and exactly what code you used.

                              Comment

                              Working...
                              X