Obtaining 5 year rolling standard deviation without using the rolling command (panel data)

Martin Pott

Join Date: May 2014

Posts: 12
#1

Obtaining 5 year rolling standard deviation without using the rolling command (panel data)

24 Jun 2014, 04:14

Dear all,

I am trying to obtain a 5 year (60 month) rolling standard deviation for the 'return'-variable in my dataset. After having tried a few things I am fairly confident that the following command will work:

Code:

rolling Var = r(Var), window(60) step(1): summarize return, d

(Thanks to Oliver Jones on http://www.stata.com/statalist/archi.../msg00917.html)

This will give the variances, after which I can take the square root in order to obtain standard deviations.

However, since I use financial data over a wide timespan and for a lot of companies (1200+), this will take my computer a little over a week if I'm correct. Therefore I would like to know if there perhaps is a less computationally heavy way to do this, and if not if this is indeed a correct way to do it (before I waste 10 days on a wrong command).

Thanks in advance.

Martin Pott
Tags: None
Janko Fromme

Join Date: Jun 2014

Posts: 9
#2

24 Jun 2014, 07:26

Dear Martin,

it isn't much, but you can save maybe 20% of your time if you don't set the detail option. Also, you can get the standard deviation directy by using r(sd).

Code:

rolling SD = r(sd), window(60) step(1): summarize return

You can save additional time by dropping variables that you don't need for the calculations. I don't know why, but it makes the calculations run faster.

Last edited by Janko Fromme; 24 Jun 2014, 07:32.
Comment
Martin Pott

Join Date: May 2014

Posts: 12
#3

24 Jun 2014, 07:32

Originally posted by Janko Fromme View Post

Dear Martin,

it isn't much, but you can save maybe 20% of your time if you don't set the detail option. Also, you can get the standard deviation directy by using r(sd).

Code:

rolling Var = r(sd), window(60) step(1): summarize return

You can save additional time by dropping variables that you don't need for the calculations. I don't know why, but it makes the calculations faster.

Dear Janko,

Thank you for the reply. If I understand correctly the command I use is correct and there is no 'real' shortcut.

If so, I will start running it again, but now directly for the standard deviation, instead of variance and drop the detail-option.

Thanks again.

Kind regards,

Martin
Comment
Janko Fromme

Join Date: Jun 2014

Posts: 9
#4

24 Jun 2014, 09:37

Dear Martin,

there might be a quicker way to do what you want to do, but since both summarize and rolling are built-in functions it seems unlikely to me. (Maybe somebody in this forum who is more knowledgeable than me has a different opinion?)

What I forgot to mention above: Make sure you suppress the output of rolling by either setting the nodots option, or putting quietly in front of the command. This should make things run a little quicker.

Code:

rolling Var = r(sd), window(60) step(1) nodots: summarize return quietly rolling Var = r(sd), window(60) step(1) : summarize return

Kind regards, Janko
Comment
Martin Pott

Join Date: May 2014

Posts: 12
#5

24 Jun 2014, 14:56

Dear Janko,

Thanks again for your response. I will do it like this and split up the data so I don't have to run everything at once.

Kind regards,

Martin
Comment
Jonathan Black

Join Date: Sep 2017

Posts: 2
#6

01 Sep 2017, 08:16

Dear Martin,

I hope you are not still running this code as it has been a few years. I had to calculate similar values and found it much, much faster to just code the variance calculation using lagged values.I needed 2 year lagged variance for firm-month observations. For example:

xtset firm year

gen mean24 = (ret+l.ret+l2.ret+l3.ret+l4.ret+l5.ret+l6.ret+l7.r et+l8.ret+l9.ret+l10.ret ///
+l11.ret+l12.ret+l13.ret+l14.ret+l15.ret+l16.ret+l 17.ret+l18.ret+l19.ret+l20.ret+l21.ret+l22.ret+l23 .ret)/24

gen var24 = (((ret-mean24)^2)+((l.ret-mean24)^2)+((l2.ret-mean24)^2)+((l3.ret-mean24)^2)+((l4.ret-mean24)^2) ///
+((l5.ret-mean24)^2)+((l6.ret-mean24)^2)+((l7.ret-mean24)^2)+((l8.ret-mean24)^2)+((l9.ret-mean24)^2)+((l10.ret-mean24)^2) ///
+((l11.ret-mean24)^2)+((l12.ret-mean24)^2)+((l13.ret-mean24)^2)+((l14.ret-mean24)^2)+((l15.ret-mean24)^2)+((l16.ret-mean24)^2) ///
+((l17.ret-mean24)^2)+((l18.ret-mean24)^2)+((l19.ret-mean24)^2)+((l20.ret-mean24)^2)+((l21.ret-mean24)^2)+((l22.ret-mean24)^2) ///
+((l23.ret-mean24)^2))/24

All the best,
Jon
Comment
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#7

01 Sep 2017, 08:27

You can also use rangestat, see https://www.statalist.org/forums/for...13#post1407813
3 likes
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35696
#8

01 Sep 2017, 08:29

Jonathan Black If you have monthly observations, you shouldn't want to to declare year as the time variable to xtset

We now have rangestat (SSC) by Robert Picard and others which allows for identifier (say firm) and monthly date (say mdate)

Code:

rangestat (variance) ret, interval(mdate -23 0) by(firm)

I think you'll find its speed competitive with your approach (include time to write out that code).
3 likes
Comment
Jonathan Black

Join Date: Sep 2017

Posts: 2
#9

01 Sep 2017, 11:04

Thanks! Apologies for the typo.
Comment
Neyla Tad

Join Date: Sep 2017

Posts: 29
#10

23 Sep 2017, 21:01

Dear all,
my data is quarterly, and I want to calculate a standard deviation of assets in a three consecutives quarters in a rolling window. How can I use the command described above?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#11

23 Sep 2017, 22:30

Assuming your date variable is a Stata internal format quarterly date variable, you just have to change the value of the -interval()- option in the -rangestat- command shown in #8 and replace the return variable by your assets variable:

Code:

rangestat (sd) assets, by(firm) interval(quarterly_date -3 -1)

gives you the rolling standard deviation of each firm's assets over a window from the preceding quarter to three quarters preceding. If you want the window to run from the present quarter through two quarters preceding, change -3 -1 to -2 0.
Comment

Bo Wang

Join Date: Feb 2016
Posts: 12

#12

25 Jan 2018, 08:49

Hi,

I would like to calculate rolling stock return volatility from today and previous 5 days. I tried the rangestat code, but the results seem not be what I intended.

Code:

rangestat (sd) ret, by(permno) interval(date -5 0)

and my data and the results is the following:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double permno long date double(prc ret ret_sd)
10016 11324  13.75                    0                   .
10016 11325   13.5   -.0181818176060915  .01285648652356426
10016 11326   13.5                    0 .010497277289233603
10016 11329  14.25    .0555555559694767  .03197811340445808
10016 11330     14 -.017543859779834747 .034768181264645565
10016 11331     14                    0  .03179614522364177
10016 11332     14                    0  .03179614522364177
10016 11333  13.25   -.0535714291036129  .03941980394923585
10016 11336     14  .056603774428367615 .044987355690974645
10016 11337     13   -.0714285746216774  .05775354396836087
10016 11338  12.75  -.01923076994717121  .05664546563746416
10016 11339     13  .019607843831181526  .05479700236817698
10016 11340  13.75  .057692307978868484  .05480579858843518
10016 11343     13  -.05454545468091965 .048493279819819574
77405 11757      .                    .                   .
77405 11758      .                    .                   .
77405 11759   19.5                    .                   .
77405 11760     21   .07692307978868485                   .
77405 11763  21.25  .011904762126505375 .045974893320268176
77405 11764   21.5    .0117647061124444  .03757883914172917
77405 11765 22.125  .029069768264889717 .030763962877399114
77405 11766  22.75  .028248587623238564 .009719548800800169
77405 11767 22.625 -.005494505632668734 .014261023504387783
77405 11770  21.25  -.06077348068356514 .042216172671245916
77405 11771     21   -.0117647061124444  .03669869337521122
77405 11772     22    .0476190485060215 .044340877313943504
77405 11773     24   .09090909361839294  .06652115652528198
77405 11774  25.25    .0520833320915699 .059766785484860176
end
format %d date

If the command can generate standard deviation from today to previous 5-day, would the first four observations have missing standard deviation? But in my sample, only the first observation of each permno has missing standard deviation. This seems not to be the standard deviation of (-5,0) window.

Is there a way to fix this and calculate the rolling standard deviation of stock return for (-5,0).

Thanks in advance!

Bo

Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536

#13

25 Jan 2018, 09:54

You are getting exactly what you requested, the standard deviation of ret for observations that fall within a window that includes the previous 5 days plus the day of the current observation (a 6-day window). With rangestat, you can always calculate a result for a specific observation using standard Stata commands. For example:

Code:

rangestat (sd) sd5=ret (count) N=ret, by(permno) interval(date -5 0)

list N ret_sd in 5
sum ret if permno == permno[5] & inrange(date, date[5]-5, date[5])

list N ret_sd in 8
sum ret if permno == permno[8] & inrange(date, date[8]-5, date[8])

and the results

Code:

.
. list N ret_sd in 5

     +---------------+
     | N      ret_sd |
     |---------------|
  5. | 4   .03476818 |
     +---------------+

. sum ret if permno == permno[5] & inrange(date, date[5]-5, date[5])

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         ret |          4    .0049575    .0347682  -.0181818   .0555556

.
. list N ret_sd in 8

     +--------------+
     | N     ret_sd |
     |--------------|
  8. | 5   .0394198 |
     +--------------+

. sum ret if permno == permno[8] & inrange(date, date[8]-5, date[8])

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         ret |          5   -.0031119    .0394198  -.0535714   .0555556

.
end of do-file

.

There are gaps in your data so the number of observations that fall within the specified window is always less than 6 in your data example. You can use the N variable to set sd5 to missing when there are fewer observations than you care for. I suspect you may want to investigate business calendars (help datetime_business_calendars) if you want to ignore weekends and holidays.

Comment

Bo Wang

Join Date: Feb 2016

Posts: 12
#14

25 Jan 2018, 10:50

Hi Robert,

Thank you so much for your reply. I have now realized the problem. I wasn't aware thatrangestatwill take weekends into consideration. What I wanted is to calculate is the 6 trading days stock return volatility (ignore weekends and holidays). To datetime business calendars is crucial for me. However, I have a problem with converting my %td format date to %tb format business calendar date. When I converting the date using the following code, the new business date generated are all missing.

Code:

generate Businessdate = bofd("simple", date)

I have checked that the business calendar "simple.stbcal" is installed on my computer. Is there another to calculate the rolling volatility of my panel data?

Thank you so much for your time.

Bo
Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536

#15

25 Jan 2018, 12:19

I do not use business calendars myself so you are kind of asking me to read the documentation for you. I'll still point you in the right direction: you need to create a business calendar that is suitable for your dataset. See help bcal. Something like:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double permno long date double(prc ret ret_sd)
10016 11324  13.75                    0                   .
10016 11325   13.5   -.0181818176060915  .01285648652356426
10016 11326   13.5                    0 .010497277289233603
10016 11329  14.25    .0555555559694767  .03197811340445808
10016 11330     14 -.017543859779834747 .034768181264645565
10016 11331     14                    0  .03179614522364177
10016 11332     14                    0  .03179614522364177
10016 11333  13.25   -.0535714291036129  .03941980394923585
10016 11336     14  .056603774428367615 .044987355690974645
10016 11337     13   -.0714285746216774  .05775354396836087
10016 11338  12.75  -.01923076994717121  .05664546563746416
10016 11339     13  .019607843831181526  .05479700236817698
10016 11340  13.75  .057692307978868484  .05480579858843518
10016 11343     13  -.05454545468091965 .048493279819819574
end
format %td date

bcal create mycal, from(date) gen(bdate) replace

rangestat (sd) sd6=ret (count) N=ret, by(permno) interval(bdate -5 0)

Announcement