Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculate standard deviation of daily stock returns per firm-year in panel?

    Hi Statalisters,

    I could use some help calculating the annualized standard deviation of daily stock returns (total risk) for my dataset. I am fairly new to stata and don't really know one would code this.

    I have a panel of CRSP daily stock return data from 2006 - 2017 for 3822 unique firms (permco), approx. 7 million observations.
    My time variable = date (business date). See -dataex- below.

    I have already calculated the log daily stock return (lret) using:
    Code:
    bysort permco (date): gen lret = (ln(prc)-ln(prc[_n-1]))
    What I want is to have one observation per firm-year containing the annualized sd of daily stock returns.
    So for permco 7 (APPLE COMPUTER INC) I would have the annualized sd of daily stock returns for each 2006, 2007, 2008, 2009, 2010....... 2017
    and for the next firm permco 33 (MOLSON COORS BREWING CO) <-- not in dataex. I would also have the annualized sd for 2006, 2007, 2008 etc.

    My assumption is that there are 252 trading days in a year. Of course this varies, some year have 252 tradings, some have 250.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long permco str32 comnam str8 date float(lret mktrf smb hml rf)
    7 "APPLE COMPUTER INC" "20060103"            .  1.59 -.26  .16 .02
    7 "APPLE COMPUTER INC" "20060104"  .0029388375   .52  .43 -.14 .02
    7 "APPLE COMPUTER INC" "20060105"  -.007900998  -.08  .16 -.21 .02
    7 "APPLE COMPUTER INC" "20060106"    .02548593   .93  .09 -.33 .02
    7 "APPLE COMPUTER INC" "20060109" -.0032819195   .36  .52 -.32 .02
    7 "APPLE COMPUTER INC" "20060110"    .06132821   .09  .54  .01 .02
    7 "APPLE COMPUTER INC" "20060111"    .03690636   .31 -.13 -.11 .02
    7 "APPLE COMPUTER INC" "20060112"  .0046494706  -.64  .16  .08 .02
    7 "APPLE COMPUTER INC" "20060113"    .01529215   .15  .21  .13 .02
    7 "APPLE COMPUTER INC" "20060117"  -.010332434  -.42  -.2  .16 .02
    7 "APPLE COMPUTER INC" "20060118"   -.02655657  -.47   .2   .1 .02
    7 "APPLE COMPUTER INC" "20060119"   -.04278741   .75  .94 -.14 .02
    7 "APPLE COMPUTER INC" "20060120"   -.03798717 -1.58  .54  .44 .02
    7 "APPLE COMPUTER INC" "20060123"    .02056539   .28  .19  .34 .02
    7 "APPLE COMPUTER INC" "20060124"   -.02120953   .35  .84 -.09 .02
    7 "APPLE COMPUTER INC" "20060125"   -.02449542  -.24  .19 -.04 .02
    7 "APPLE COMPUTER INC" "20060126"    -.0255251   .77   .6 -.01 .02
    7 "APPLE COMPUTER INC" "20060127"  -.004156324   .71  -.1 -.17 .02
    7 "APPLE COMPUTER INC" "20060130"    .04040543   .22 -.04  .12 .02
    7 "APPLE COMPUTER INC" "20060131"   .006777013  -.18  .56  .11 .02
    7 "APPLE COMPUTER INC" "20060201" -.0011926586   .08  .17  -.1 .02
    7 "APPLE COMPUTER INC" "20060202"   -.04501845  -.85 -.16 -.03 .02
    7 "APPLE COMPUTER INC" "20060203" -.0035151634  -.44   .2  .29 .02
    7 "APPLE COMPUTER INC" "20060206"   -.06537858   .18  .25  .58 .02
    7 "APPLE COMPUTER INC" "20060207"  .0044476786  -1.1 -.61 -.19 .02
    7 "APPLE COMPUTER INC" "20060208"   .017741088   .68 -.28 -.16 .02
    7 "APPLE COMPUTER INC" "20060209"   -.05773135  -.19 -.02  .04 .02
    7 "APPLE COMPUTER INC" "20060210"   .035691082   .04  -.6  .05 .02
    7 "APPLE COMPUTER INC" "20060213"   -.03939304  -.55 -.72  .25 .02
    7 "APPLE COMPUTER INC" "20060214"    .04435766   .95  .21 -.08 .02
    7 "APPLE COMPUTER INC" "20060215"   .023016464    .3  .27 -.47 .02
    7 "APPLE COMPUTER INC" "20060216"   .019315265   .79  .23  .05 .02
    7 "APPLE COMPUTER INC" "20060217" -.0039755665  -.06  .16   .2 .02
    7 "APPLE COMPUTER INC" "20060221"  -.017364275  -.36 -.11  .31 .02
    7 "APPLE COMPUTER INC" "20060222"   .031911507   .63 -.04  .04 .02
    7 "APPLE COMPUTER INC" "20060223"   .006011066  -.34   .2 -.24 .02
    7 "APPLE COMPUTER INC" "20060224"  -.004050015   .28  .39  .18 .02
    7 "APPLE COMPUTER INC" "20060227"  -.006598848   .34  .06  -.4 .02
    7 "APPLE COMPUTER INC" "20060228"  -.035851274  -.96 -.07  .29 .02
    7 "APPLE COMPUTER INC" "20060301"   .008866991   .91  .56 -.08 .02
    7 "APPLE COMPUTER INC" "20060302"   .007353535  -.04  .03  .02 .02
    7 "APPLE COMPUTER INC" "20060303"   -.02752667   -.1 -.04 -.05 .02
    7 "APPLE COMPUTER INC" "20060306"   -.03363677  -.82 -.26 -.14 .02
    7 "APPLE COMPUTER INC" "20060307"   .012595875  -.49 -1.1  .19 .02
    7 "APPLE COMPUTER INC" "20060308"   -.00985071   .03  -.2 -.05 .02
    7 "APPLE COMPUTER INC" "20060309"   -.02670123  -.48  .04  .11 .02
    7 "APPLE COMPUTER INC" "20060310"    -.0116427   .68  .26 -.01 .02
    7 "APPLE COMPUTER INC" "20060313"    .03864843    .3  .08  .18 .02
    7 "APPLE COMPUTER INC" "20060314"   .024662895   .99 -.04   .2 .02
    7 "APPLE COMPUTER INC" "20060315"  -.016323783   .51  .35  .08 .02
    end

    I would really appreciate your help with this!

    Kind regards,

    Shaquille Wijngaarde


  • #2
    I have never heard of an "annualized" standard deviation and really don't know what that might mean. On the chance that you simply want the standard deviation of the return in each year for each firm, you can do that with:

    Code:
    gen int year = real(substr(date, 1, 4))
    gen ret = exp(lret)
    by permco year, sort: egen wanted = sd(ret)
    A couple of side issues:

    1. Your date variable is in about the most difficult form you can get, and I suspect that you will sooner or later stumble over that. So at some point you will probably need to create a real Stata internal format date variable from it:
    Code:
    gen internal_date = daily(date, "YMD")
    format internal_date %td
    2. Calculating the log return as the difference in log price is inefficient. Taking logarithms is slow. Better is log of the quotient, which is mathematically the same but faster to calculate.

    3. It is unclear why you are even going through logarithms to get the return. As you see above, to get the return, it is necessary to re-exponentiate the logarithm. So it would be much more straightforward and efficient to just calculate the return directly as the ratio of the price to the previous price and skip the dance with logarithms..

    Comment


    • #3
      Hello Clyde,

      Thank you for you reply. I have come across many posts with your replies and they have been very helpful so I appreciate your help!
      By stating 'annualized standard deviation of daily stock returns (total risk)', I am simply repeating the wording of one of the main papers I have consulted (Serfling, 2014).
      I guess I expected this to be the common way that it is described.

      I do have a easier date variable, I created business calendar dates (bdate) that displays as e.g. 03jan2006 but for some reason -dataex- is outputting bdate in 'stata language' so I figured I'd use something more clear.

      As for log(return) I am simply doing it the way I was taught, but I get your point and will do it, as you recommend.

      Perhaps I can rephrase my question:
      I want to use stock return volatility (total risk) as a measure of overall firm risk, it is one of the dependent variables in my research.

      This is the check list that I am trying to follow:
      1. collect daily closing price per firm from 2006-2017 x
      2. calculate daily return x
      3. SD of daily stock returns
      4. calculate variance, i.e. square of SD
      5. number of tranding days assumption = 252
      6. annualized variance, i.e. daily variance * 252
      7. SQRT of annualized variance

      Does this make more sense? Apologies if I am writing these things too simple or using incorrect academic language.

      I actually came across, this post that you replied on:
      https://www.statalist.org/forums/for...year-end-dates

      This seems to be my problem exactly, but when I tried Asad's code and it gave me some weird results. I also don't really understand what this code is trying to do?
      Could you please explain to me how this works and/or help to get this code to work for my data?

      I thank you in advance for your help and patience.

      Comment


      • #4
        Thank you for the clarification.

        The post you linked to asks a different question, so the answer is different and its results will not look sensible to you. That post asked about a running standard deviation.

        Your question is simpler. And actually, you can do better than just assuming the number of trading days is 252. You can base it on the actual number of trading days by counting them up. Once you have your year and ret variables:

        Code:
        by permco year, sort: egen std_dev= sd(ret)
        by permco year, sort: egen trading_days = count(ret)
        gen annualized_sd = sqrt(trading_days * std_dev)
        The term "annualized_sd" may indeed be quite standard in your field. But this is an international, multidisciplinary forum. So when posing a question, you should use only use terminology that every generally well-educated person would understand, plus ordinary statistical terminology. Never assume that others will understand terms just because they are commonplace in your discipline.



        Comment

        Working...
        X