Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating annualized volatility from daily stock returns


    Hi,

    I am new to Stata and am trying to create annualized volatilities from daily stock returns. Through CRSP I obtained stock price and daily return data on 106 companies for the period of January 2000 till dec 2014. Using this return data I want to create an annualized volatility of each of the 106 companies in my sample (I will later use this annualized volatility as input for the black & Scholes option pricing formula).

    The CRSP data contains the following variables: CompanyID, Date, logreturn.

    Up until now I have defined the data as panel data by using companyID as the panelvar and Date as the timevar.

    'xtset CompanyID Date'

    One of the problems I am facing here is that Stata gives the error message: 'repeated time values within panel'. Which somewhat makes sense as the dates are repeated for each company (however should this not give an error message?) The data is first sorted by companyID and then by date. Meaning that e.g. the first ~3750 observations are the daily returns for the years 2000-2014 for company 1 and the following ~3750 observation are the daily returns for company 2. Eventually resulting in about 280000 rows.

    My goal is to determine an annualized volatility for each company individually by taking the sd of all the daily return observations in a specific year (so for 2000, 2001....etc) and multiplying it by the sqrt of the number of observations in that specific year. The problem I am facing here is that the number of observations differ per year (from about 245 till 252) and that the first day and final day of each year do not always fall on the same day. i.e. sometimes the first day of observations falls on the 2nd of January and sometimes on the 3rd of January. The last day of observations of each years differs between the 29th,30th or 31st of December.

    Does anyone know how to use this data in order to obtain the annualized volatility for each company in each specific year with a simple code? The data in total includes more than 280000 return observations so manually calculating will require a lot of work.

    To clarify, Below is a graphic example display of my data
    CompanyID Date Log return
    10138 20000103 -.0078
    10138 20000104 -.009
    10138 20000105 -.045
    10138 20000106 .0249
    10138 20000107 .0015
    10138 20000110 .006
    ............ ........... ..............
    ........... ........... ..............
    10138 20001227 3
    10138 20001228 3
    10138 20001229 3
    ................ ......... ..............
    10138 20010102 .025
    10138 20010103 .004
    10138 20010104 -.003
    10138 20010105 -.0045
    10138 20010108 .01
    .......... ........... ............
    ...
    Last edited by Willem Hasenaar; 20 Oct 2015, 06:56.

  • #2
    First, is your Date variable correctly identified as a date? Look at the date material in the manual and make sure that Stata recognizes Date as a date. This makes everything much easier.

    It you xtset by company and date, you should only have one observation per day per company and xtset should work. You may have multiple observations for a given company in a given day. You have to drop some of these. Ignoring the xtset problem, you can't calculate the correct standard deviation for a stock in a given year if you have multiple observations for some days and one for others. You should look at the duplicates command which will help you identify the problem observations.

    While you'll want to xtset for other reasons, the calculation doesn't necessarily require it. Once you have the data clean, then you can generate a year variable from the date and use that:

    g year=year(date)
    bysort CompanyID year: egen stddeviation=sd(Logreturn)

    You can use a similar egen command to count the number days in each year, etc. Then you use generate to do the arithmetic.

    Looking at your example, I'd be very worried about the data themselves - you have three days with identical log returns of 3 out to 4 decimal places [i.e., a real return of 20x when the normal returns are in the order of .01x]. Something is very wrong here. It is almost impossible for a stock to produce a log return of precisely 3 (out to four decimal places) in three sequential days, let alone a 2000% return in three sequential days.

    Comment


    • #3
      Note that this question (#1) was posted twice.

      Other replies within http://www.statalist.org/forums/foru...-stock-returns

      Comment

      Working...
      X