Hi,
I am new to Stata and am trying to create annualized volatilities from daily stock returns. Through CRSP I obtained stock price and daily return data on 106 companies for the period of January 2000 till dec 2014. Using this return data I want to create an annualized volatility of each of the 106 companies in my sample (I will later use this annualized volatility as input for the black & Scholes option pricing formula).
The CRSP data contains the following variables: CompanyID, Date, logreturn.
Up until now I have defined the data as panel data by using companyID as the panelvar and Date as the timevar.
'xtset CompanyID Date'
One of the problems I am facing here is that Stata gives the error message: 'repeated time values within panel'. Which somewhat makes sense as the dates are repeated for each company (however should this not give an error message?) The data is first sorted by companyID and then by date. Meaning that e.g. the first ~3750 observations are the daily returns for the years 2000-2014 for company 1 and the following ~3750 observation are the daily returns for company 2. Eventually resulting in about 280000 rows.
My goal is to determine an annualized volatility for each company individually by taking the sd of all the daily return observations in a specific year (so for 2000, 2001....etc) and multiplying it by the sqrt of the number of observations in that specific year. The problem I am facing here is that the number of observations differ per year (from about 245 till 252) and that the first day and final day of each year do not always fall on the same day. i.e. sometimes the first day of observations falls on the 2nd of January and sometimes on the 3rd of January. The last day of observations of each years differs between the 29th,30th or 31st of December.
Does anyone know how to use this data in order to obtain the annualized volatility for each company in each specific year with a simple code? The data in total includes more than 280000 return observations so manually calculating will require a lot of work.
To clarify, Below is a graphic example display of my data
CompanyID | Date | Log return |
10138 | 20000103 | -.0078 |
10138 | 20000104 | -.009 |
10138 | 20000105 | -.045 |
10138 | 20000106 | .0249 |
10138 | 20000107 | .0015 |
10138 | 20000110 | .006 |
............ | ........... | .............. |
........... | ........... | .............. |
10138 | 20001227 | 3 |
10138 | 20001228 | 3 |
10138 | 20001229 | 3 |
................ | ......... | .............. |
10138 | 20010102 | .025 |
10138 | 20010103 | .004 |
10138 | 20010104 | -.003 |
10138 | 20010105 | -.0045 |
10138 | 20010108 | .01 |
.......... | ........... | ............ |
Comment