Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Passing from monthly to yearly returns

    firm_id date ret
    10001 11/30/2015 -0.00907
    10001 12/31/2015 -0.13215
    10001 01/29/2015 0.116779
    10001 02/29/2015 -0.05529
    10001 03/31/2015 -0.00636
    10001 04/29/2015 -0.0557
    10001 05/31/2015 -0.02192
    10001 06/30/2015 -0.02101
    10001 07/29/2015 0.026466
    10001 08/31/2015 0.03662
    10001 09/30/2015 0.04212
    10001 10/31/2016 0.619948
    10001 11/30/2016 0.012146
    10001 12/30/2016 0.01
    10002 01/31/3016 0.05
    10002 02/28/2016 0.139785
    10002 03/31/2016 0.070755
    10002 04/30/2016 0.052863
    10002 05/30/2016 -0.02092
    10002 06/30/2016 -0.13248
    10002 07/31/2016 0.049261
    10002 08/29/2016 0
    10002 09/30/2016 0.070423
    10002 10/31/2016 -0.0614
    10002 11/28/2016 0
    10002 12/31/2016 -0.07477
    10002 01/30/2017 0.09596
    10002 02/27/2017 -0.01382
    How do I go from mothly to annualy returns in stat ? Thank you

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int firm_id float(date ret)
    10001  20422 -.00907
    10001  20453 -.13215
    10001  20117 .116779
    10001  20147 -.05529
    10001  20178 -.00636
    10001  20207  -.0557
    10001  20239 -.02192
    10001  20269 -.02101
    10001  20298 .026466
    10001  20331  .03662
    10001  20361  .04212
    10001  20758 .619948
    10001  20788 .012146
    10001  20818     .01
    10002 385726     .05
    10002  20512 .139785
    10002  20544 .070755
    10002  20574 .052863
    10002  20604 -.02092
    10002  20635 -.13248
    10002  20666 .049261
    10002  20695       0
    10002  20727 .070423
    10002  20758  -.0614
    10002  20786       0
    10002  20819 -.07477
    10002  20849  .09596
    10002  20877 -.01382
    end
    format %td date
    
    gen int mdate = mofd(date)
    format mdate %tm
    gen int year = year(date)
    egen panel = group(firm_id year)
    xtset panel mdate
    
    by panel (mdate), sort: gen ann_ret = 1 + ret if _n == 1
    by panel: replace ann_ret = L1.ann_ret*(1+ret) if _n > 1
    collapse (last) ann_ret, by(firm_id year)
    replace ann_ret = ann_ret - 1
    does the job.

    You cannot compute the annual return for firm_id 10001 in 2015 because there is no data for October 2015. And the return for 10001 in 2016 is outlandishly high because your first observation for that firm in that year is dated 31 January 3016. Once you correct these data errors, it's probably a good idea to really scrutinize the way this data set was created and fix it. When there are two gross errors in such a small sample, the data set as a whole seems like it is really of poor quality, and you should not rest easy just fixing the errors that are easy to find. There is little reason to trust the accuracy of any of it under these conditions.

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.


    Last edited by Clyde Schechter; 22 Mar 2022, 17:29.

    Comment


    • #3
      Hello Sir,

      Thank you very much for your help and I noted on how to correctly make an example.
      However, I got two errors code with your formula

      . xtset panel mdate
      repeated time values within panel
      r(451);

      by panel: replace ann_ret = L1.ann_ret*(1+RET) if _n > 1
      time variable not set

      I tried with one of your formula in another forum answer and it seems to work, it was this one but I am a noob and I can't understand the the difference between this one and which you are giving me:
      gen int fy = year(date)
      gen log_one_plus_return = log(1+ret)
      egen annual_return = total(log_one_plus_return), by(permno fy)
      replace annual_return = exp(annual_return) - 1

      Thank you very much

      Comment


      • #4
        . xtset panel mdate
        repeated time values within panel
        r(451);
        That is a giant red flag!! Your data are wrong and you need to fix them. The results you got with the other code should not be trusted.

        That message tells you that there is some firm-year combination for which you have multiple observations within the same month. So either it is a duplicate observation that would be counted twice, or you have inconsistent observations and you need to reconcile which one has the correct monthly return. To find the offending observations:

        Code:
        duplicates tag firm_id year mdate, gen(flag)
        browse if flag
        If you see the observation(s) in question are exact duplicates, then -duplicates drop- will solve the immediate problem. If the observations disagree, then you will need to figure out which is correct (or whether perhaps they must be combined in some way to produce a correct observation for that firm in that month. (Or perhaps the observations are actually for two different months and one of them has the wrong date! Something like that might account for the gap in your example data that I pointed out in #2.)

        But either way, this suggests that something went wrong in the data management that led to this data set in the first place. So you should do a thorough review of that, because, having already discovered this error in the data, you should assume that other errors were made along the way as well. You should not proceed with any analysis until you have thoroughly vetted the creation of the data set.

        The alternative code you show is, except for possible rounding errors, equivalent to the code I gave in #2. Instead of multiplying 1+monthly return at each step, it first takes the logarithms, adds them, and finally exponentiates the result. It's just using addition of logs to multilply. (Of course, with the wrong data you have, it will produce incorrect answers, and the way it was written it did not notice and point out the errors, so it is, in that respect, inferior code. In a very large data set, it would also be noticeably slower because of the need to calculate logarithms. To be honest, I don't know why I wrote that code. I wonder if the original poster explicitly asked that that method be used.)
        Last edited by Clyde Schechter; 24 Mar 2022, 18:37.

        Comment


        • #5
          It works thank you chief !!

          Comment

          Working...
          X