Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating 5-year compound annual growth rate (CAGR) with missing values

    Hi All,

    I'm trying to calculate the 5-year compound annual growth rate, which is "(endvalue/beginning value)^(1/5) - 1", of a variable for different countries and for each year. Some countries have missing values for one or several years.

    The growth rate I've calculated are all missing values, which should not be. I can't figure out where's the problem...

    The data I use:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3 CountryCode str22 Country int Year double var1
    "AFG" "Afghanistan" 2000  .
    "AFG" "Afghanistan" 2001 20
    "AFG" "Afghanistan" 2002 16
    "AFG" "Afghanistan" 2003  4
    "AFG" "Afghanistan" 2004 10
    "AFG" "Afghanistan" 2005 28
    "AFG" "Afghanistan" 2006 16
    "AFG" "Afghanistan" 2007 29
    "AFG" "Afghanistan" 2008 27
    "AFG" "Afghanistan" 2009 32
    "AFG" "Afghanistan" 2010 14
    "AFG" "Afghanistan" 2011 13
    "AFG" "Afghanistan" 2012 32
    "AFG" "Afghanistan" 2013 27
    "AFG" "Afghanistan" 2014 10
    "AFG" "Afghanistan" 2015 32
    "AFG" "Afghanistan" 2016 31
    "AFG" "Afghanistan" 2017  .
    "AGO" "Angola"      2000 15
    "AGO" "Angola"      2001  7
    "AGO" "Angola"      2002  .
    "AGO" "Angola"      2003  .
    "AGO" "Angola"      2004  .
    "AGO" "Angola"      2005  .
    "AGO" "Angola"      2006  .
    "AGO" "Angola"      2007  .
    "AGO" "Angola"      2008  .
    "AGO" "Angola"      2009  .
    "AGO" "Angola"      2010  .
    "AGO" "Angola"      2011  .
    "AGO" "Angola"      2012  .
    "AGO" "Angola"      2013  .
    "AGO" "Angola"      2014  .
    "AGO" "Angola"      2015  .
    "AGO" "Angola"      2016  .
    "AGO" "Angola"      2017  .
    "ALB" "Albania"     2000  1
    "ALB" "Albania"     2001 15
    "ALB" "Albania"     2002  .
    "ALB" "Albania"     2003  .
    "ALB" "Albania"     2004  .
    "ALB" "Albania"     2005  .
    "ALB" "Albania"     2006  .
    "ALB" "Albania"     2007  .
    "ALB" "Albania"     2008  .
    "ALB" "Albania"     2009  5
    "ALB" "Albania"     2010 15
    "ALB" "Albania"     2011 27
    "ALB" "Albania"     2012  .
    "ALB" "Albania"     2013  .
    "ALB" "Albania"     2014  .
    "ALB" "Albania"     2015  .
    "ALB" "Albania"     2016  .
    "ALB" "Albania"     2017  .
    "ARG" "Argentina"   2000  .
    "ARG" "Argentina"   2001  .
    "ARG" "Argentina"   2002  .
    "ARG" "Argentina"   2003  .
    "ARG" "Argentina"   2004  .
    "ARG" "Argentina"   2005  .
    "ARG" "Argentina"   2006 10
    "ARG" "Argentina"   2007  .
    "ARG" "Argentina"   2008  .
    "ARG" "Argentina"   2009  .
    "ARG" "Argentina"   2010  .
    "ARG" "Argentina"   2011  .
    "ARG" "Argentina"   2012 18
    "ARG" "Argentina"   2013  .
    "ARG" "Argentina"   2014  .
    "ARG" "Argentina"   2015  .
    "ARG" "Argentina"   2016  .
    "ARG" "Argentina"   2017  .
    "ARM" "Armenia"     2000 33
    "ARM" "Armenia"     2001  7
    "ARM" "Armenia"     2002 21
    "ARM" "Armenia"     2003 23
    "ARM" "Armenia"     2004  8
    "ARM" "Armenia"     2005  3
    "ARM" "Armenia"     2006  9
    "ARM" "Armenia"     2007  5
    "ARM" "Armenia"     2008  4
    "ARM" "Armenia"     2009 23
    "ARM" "Armenia"     2010 16
    "ARM" "Armenia"     2011 28
    "ARM" "Armenia"     2012 21
    "ARM" "Armenia"     2013 26
    "ARM" "Armenia"     2014 10
    "ARM" "Armenia"     2015 20
    "ARM" "Armenia"     2016 10
    "ARM" "Armenia"     2017 34
    "AUS" "Australia"   2000 34
    "AUS" "Australia"   2001 17
    "AUS" "Australia"   2002 16
    "AUS" "Australia"   2003 16
    "AUS" "Australia"   2004 12
    "AUS" "Australia"   2005  2
    "AUS" "Australia"   2006 14
    "AUS" "Australia"   2007  6
    "AUS" "Australia"   2008 31
    "AUS" "Australia"   2009  2
    end
    The code I use:
    Code:
    encode CountryCode, gen(ccode)
    xtset ccode Year
    order ccode, first
    bysort ccode Year: gen var1_5CAGR = ((var1/L5.var1)^(1/5)) - 1
    The results I got:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long ccode str3 CountryCode str22 Country int Year double var1 float var1_5CAGR
    1 "AFG" "Afghanistan" 2000  . .
    1 "AFG" "Afghanistan" 2001 20 .
    1 "AFG" "Afghanistan" 2002 16 .
    1 "AFG" "Afghanistan" 2003  4 .
    1 "AFG" "Afghanistan" 2004 10 .
    1 "AFG" "Afghanistan" 2005 28 .
    1 "AFG" "Afghanistan" 2006 16 .
    1 "AFG" "Afghanistan" 2007 29 .
    1 "AFG" "Afghanistan" 2008 27 .
    1 "AFG" "Afghanistan" 2009 32 .
    1 "AFG" "Afghanistan" 2010 14 .
    1 "AFG" "Afghanistan" 2011 13 .
    1 "AFG" "Afghanistan" 2012 32 .
    1 "AFG" "Afghanistan" 2013 27 .
    1 "AFG" "Afghanistan" 2014 10 .
    1 "AFG" "Afghanistan" 2015 32 .
    1 "AFG" "Afghanistan" 2016 31 .
    1 "AFG" "Afghanistan" 2017  . .
    2 "AGO" "Angola"      2000 15 .
    2 "AGO" "Angola"      2001  7 .
    2 "AGO" "Angola"      2002  . .
    2 "AGO" "Angola"      2003  . .
    2 "AGO" "Angola"      2004  . .
    2 "AGO" "Angola"      2005  . .
    2 "AGO" "Angola"      2006  . .
    2 "AGO" "Angola"      2007  . .
    2 "AGO" "Angola"      2008  . .
    2 "AGO" "Angola"      2009  . .
    2 "AGO" "Angola"      2010  . .
    2 "AGO" "Angola"      2011  . .
    2 "AGO" "Angola"      2012  . .
    2 "AGO" "Angola"      2013  . .
    2 "AGO" "Angola"      2014  . .
    2 "AGO" "Angola"      2015  . .
    2 "AGO" "Angola"      2016  . .
    2 "AGO" "Angola"      2017  . .
    3 "ALB" "Albania"     2000  1 .
    3 "ALB" "Albania"     2001 15 .
    3 "ALB" "Albania"     2002  . .
    3 "ALB" "Albania"     2003  . .
    3 "ALB" "Albania"     2004  . .
    3 "ALB" "Albania"     2005  . .
    3 "ALB" "Albania"     2006  . .
    3 "ALB" "Albania"     2007  . .
    3 "ALB" "Albania"     2008  . .
    3 "ALB" "Albania"     2009  5 .
    3 "ALB" "Albania"     2010 15 .
    3 "ALB" "Albania"     2011 27 .
    3 "ALB" "Albania"     2012  . .
    3 "ALB" "Albania"     2013  . .
    3 "ALB" "Albania"     2014  . .
    3 "ALB" "Albania"     2015  . .
    3 "ALB" "Albania"     2016  . .
    3 "ALB" "Albania"     2017  . .
    4 "ARG" "Argentina"   2000  . .
    4 "ARG" "Argentina"   2001  . .
    4 "ARG" "Argentina"   2002  . .
    4 "ARG" "Argentina"   2003  . .
    4 "ARG" "Argentina"   2004  . .
    4 "ARG" "Argentina"   2005  . .
    4 "ARG" "Argentina"   2006 10 .
    4 "ARG" "Argentina"   2007  . .
    4 "ARG" "Argentina"   2008  . .
    4 "ARG" "Argentina"   2009  . .
    4 "ARG" "Argentina"   2010  . .
    4 "ARG" "Argentina"   2011  . .
    4 "ARG" "Argentina"   2012 18 .
    4 "ARG" "Argentina"   2013  . .
    4 "ARG" "Argentina"   2014  . .
    4 "ARG" "Argentina"   2015  . .
    4 "ARG" "Argentina"   2016  . .
    4 "ARG" "Argentina"   2017  . .
    5 "ARM" "Armenia"     2000 33 .
    5 "ARM" "Armenia"     2001  7 .
    5 "ARM" "Armenia"     2002 21 .
    5 "ARM" "Armenia"     2003 23 .
    5 "ARM" "Armenia"     2004  8 .
    5 "ARM" "Armenia"     2005  3 .
    5 "ARM" "Armenia"     2006  9 .
    5 "ARM" "Armenia"     2007  5 .
    5 "ARM" "Armenia"     2008  4 .
    5 "ARM" "Armenia"     2009 23 .
    5 "ARM" "Armenia"     2010 16 .
    5 "ARM" "Armenia"     2011 28 .
    5 "ARM" "Armenia"     2012 21 .
    5 "ARM" "Armenia"     2013 26 .
    5 "ARM" "Armenia"     2014 10 .
    5 "ARM" "Armenia"     2015 20 .
    5 "ARM" "Armenia"     2016 10 .
    5 "ARM" "Armenia"     2017 34 .
    6 "AUS" "Australia"   2000 34 .
    6 "AUS" "Australia"   2001 17 .
    6 "AUS" "Australia"   2002 16 .
    6 "AUS" "Australia"   2003 16 .
    6 "AUS" "Australia"   2004 12 .
    6 "AUS" "Australia"   2005  2 .
    6 "AUS" "Australia"   2006 14 .
    6 "AUS" "Australia"   2007  6 .
    6 "AUS" "Australia"   2008 31 .
    6 "AUS" "Australia"   2009  2 .
    end
    label values ccode ccode
    label def ccode 1 "AFG", modify
    label def ccode 2 "AGO", modify
    label def ccode 3 "ALB", modify
    label def ccode 4 "ARG", modify
    label def ccode 5 "ARM", modify
    label def ccode 6 "AUS", modify
    Any help is appreciated!

    Many thanks,
    Craig


  • #2
    Well, with all those missing values for var1, you are going to have mostly missing values for results anyway. But not completely missing results. The error in your code is that you forgot to put Year in parentheses in your -by- prefix:
    Code:
    bysort ccode (Year): gen var1_5CAGR = ((var1/L5.var1)^(1/5)) - 1
    does it.

    Comment


    • #3
      Thank you, Clyde! It works perfectly.

      Comment


      • #4
        In case data cannot be declared as panel, here is a solution based on asrol(SSC)
        Code:
        ssc install asrol
        
        bys ccode (Year): gen change = (var1 / var1[_n-1]) - 1
        
        bys ccode : asrol change , stat(gmean) wind(Year 5) add(1) min(5)
        
        list in 7/17
        
             +-----------------------------------------------------------------------------------+
             | ccode   Countr~e       Country   Year   var1   var1_5C~R      change   gmean5_c~e |
             |-----------------------------------------------------------------------------------|
          7. |   AFG        AFG   Afghanistan   2006     16   -.0436475   -.4285714   -.04364751 |
          8. |   AFG        AFG   Afghanistan   2007     29    .1263039       .8125    .12630393 |
          9. |   AFG        AFG   Afghanistan   2008     27     .465078   -.0689655    .46507802 |
         10. |   AFG        AFG   Afghanistan   2009     32    .2619147    .1851852    .26191468 |
         11. |   AFG        AFG   Afghanistan   2010     14   -.1294494      -.5625   -.12944944 |
             |-----------------------------------------------------------------------------------|
         12. |   AFG        AFG   Afghanistan   2011     13   -.0406774   -.0714286   -.04067741 |
         13. |   AFG        AFG   Afghanistan   2012     32    .0198831    1.461538     .0198831 |
         14. |   AFG        AFG   Afghanistan   2013     27           0     -.15625   -3.947e-09 |
         15. |   AFG        AFG   Afghanistan   2014     10   -.2075534   -.6296296    -.2075534 |
         16. |   AFG        AFG   Afghanistan   2015     32    .1797891         2.2    .17978914 |
             |-----------------------------------------------------------------------------------|
         17. |   AFG        AFG   Afghanistan   2016     31    .1898266     -.03125     .1898266 |
             +-----------------------------------------------------------------------------------+
        Last edited by Attaullah Shah; 25 Aug 2020, 03:16.
        Regards
        --------------------------------------------------
        Attaullah Shah, PhD.
        Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
        FinTechProfessor.com
        https://asdocx.com
        Check out my asdoc program, which sends outputs to MS Word.
        For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

        Comment


        • #5
          The solution in #4 by using subscripts assumes that the previous value in a panel is always that for lag 1. If that's always true then there is no barrier to declaring the data as panel data.

          Other way round, what would be impossible as panel data acceptable to tsset or xtset would be two or more observations for the same identifier and time, in which case #4 would give some incorrect results.

          I guess we're all fine with the idea that tsset or xtset is not absolutely essential for calculating growth rates, but we need to be careful with assumptions if we use a subscript framework,

          I don't see any gaps in the dataset in #1 but only yesterday we had someone with gaps that they didn't spot before a tsset. so watch out.

          So, the solution in #4 is good precisely when a solution hinging on
          tsset and xtset would be good, but there is no set-up in which it is better. . This is not about asrol but entirely about what the previous line

          Code:
           bys ccode (Year): gen change = (var1 / var1[_n-1]) - 1
          will and won't do. It won't ignore gaps or repeated values for the same time.

          NOTE A quite different substantive issue is that in #4 Attaullah Shah is averaging year-on-year change using the recipe indicated, not basing a calculation on interval end-points as in #1 or #2.
          Last edited by Nick Cox; 25 Aug 2020, 05:04.

          Comment

          Working...
          X