Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generate new variable from another

    Good afternoon,

    I am doing a financial analysis using different observations from different years and individuals. In this sense, I got a variable called interest that shows the interest level in the corresponding year. So, for a specific year, we got the same value for different individuals. What I need is to create a new variable that shows the total interest from the year i up to the last year (minus the interest from the year i). That is, for the first year, the variable should be equal to the sum of all yearly interest minus the interest for that year (that is, the sum of the interest for the following years), while for the second year, the variable should be equal to the total sum of all years minus the interest of the first and second year of the data. I am going to put a table as an example:

    n Year Yearly Interest Total Interest from year+1
    1 2006 0.1 0.21 (that is 0.09 + 0.07 + 0.05)
    2 2006 0.1 0.21
    3 2006 0.1 0.21
    4 2007 0.09 0.12 (that is 0.07 + 0.05)
    5 2007 0.09 0.12
    6 2008 0.07 0.05 (that is 0.05)
    7 2008 0.07 0.05
    8 2009 0.05 0
    9 2009 0.05 0

    Considering that I got the 3 first columns in Stata, how can I create the fourth one? Thanks in advance

    Best,

    Alex


  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear*
    input int year float yearlyinterest
    2006  .1
    2006  .1
    2006  .1
    2007 .09
    2007 .09
    2008 .07
    2008 .07
    2009 .05
    2009 .05
    end
    
    //    VERIFY YEARLY INTEREST IS CONSTANT WITHIN YEAR
    by year (yearlyinterest), sort: assert yearlyinterest[1] == yearlyinterest[_N]
    
    //    CALCULATE THE TOTAL DESIRED
    frame put year yearlyinterest, into(working)
    frame working {
        duplicates drop
        gsort -year
        gen wanted = sum(yearlyinterest)
        replace wanted = wanted - yearlyinterest
        sort year
    }
    
    //    BRING IT INTO THE ORIGINAL DATA SET
    frlink m:1 year, frame(working)
    frget wanted, from(working)
    drop working
    frame drop working
    In the future, when showing data examples, please use the -dataex- command to do so, as I have done here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    ADDED: The above code has the advantage of using only native Stata commands. It can be done more simply, however, if you install Robert Picard, Nick Cox, and Roberto Ferrer's -rangestat- command, available from SSC:
    Code:
    //    VERIFY YEARLY INTEREST IS CONSTANT WITHIN YEAR
    by year (yearlyinterest), sort: assert yearlyinterest[1] == yearlyinterest[_N]
    
    by year: gen summand = yearlyinterest if _n == 1
    rangestat (sum) wanted = summand, interval(year 1 .)
    replace wanted = 0 if missing(wanted)
    drop summand
    This code is shorter, perhaps more transparent, and does not require the use of frames.
    Last edited by Clyde Schechter; 05 Nov 2024, 11:55.

    Comment


    • #3
      Dear Clyde Schechter,

      First of all, I want to apologize for not using the correct format to write my post. I am sorry. I have just installed the dataex command as you recommended. Next time I ask for help, using data as an example, I will do it correctly. Thanks, for explaining to me how to do it properly.

      By the way, thank you for the code; it worked perfectly! I used the shorter one, and it was simple and efficient. Thanks again.

      I got an extra question, about the same data set. If, instead of the sum of interests, what I need is the multiplication, how can I do it? That is, for year j I need to calculate the multiplication of the yearly interests from year (j+1) up to year n (being n the last year in my sample). That is, assuming that I got years 2005, 2006, 2007, 2008 and 2009. I need the calculate the value for 2005 as the multiplication of values from 2006 up to 2009. For 2007 I need to do the same but only with years from 2007 up to 2009 and go on...

      Thanks in advance,

      Comment


      • #4
        If the values of yearlyinterest are always positive, then I think the simplest way is to use the code for sums shown earlier in this thread on the logarithms of yearlyinterest and then exponentiate the results.

        If, however, zero or negative values are possible, then I would do it this way:
        Code:
        gen `c(obs_t)' obs_no = _n
        by year (obs_no), sort: gen factor = cond(_n == 1, yearlyinterest, 1)
        gsort -year obs_no
        gen product = factor in 1
        replace product = product[_n-1]*factor in 2/L
        replace product = product/yearlyinterest
        sort year obs_no

        Comment


        • #5
          Dear Clyde Schechter,

          I want to thank you for your fast and reliable answer. It worked smoothly! I really appreciate your help. Thanks.

          Best,

          Comment

          Working...
          X