
  • Beginner Stata User Help - consecutive years of donations variable

    Hello,

    I am trying to generate a new variable, let's call it "current_year_loyalty", which records the number of years in a row a donor has made at least one donation. So for the first year in which they donate, current_year_loyalty would be assigned a value of 0. If they also made a donation the following year, current_year_loyalty would be assigned a value of 1. However, if they stop donating the year after that, and then resume in some later year, current_year_loyalty would be reset back to 0.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long PPidm double FiscalYearGifts float FISCALYEAR
    70004546  88 2007
    70004546  45 2010
    70004546  45 2011
    70004546  45 2012
    70004546  45 2013
    70004546  45 2014
    70004546  75 2017
    70004546 175 2018
    end
    format %ty FISCALYEAR
    So in the above example, the donor identified by PPidm 70004546 would be assigned the following values:
    • 2007: current_year_loyalty = 0
    • 2010: current_year_loyalty = 0
    • 2011: current_year_loyalty = 1
    • 2012: current_year_loyalty = 2
    • 2013: current_year_loyalty = 3
    • 2014: current_year_loyalty = 4
    • 2017: current_year_loyalty = 0 (reset to 0 here because they made no donations in 2015)
    • 2018: current_year_loyalty = 1
    Thank you

  • #2
    Hi Michael,

    Try the following:

    Code:
    tsset FISCALYEAR
    tsfill, full
    gen current_year_loyalty = 0
    replace current_year_loyalty = . if PPidm == .
    replace current_year_loyalty = (current_year_loyalty[_n-1] + 1) if (current_year_loyalty==0 & current_year_loyalty[_n-1]!=.)
    drop if PPidm==.
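
The code in #2 sets only a time variable, so it assumes a single donor. A minimal sketch of the same fill-then-count idea for several donors follows. The donor ids 1 and 2 and the helper variable donated are made up for illustration and are not part of the original data.

```stata
clear
input long PPidm float FISCALYEAR
1 2007
1 2010
1 2011
2 2010
2 2011
2 2013
end

* donated marks the original rows; rows added by tsfill will have it missing
gen byte donated = 1
tsset PPidm FISCALYEAR
* fill in the absent years within each donor's own range of years
tsfill

bysort PPidm (FISCALYEAR): gen current_year_loyalty = 0 if donated == 1
* replace works through the rows in order, so each row can build on the one before it
by PPidm: replace current_year_loyalty = current_year_loyalty[_n-1] + 1 if donated == 1 & current_year_loyalty[_n-1] < .

* discard the filled-in years again
drop if donated != 1
drop donated
list, sepby(PPidm)
```

Marking the original rows before tsfill matters because tsfill fills in the panel identifier on the new rows, so PPidm == . would no longer pick them out once PPidm is the panel variable.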



    • #3
      Almost an FAQ https://www.stata.com/support/faqs/d...-observations/

      Our own Statalist FAQ suggests looking at the Stata FAQs before posting: it's the same advice for everyone.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long PPidm double FiscalYearGifts float FISCALYEAR
      70004546  88 2007
      70004546  45 2010
      70004546  45 2011
      70004546  45 2012
      70004546  45 2013
      70004546  45 2014
      70004546  75 2017
      70004546 175 2018
      end
      format %ty FISCALYEAR
      
      tsset PPidm FISCALYEAR 
      bysort PPidm (FISCALYEAR) : gen wanted = 0 if _n == 1 
      by PPidm : replace wanted = cond(L.Fiscal < ., wanted[_n-1] + 1, 0) if _n > 1 
      
      list, sep(0)
      
           +-----------------------------------------+
           |    PPidm   Fiscal~s   FISCAL~R   wanted |
           |-----------------------------------------|
        1. | 70004546         88       2007        0 |
        2. | 70004546         45       2010        0 |
        3. | 70004546         45       2011        1 |
        4. | 70004546         45       2012        2 |
        5. | 70004546         45       2013        3 |
        6. | 70004546         45       2014        4 |
        7. | 70004546         75       2017        0 |
        8. | 70004546        175       2018        1 |
           +-----------------------------------------+
      (I'd suggest leaving out stuff like "Beginner Stata User Help" from your titles. That isn't a way to get faster, more, better or even gentler help here. We can always look at what you ask and how many posts you've made and make a guess at what you most need, which might be an injunction to read the documentation.)



      • #4
        Nick Cox, your code worked. Could you explain what the L.Fiscal < . part of the expression below is doing? I believe I understand how everything else works. I will leave "Beginner Stata User Help" out of future titles; thanks for the advice.
        cond(L.Fiscal < ., wanted[_n-1] + 1, 0) if _n > 1



        • #5
          This follows from the previously linked FAQ (see #3).

          If the previous observation is not for the previous time (in other words, there is a gap), then asking for the previous value can only return missing, just as if you tell me values for 2007 and 2010 I have no way to know 2009's value for certain.
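
A small sketch of that behaviour, using a shortened copy of the example data. The variable names previous_gift and previous_row are made up for illustration. L. looks up the value recorded for the previous year, while [_n-1] looks up the previous row whatever its year.

```stata
clear
input long PPidm double FiscalYearGifts float FISCALYEAR
70004546  88 2007
70004546  45 2010
70004546  45 2011
70004546  75 2017
end
tsset PPidm FISCALYEAR

* L. asks for the value at FISCALYEAR - 1, so it is missing after any gap
gen double previous_gift = L.FiscalYearGifts

* [_n-1] asks for the previous row, so a gap makes no difference
bysort PPidm (FISCALYEAR): gen double previous_row = FiscalYearGifts[_n-1]

list, sep(0)
```

Only 2011 should show a nonmissing previous_gift, because 2010 is the only year in this extract that is followed directly by another donation year.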

          Code:
          L.Fiscal < .
          is the complement of

          Code:
          L.Fiscal == .
          Namely, the less than operator < returns 1 (true) if the previous value is not missing, because there was no gap.

          So, < works the same way here as != (not equal) given that missing values are treated as arbitrarily large.
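
A few one-line checks of that ordering, offered as a sketch rather than a full account of missing values. The one place where < . and != . part company is with extended missing values (.a to .z), which sort above system missing.

```stata
* any number, however large, is less than system missing
display 1e300 < .

* system missing is not less than itself, so this shows 0
display . < .

* extended missing .a is larger than ., so this also shows 0
display .a < .

* but .a is not equal to ., so this shows 1 and would count .a as nonmissing
display .a != .

* missing() catches both kinds, so this shows 0
display !missing(.a)
```

When a lag is missing only because of a gap, the result is system missing and the two tests agree. The difference would matter only if the variable itself held extended missing values.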

          The very distinguished statistician Tony Lachenbruch wrote an entire (but very short) paper on this point, but it's part of the Stata folklore.

          STB-9  ip2  A keyboard shortcut (P. A. Lachenbruch)
                 9/92 p.9; STB Reprints Vol 2, p.46 (no commands)
                 keyboard shortcut to indicate nonmissing values



          • #6
            One more way to go while you are still not familiar with tsset.
            Code:
            gen wanted = 0
            bys PPidm (FISCALYEAR): replace wanted = wanted[_n-1] + 1 if FISCALYEAR == FISCALYEAR[_n-1] + 1
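
A quick check that this approach and the tsset approach from #3 agree when there is more than one donor. The donor ids 1 and 2 and the name wanted_ts are made up for illustration. The by prefix is what stops a count carrying over from one donor to the next, because [_n-1] is missing on each donor's first row.

```stata
clear
input long PPidm float FISCALYEAR
1 2007
1 2008
1 2010
2 2008
2 2009
2 2010
end

* approach from #6: compare each year with the year on the previous row
gen wanted = 0
bysort PPidm (FISCALYEAR): replace wanted = wanted[_n-1] + 1 if FISCALYEAR == FISCALYEAR[_n-1] + 1

* approach from #3: a nonmissing lag means the previous year is present
tsset PPidm FISCALYEAR
bysort PPidm (FISCALYEAR): gen wanted_ts = 0 if _n == 1
by PPidm: replace wanted_ts = cond(L.FISCALYEAR < ., wanted_ts[_n-1] + 1, 0) if _n > 1

* stops with an error if the two approaches ever disagree
assert wanted == wanted_ts
list, sepby(PPidm)
```

Donor 1 should come out as 0, 1, 0 and donor 2 as 0, 1, 2.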



            • #7
              Romalpa's code is elegant. At some point the tsset machinery will be what you need, but she's right that for this question you can be more direct.



              • #8
                Great, I will study the code and documentation. Thank you both for all your help. Nick Cox Romalpa Akzo
