  • Counting the longest period a variable assumes a certain value

    Good morning, I have a dataset of this type:
    Year Company Var X
    1990 AAA 0
    1991 AAA 1
    1992 AAA 1
    1993 AAA 0
    1994 AAA 1
    I want to compute the longest period (in consecutive years) for which Var X equals 1. For example, for the table above this value should be 2 (1991 and 1992).
    Can anyone help me with that?

    Thank you

  • #2
    ADDED IN EDIT: This applies to spells of either 0 or 1. For spells of 1 only, see the second code block.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int year str3 company byte varx
    1990 "AAA" 0
    1991 "AAA" 1
    1992 "AAA" 1
    1993 "AAA" 0
    1994 "AAA" 1
    end
    
    bys company (year): gen spell_id= sum(1) if varx!= varx[_n-1]
    bys company (year): replace spell_id= spell_id[_n-1] if missing(spell_id) & !missing(spell_id[_n-1])
    bys company spell_id: gen longest= _N
    bys company (longest): replace longest= longest[_N]
    Res.:

    Code:
    . sort company year
    
    . l
    
         +--------------------------------------------+
         | year   company   varx   spell_id   longest |
         |--------------------------------------------|
      1. | 1990       AAA      0          1         2 |
      2. | 1991       AAA      1          2         2 |
      3. | 1992       AAA      1          2         2 |
      4. | 1993       AAA      0          3         2 |
      5. | 1994       AAA      1          4         2 |
         +--------------------------------------------+

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int year str3 company byte varx
    1990 "AAA" 0
    1991 "AAA" 1
    1992 "AAA" 1
    1993 "AAA" 0
    1994 "AAA" 1
    1990 "BBB" 0
    1991 "BBB" 1
    1992 "BBB" 0
    1993 "BBB" 0
    1994 "BBB" 0
    end
    
    bys company (year): gen spell_id= sum(1) if varx!= varx[_n-1] & varx
    bys company (year): replace spell_id= spell_id[_n-1] if missing(spell_id) & !missing(spell_id[_n-1]) & varx
    bys company spell_id: gen longest= -_N if varx
    bys company (longest): replace longest= abs(longest[1])
    Res.:

    Code:
    . sort company year
    
    . l, sepby(company)
    
         +--------------------------------------------+
         | year   company   varx   spell_id   longest |
         |--------------------------------------------|
      1. | 1990       AAA      0          .         2 |
      2. | 1991       AAA      1          1         2 |
      3. | 1992       AAA      1          1         2 |
      4. | 1993       AAA      0          .         2 |
      5. | 1994       AAA      1          2         2 |
         |--------------------------------------------|
      6. | 1990       BBB      0          .         1 |
      7. | 1991       BBB      1          1         1 |
      8. | 1992       BBB      0          .         1 |
      9. | 1993       BBB      0          .         1 |
     10. | 1994       BBB      0          .         1 |
         +--------------------------------------------+
    Last edited by Andrew Musau; 10 Mar 2022, 06:51.

    • #3
      Here's another way to do it using tsspell from SSC.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input int year str3 company byte varx
      1990 "AAA" 0
      1991 "AAA" 1
      1992 "AAA" 1
      1993 "AAA" 0
      1994 "AAA" 1
      end
      
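      * tsspell is from SSC: ssc install tsspell
      * tsset needs a numeric panel identifier
      egen COMPANY = group(company), label

      tsset COMPANY year

      * mark spells of consecutive years with varx == 1
      tsspell, cond(varx == 1)

      * _seq counts position within each spell; its maximum is the longest spell
      egen max = max(_seq), by(company)

      tabdisp company, c(max)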

      • #4
        Originally posted by Andrew Musau View Post
        ADDED IN EDIT: This applies to either spells of 0 or 1. For only spells of 1, add ("&var[x]") to the -if- conditions.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input int year str3 company byte varx
        1990 "AAA" 0
        1991 "AAA" 1
        1992 "AAA" 1
        1993 "AAA" 0
        1994 "AAA" 1
        end
        
        bys company (year): gen spell_id= sum(1) if varx!= varx[_n-1]
        bys company (year): replace spell_id= spell_id[_n-1] if missing(spell_id) & !missing(spell_id[_n-1])
        bys company spell_id: gen longest= _N
        bys company (longest): replace longest= longest[_N]
        Res.:

        Code:
        . sort company year
        
        . l
        
             +--------------------------------------------+
             | year   company   varx   spell_id   longest |
             |--------------------------------------------|
          1. | 1990       AAA      0          1         2 |
          2. | 1991       AAA      1          2         2 |
          3. | 1992       AAA      1          2         2 |
          4. | 1993       AAA      0          3         2 |
          5. | 1994       AAA      1          4         2 |
             +--------------------------------------------+
        Hey Andrew,
        many thanks. I tried to implement what you said, but it is not giving the desired output.
        It simply gives me the number of observations I have for a specific company, regardless of the value of VarX.

        Any suggestions?

        • #5
          Originally posted by Giovanni Verde View Post

          Hey Andrew,
          many thanks. I tried to implement what you said, but it is not giving the desired output.
          It simply gives me the number of observations I have for a specific company, regardless of the value of VarX.

          Any suggestions?
          I edited my response upon re-reading your original post. Note that I assume that "varx" is binary 0/1.
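
          A minimal sketch of how that assumption could be checked before running the code in #2. It assumes the variable is named varx, as in the example data.

          Code:
          * stop with an error if varx takes any value other than 0 or 1
          * (missing values also trigger the error)
          assert inlist(varx, 0, 1)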

          • #6
            Code:
            gen S = varx
            bys company (year): replace S = S[_n-1] + 1 if varx==1 & _n>1
            by company: egen maxS = max(S)

            • #7
              If varx is just 0 and 1, Romalpa Akzo's characteristically concise and elegant solution can be slimmed further to

              Code:
               
              gen S = varx
              bys company (year): replace S = S[_n-1] + 1 if varx==1 & _n>1
              bys company (S): replace S = S[_N]
              I will just flag that there is a point to #3, as it indicates a more general technique and yields other spell results. More at https://www.stata-journal.com/articl...article=dm0029 if wanted.
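
              A minimal sketch of what such other spell results might look like, assuming the code in #3 has already been run, so that the variables tsspell creates by default (_spell, _seq and _end) are in memory. The name nspells is only illustrative.

              Code:
              * number of separate spells of varx == 1 per company
              egen nspells = max(_spell), by(company)

              * length of each spell, read off its last observation
              list company year _spell _seq if _end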

              • #8
                Nick's advice and the code in #3 are always my favourites. In practice, I would recommend tsspell for this type of problem, since it is more generally applicable and easily adjusted or extended.

                My code in #6 (which was indeed guided by the Stata Journal article Nick shared in #7) is more my hobby of "puzzle style" coding. Anyhow, for a more general context where varx may not be merely binary (0/1), a minor modification to the first line makes it workable.

                Code:
                gen S = varx ==1
                bys company (year): replace S = S[_n-1] + 1 if varx==1 & _n>1
                bys company (S): replace S = S[_N]
                sort company year
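
                A small worked example of why the change matters, as a sketch on made-up data in which varx also takes the value 2. The company name CCC and its values are hypothetical and not from the original question.

                Code:
                clear
                input int year str3 company byte varx
                1990 "CCC" 2
                1991 "CCC" 1
                1992 "CCC" 1
                1993 "CCC" 1
                1994 "CCC" 0
                end

                * start the counter at 1 only where varx is exactly 1
                gen S = varx == 1
                * running count of consecutive years with varx == 1
                bys company (year): replace S = S[_n-1] + 1 if varx==1 & _n>1
                * spread the largest count to every observation of the company
                bys company (S): replace S = S[_N]
                sort company year
                list

                Here S ends up as 3 for every observation, the length of the 1991 to 1993 run. With gen S = varx in the first line instead, the 2 in 1990 would be carried into the running count and the result would be 5.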
