  • The lengthiest repeated sequence of choices

    My survey data record the choices an interviewee made from a given list of activities. In the actual data, the number of choices is large (up to 3000), while the list is limited to 40 different activities. In the example below, the corresponding figures are just 99 choices (reported in time order) and 5 different activities (coded 0-4).

    I need to capture the longest repeated sequence(s) of choices. "Repeated" means the sequence is observed at least twice. In the example, the sequence "0-4-1-1-0-4" occurs (at least) twice: from time 6 to 11, and again from time 76 to 81. The length of this sequence is 6 and, by manual checking, no longer repeated sequence seems to exist.

    I care most about the length (6 in the example), but if possible I would also like the specific pattern ("0-4-1-1-0-4"). There may be several different repeated patterns with the same length (6). If so, capturing any one of them would be more than I expect. Would catching all of them be overcomplicated?

    With my limited Stata skills, just extracting the length (6) from this maze is overwhelming. I would very much appreciate any suggestion on how to deal with this. Many thanks!

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(time choice)
     1 0
     2 2
     3 0
     4 2
     5 4
     6 0
     7 4
     8 1
     9 1
    10 0
    11 4
    12 1
    13 4
    14 4
    15 0
    16 1
    17 4
    18 4
    19 0
    20 2
    21 0
    22 3
    23 0
    24 2
    25 2
    26 1
    27 0
    28 1
    29 0
    30 4
    31 0
    32 3
    33 4
    34 1
    35 2
    36 0
    37 0
    38 3
    39 3
    40 3
    41 3
    42 2
    43 4
    44 0
    45 3
    46 4
    47 1
    48 3
    49 2
    50 3
    51 2
    52 3
    53 2
    54 2
    55 4
    56 2
    57 4
    58 1
    59 1
    60 2
    61 0
    62 4
    63 3
    64 1
    65 3
    66 0
    67 0
    68 4
    69 3
    70 2
    71 4
    72 1
    73 0
    74 4
    75 1
    76 0
    77 4
    78 1
    79 1
    80 0
    81 4
    82 3
    83 3
    84 3
    85 1
    86 1
    87 0
    88 4
    89 1
    90 2
    91 4
    92 2
    93 1
    94 1
    95 3
    96 0
    97 2
    98 3
    99 4
    end

  • #2
    I wrote a program that counts repeated spells of any given length; that should work, I think. See the comments in the code for explanations.

    Code:
    * Turn choices to string
    gen choiceString = strofreal(choice)
    drop choice
    rename choiceString choice
    
    * Try for length of two
    gen combo2 = choice + "§" + choice[_n+1] if _n <= _N-2
    gen one = 1
    gegen count2 = total(one) if !missing(combo2), by(combo2)
    
    * Program
    cap program drop comboCounter
    program define comboCounter
        syntax, comboLength(integer) choice(varname)
        
        local loopLength = `comboLength' - 1    // E.g. for comboLength of 4, we need choice[_n+0] up to choice[_n+3]
        local comboCode = "`choice'[_n+0]"
        forvalues i = 1/`loopLength' {
            local comboCode = `"`comboCode' + "§" + `choice'[_n+`i']"'
        }
        
        gen combo`comboLength' = `comboCode' if _n <= _N - `comboLength'        // The if-condition excludes observations at the end of the sample that cannot start a full spell (as written, it also skips the last full spell, which starts at _N - comboLength + 1)
        
        tempvar one
        gen `one' = 1
        gegen count`comboLength' = total(`one') if !missing(combo`comboLength'), by(combo`comboLength')
        
        count if count`comboLength' > 1 & count`comboLength' != .        // number of spells present at least twice
    end
        
    ** Compare program to manual
    rename count2 count2_manual
    rename combo2 combo2_manual
    comboCounter, comboLength(2) choice(choice)
    assert count2 == count2_manual
    
    ** Run for length six
    comboCounter, comboLength(6) choice(choice)
    
    ** Run for length seven
    comboCounter, comboLength(7) choice(choice)


    • #3
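      A very simple approach gets the result you report directly. Note that a separator such as a space is needed between codes once they reach 10 or more, so that, say, 1 followed by 12 is not confused with 11 followed by 2.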

      Code:
      gen last6 = "" 
      quietly forval j = 6(-1)1 { 
      replace last6 = last6 + strofreal(choice[_n-`j']) 
      } 
      
      tab last6, sort  
      
            last6 |      Freq.     Percent        Cum.
      ------------+-----------------------------------
           041104 |          2        2.02        2.02
           ...... |          1        1.01        3.03
           .....0 |          1        1.01        4.04
           ....02 |          1        1.01        5.05
           ...020 |          1        1.01        6.06
           ..0202 |          1        1.01        7.07
           .02024 |          1        1.01        8.08
           003333 |          1        1.01        9.09
           004324 |          1        1.01       10.10
           010403 |          1        1.01       11.11
           014402 |          1        1.01       12.12
      with yet others
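
      The point about codes of 10 and above can be seen in a small sketch with made-up data (the variable names nosep and withsep are illustrative only). Without a separator, 1 followed by 12 and 11 followed by 2 both collapse to "112"; with a space between codes they stay distinct.

      ```stata
      clear
      input byte choice
       1
      12
      11
       2
      end

      * Join each code to the next one, first without and then with a separator
      gen nosep   = strofreal(choice) + strofreal(choice[_n+1]) if _n < _N
      gen withsep = strofreal(choice) + " " + strofreal(choice[_n+1]) if _n < _N

      list, noobs
      ```

      Any character that cannot occur inside a code would serve equally well as the separator.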


      • #4
        Jesse Wursten's and Nick Cox's solutions are valuable to me. I appreciate your help very much.

        As I understand it, if I already knew that 6 is the largest length of a repeated sequence, I could count the number of repetitions (the result of comboCounter in #2) or even tabulate all repeated sequences (those with frequency greater than 1 in #3).

        However, the issue is that I do not know that. In the given example, I found the sequence "0-4-1-1-0-4" only by manual checking, which establishes that the length must be at least 6. In fact, I am not sure whether it could be 7, or even 20 or more. In addition, manual checking becomes impossible for the actual data, which have many more observations.

        So, the key question is: What is the largest length of a repeated sequence? That is the figure that I need the most (before picking out the specific pattern, if possible).

        It seems that my explanation in #1 was not clear enough, and I am sorry for that. I hope this explains better what I need. I would be grateful for any suggested solution. Many thanks!




        • #5
          You can change 6 to 7 8 9 … until you no longer find repetitions.
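
          That trial and error can be automated: a repeated sequence of length k+1 always contains a repeated sequence of length k, so the search can stop at the first length with no repetition. What follows is only a minimal sketch on made-up data. The names time, window, nsame and maxlen are illustrative, and a space is used as separator so that codes of 10 and above stay unambiguous. Unlike the code in #3, each window here starts at the current observation and runs forward.

          ```stata
          clear
          input byte choice
          0
          1
          2
          0
          1
          2
          3
          end
          gen long time = _n              // remembers the original order

          local maxlen = 0                // longest length found to repeat so far
          local k = 1                     // window length currently being tested
          local repeated = 1
          while (`repeated') & (`k' < _N) {
              sort time
              capture drop window nsame
              * Window of length k starting at each observation
              quietly gen window = strofreal(choice)
              forvalues j = 1/`=`k'-1' {
                  quietly replace window = window + " " + strofreal(choice[_n+`j'])
              }
              * Blank out windows that would run past the last observation
              quietly replace window = "" if _n > _N - `k' + 1
              * How many times does each complete window occur?
              quietly bysort window: gen long nsame = _N if window != ""
              quietly count if nsame > 1 & !missing(nsame)
              local repeated = r(N) > 0
              if `repeated' {
                  local maxlen = `k'
                  local ++k
              }
          }
          sort time
          display "Longest repeated sequence has length `maxlen'"
          ```

          In this toy series the sequence 0 1 2 occurs twice and no window of length 4 repeats, so the loop stops with maxlen equal to 3.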


          • #6
            The problem is that I have hundreds of survey datasets (each corresponding to a single interviewee), and the largest length of a repeated sequence varies; it could be as large as, say, 1000. Thus, manual work would not help.

            At present, with my limited skill, an additional loop (based on your suggestion in #3 or Jesse's in #2) might help, but I am still struggling with it.


            • #7
              I updated the code to keep going until it reaches a length without repeated spells and then report the outcome. Are you saying that this would then need to be run separately for different interviewees? If so, please provide a data example with multiple interviewees and maybe I can adapt the code to that situation.

              Output
              Code:
              Repeated spell count for spells of length 2: 97
              Repeated spell count for spells of length 3: 57
              Repeated spell count for spells of length 4: 21
              Repeated spell count for spells of length 5: 10
              Repeated spell count for spells of length 6: 2
              Repeated spell count for spells of length 7: 0
              
              Maximum length with repeated spells: 6
              Code
              Code:
              clear
              input byte(time choice)
               1 0
               2 2
               3 0
               4 2
               5 4
               6 0
               7 4
               8 1
               9 1
              10 0
              11 4
              12 1
              13 4
              14 4
              15 0
              16 1
              17 4
              18 4
              19 0
              20 2
              21 0
              22 3
              23 0
              24 2
              25 2
              26 1
              27 0
              28 1
              29 0
              30 4
              31 0
              32 3
              33 4
              34 1
              35 2
              36 0
              37 0
              38 3
              39 3
              40 3
              41 3
              42 2
              43 4
              44 0
              45 3
              46 4
              47 1
              48 3
              49 2
              50 3
              51 2
              52 3
              53 2
              54 2
              55 4
              56 2
              57 4
              58 1
              59 1
              60 2
              61 0
              62 4
              63 3
              64 1
              65 3
              66 0
              67 0
              68 4
              69 3
              70 2
              71 4
              72 1
              73 0
              74 4
              75 1
              76 0
              77 4
              78 1
              79 1
              80 0
              81 4
              82 3
              83 3
              84 3
              85 1
              86 1
              87 0
              88 4
              89 1
              90 2
              91 4
              92 2
              93 1
              94 1
              95 3
              96 0
              97 2
              98 3
              99 4
              end
              
              * Turn choices to string
              gen choiceString = strofreal(choice)
              drop choice
              rename choiceString choice
              
              * Program
              cap program drop comboCounter
              program define comboCounter, rclass
                  syntax, comboLength(integer) choice(varname)
                  
                  local loopLength = `comboLength' - 1    // E.g. for comboLength of 4, we need choice[_n+0] up to choice[_n+3]
                  local comboCode = "`choice'[_n+0]"
                  forvalues i = 1/`loopLength' {
                      local comboCode = `"`comboCode' + "§" + `choice'[_n+`i']"'
                  }
                  
                  qui gen combo`comboLength' = `comboCode' if _n <= _N - `comboLength'        // The if-condition excludes observations at the end of the sample that cannot start a full spell (as written, it also skips the last full spell, which starts at _N - comboLength + 1)
                  
                  tempvar one
                  gen `one' = 1
                  gegen count`comboLength' = total(`one') if !missing(combo`comboLength'), by(combo`comboLength')
                  
                  qui count if count`comboLength' > 1 & count`comboLength' != .        // number of spells present at least twice
                  
                  return scalar repeatedSpellCount = r(N)
              end
                  
              * Find largest sequence
              local keepGoing = "true"
              local counter = 2
              while "`keepGoing'" == "true" {
                  cap drop combo* count*
                  comboCounter, comboLength(`counter') choice(choice)
                  di "Repeated spell count for spells of length `counter': " r(repeatedSpellCount)
                  
                  if r(repeatedSpellCount) == 0 {
                      local keepGoing "false"
                      di _newline "Maximum length with repeated spells: `=`counter'-1'" _newline
                  }
                  local counter = `counter' + 1
              }


              • #8
                I appreciate your help very much. The output of your code, which looks very comprehensive and detailed, appears to be exactly what I need.

                As regards different interviewees, the actual data include hundreds of them. Each person has a unique ID and the same structure of choices (in time order) as the given example. I have managed to save them separately into individual files (one per ID). Given that your code runs well for one file, I guess a simple loop over files, which I should be able to write, would get the work done.

                I will look deeper into the code and run it on the whole data to see the results. If any issue arises or any interesting feedback appears, I will share it here or seek further help.

                Once again, thank you so much.


                • #9
                  Organize the data in a "triangular" shape so that -duplicates- can be used. The code below captures the wanted length and all the corresponding sequences.
                  Code:
                  gen v1 = choice
                  forval i = 2/`=_N' {
                      gen v`i'= v`=`i'-1'[_n+1]
                  }
                  
                  local j = 1
                  while `r(unique_value)'+1 <_N+1 {
                      qui duplicates report v1-v`++j'
                  }
                  
                  bys v1-v`=`j'-1': gen Length = `j'-1 if _N>1
                  egen sequence = concat(v1-v`=`j'-1') if Length !=., p(-)
                  
                  sort time
                  drop v*
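
                  To see what the "triangular" layout looks like, here is a small sketch on made-up data, with the number of columns fixed at 4 (an arbitrary choice for illustration only). Row t holds the sequence that starts at time t, so rows that agree on v1-vk are occurrences of the same sequence of length k, and -duplicates report- can count them.

                  ```stata
                  clear
                  input byte(time choice)
                  1 0
                  2 1
                  3 2
                  4 0
                  5 1
                  6 2
                  7 3
                  end

                  * v1 is the choice at time t, v2 the choice at t+1, and so on
                  gen v1 = choice
                  forvalues i = 2/4 {
                      gen v`i' = v`=`i'-1'[_n+1]
                  }
                  list time v1-v4, noobs

                  * Fewer distinct rows than observations means some length-3 sequence repeats
                  quietly duplicates report v1-v3
                  display "distinct rows on v1-v3: " r(unique_value) " of " r(N)

                  * All rows distinct on v1-v4 means no length-4 sequence repeats
                  quietly duplicates report v1-v4
                  display "distinct rows on v1-v4: " r(unique_value) " of " r(N)
                  ```

                  The missing values that fill the lower-right corner give the layout its triangular shape. Rows near the end contain different numbers of missing values, so they cannot be mistaken for a complete sequence.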


                  • #10
                    Many thanks, Romalpa Akzo and Jesse Wursten. The code in both #7 and #9 works well with the actual data and the output is beyond my expectations (even though I need more time to understand the algorithms behind them).

                    I am now seeking help with an "extended" issue. The survey data now include many interviewees' choices, coded 0-4 and given in time order. Again, I need to find the longest "repeated" sequence of choices but, in the context of many interviewees, "repeated" means the sequence is observed at least once for every id.

                    The example below includes only 2 ids: one (id = 1) has 99 choices and the other (id = 2) has 89 choices. My best (manual) search in this sample finds the sequence 2-4-0-4-3, which is observed at times 61-65 for id = 1 and times 59-63 for id = 2. So my guess for the largest length is 5. Does any longer sequence exist?

                    My main target is the length, but it would be great if the sequence(s) could also be captured. Your suggestions are very much appreciated.
                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input byte(id choice time)
                    1 2  1
                    1 4  2
                    1 0  3
                    1 1  4
                    1 1  5
                    1 2  6
                    1 1  7
                    1 1  8
                    1 0  9
                    1 4 10
                    1 4 11
                    1 0 12
                    1 1 13
                    1 2 14
                    1 3 15
                    1 0 16
                    1 1 17
                    1 4 18
                    1 2 19
                    1 2 20
                    1 2 21
                    1 0 22
                    1 3 23
                    1 3 24
                    1 4 25
                    1 2 26
                    1 1 27
                    1 2 28
                    1 4 29
                    1 2 30
                    1 3 31
                    1 2 32
                    1 0 33
                    1 3 34
                    1 4 35
                    1 3 36
                    1 4 37
                    1 4 38
                    1 2 39
                    1 1 40
                    1 2 41
                    1 0 42
                    1 2 43
                    1 1 44
                    1 2 45
                    1 4 46
                    1 4 47
                    1 4 48
                    1 0 49
                    1 3 50
                    1 2 51
                    1 2 52
                    1 0 53
                    1 1 54
                    1 3 55
                    1 3 56
                    1 4 57
                    1 4 58
                    1 3 59
                    1 2 60
                    1 2 61
                    1 4 62
                    1 0 63
                    1 4 64
                    1 3 65
                    1 0 66
                    1 0 67
                    1 0 68
                    1 1 69
                    1 0 70
                    1 4 71
                    1 3 72
                    1 1 73
                    1 3 74
                    1 0 75
                    1 3 76
                    1 3 77
                    1 0 78
                    1 1 79
                    1 2 80
                    1 1 81
                    1 3 82
                    1 0 83
                    1 3 84
                    1 4 85
                    1 3 86
                    1 4 87
                    1 2 88
                    1 2 89
                    1 4 90
                    1 4 91
                    1 2 92
                    1 1 93
                    1 4 94
                    1 0 95
                    1 0 96
                    1 1 97
                    1 2 98
                    1 3 99
                    2 2  1
                    2 4  2
                    2 3  3
                    2 0  4
                    2 3  5
                    2 4  6
                    2 4  7
                    2 3  8
                    2 3  9
                    2 3 10
                    2 0 11
                    2 1 12
                    2 3 13
                    2 0 14
                    2 4 15
                    2 0 16
                    2 3 17
                    2 3 18
                    2 4 19
                    2 4 20
                    2 3 21
                    2 3 22
                    2 2 23
                    2 3 24
                    2 2 25
                    2 4 26
                    2 3 27
                    2 3 28
                    2 1 29
                    2 0 30
                    2 3 31
                    2 3 32
                    2 3 33
                    2 0 34
                    2 2 35
                    2 1 36
                    2 1 37
                    2 4 38
                    2 3 39
                    2 4 40
                    2 2 41
                    2 4 42
                    2 0 43
                    2 1 44
                    2 3 45
                    2 3 46
                    2 1 47
                    2 2 48
                    2 1 49
                    2 2 50
                    2 1 51
                    2 3 52
                    2 1 53
                    2 3 54
                    2 4 55
                    2 4 56
                    2 2 57
                    2 4 58
                    2 2 59
                    2 4 60
                    2 0 61
                    2 4 62
                    2 3 63
                    2 4 64
                    2 0 65
                    2 3 66
                    2 1 67
                    2 4 68
                    2 4 69
                    2 3 70
                    2 1 71
                    2 3 72
                    2 2 73
                    2 0 74
                    2 0 75
                    2 3 76
                    2 2 77
                    2 4 78
                    2 3 79
                    2 0 80
                    2 2 81
                    2 1 82
                    2 4 83
                    2 1 84
                    2 3 85
                    2 4 86
                    2 3 87
                    2 2 88
                    2 0 89
                    end


                    • #11
                      I think this code does what you want. I've highlighted the changes in bold. This program will detect the longest repeated sequence of choices, taking into account that the choices need to be made by the same interviewee.

                      Code:
                      * Turn choices to string
                      gen choiceString = strofreal(choice)
                      drop choice
                      rename choiceString choice
                      
                      * Program
                      cap program drop comboCounter
                      program define comboCounter, rclass
                          syntax, comboLength(integer) choice(varname) by(varlist)
                          
                          local loopLength = `comboLength' - 1    // E.g. for comboLength of 4, we need choice[_n+0] up to choice[_n+3]
                          local comboCode = "`choice'[_n+0]"
                          forvalues i = 1/`loopLength' {
                              local comboCode = `"`comboCode' + "§" + `choice'[_n+`i']"'
                          }
                          
                          if "`by'" != "" {
                              sort `by', stable    // a stable sort keeps the current (time) order within each group
                              qui by `by': gen combo`comboLength' = `comboCode' if _n <= _N - `comboLength'        // The if-condition excludes observations at the end of each group that cannot start a full spell (as written, it also skips the last full spell, which starts at _N - comboLength + 1)
                          }
                          else {
                              qui gen combo`comboLength' = `comboCode' if _n <= _N - `comboLength'
                          }
                          
                          tempvar one
                          gen `one' = 1
                          gegen count`comboLength' = total(`one') if !missing(combo`comboLength'), by(combo`comboLength' `by')
                          
                          qui count if count`comboLength' > 1 & count`comboLength' != .        // number of spells present at least twice
                          
                          return scalar repeatedSpellCount = r(N)
                      end
                          
                      * Find largest sequence
                      local keepGoing = "true"
                      local counter = 2
                      while "`keepGoing'" == "true" {
                          cap drop combo* count*
                          comboCounter, comboLength(`counter') choice(choice) by(id)
                          di "Repeated spell count for spells of length `counter': " r(repeatedSpellCount)
                          
                          if r(repeatedSpellCount) == 0 {
                              local keepGoing "false"
                              local previousCounter = `counter' - 1
                              di _newline "Maximum length with repeated spells: `previousCounter'" _newline
                              
                              drop combo* count*
                              comboCounter, comboLength(`previousCounter') choice(choice) by(id)
                              list if (count`previousCounter' > 1) & (count`previousCounter' != .)
                              
                          }
                          local counter = `counter' + 1
                      }


                      • #12
                        Jesse Wursten, the code in #11 is valuable to me. It captures the (largest) length and the pattern of the sequence that is "repeated" (at least twice) within the choices of each interviewee. That is a direct extension of the issue in #1 to many interviewees, and it helps to avoid the quasi-manual loop over many separate ids.

                        However, as explained in #10, the "repeated" that I am seeking now means the sequence is observed at least once for every id. In the given example, the sequence 2-4-0-4-3 is seen in the choices of both ids: times 61-65 for id = 1 and times 59-63 for id = 2. That is possibly the solution (though it was only found by my manual checking). In short, the output wanted in #10 is different from the one in #1.

                        Please let me know if further explanation is needed. I would be grateful for a suggested solution. Many thanks.
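
                        One way to read "observed at least once for every id" is sketched below for a single fixed length k on made-up data. It is only an illustration under that reading, not code from any post in this thread. The names k, window, idtag, onceperid, nidsWith and shared are made up for the sketch. The idea is to build the windows within each id, count each distinct window once per id, and keep the windows whose count equals the number of ids.

                        ```stata
                        clear
                        input byte(id time choice)
                        1 1 2
                        1 2 4
                        1 3 0
                        1 4 1
                        2 1 3
                        2 2 2
                        2 3 4
                        2 4 0
                        end

                        local k = 3                     // window length to test (assumed)

                        * Number of distinct ids
                        egen byte idtag = tag(id)
                        quietly count if idtag
                        local nids = r(N)

                        * Window of length k starting at each observation, built within id
                        sort id time
                        quietly by id: gen window = strofreal(choice) if _n <= _N - `k' + 1
                        forvalues j = 1/`=`k'-1' {
                            quietly by id: replace window = window + " " + strofreal(choice[_n+`j']) if window != ""
                        }

                        * Count each window once per id, then count the ids in which it appears
                        egen byte onceperid = tag(id window) if window != ""
                        bysort window: egen nidsWith = total(onceperid)
                        gen byte shared = (window != "") & (nidsWith == `nids')

                        sort id time
                        list id time window if shared, noobs
                        ```

                        Here only the window 2 4 0 appears for both ids. Wrapping this in a loop over k and stopping at the first k with no shared window would give the largest length, since a window shared by every id implies that all of its shorter prefixes are shared too.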


                        • #13
                          I think this will do what you require. Note that it found an extra sequence: 3§3§4§4§3

                          Code:
                          clear
                          input byte(id choice time)
                          1 2  1
                          1 4  2
                          1 0  3
                          1 1  4
                          1 1  5
                          1 2  6
                          1 1  7
                          1 1  8
                          1 0  9
                          1 4 10
                          1 4 11
                          1 0 12
                          1 1 13
                          1 2 14
                          1 3 15
                          1 0 16
                          1 1 17
                          1 4 18
                          1 2 19
                          1 2 20
                          1 2 21
                          1 0 22
                          1 3 23
                          1 3 24
                          1 4 25
                          1 2 26
                          1 1 27
                          1 2 28
                          1 4 29
                          1 2 30
                          1 3 31
                          1 2 32
                          1 0 33
                          1 3 34
                          1 4 35
                          1 3 36
                          1 4 37
                          1 4 38
                          1 2 39
                          1 1 40
                          1 2 41
                          1 0 42
                          1 2 43
                          1 1 44
                          1 2 45
                          1 4 46
                          1 4 47
                          1 4 48
                          1 0 49
                          1 3 50
                          1 2 51
                          1 2 52
                          1 0 53
                          1 1 54
                          1 3 55
                          1 3 56
                          1 4 57
                          1 4 58
                          1 3 59
                          1 2 60
                          1 2 61
                          1 4 62
                          1 0 63
                          1 4 64
                          1 3 65
                          1 0 66
                          1 0 67
                          1 0 68
                          1 1 69
                          1 0 70
                          1 4 71
                          1 3 72
                          1 1 73
                          1 3 74
                          1 0 75
                          1 3 76
                          1 3 77
                          1 0 78
                          1 1 79
                          1 2 80
                          1 1 81
                          1 3 82
                          1 0 83
                          1 3 84
                          1 4 85
                          1 3 86
                          1 4 87
                          1 2 88
                          1 2 89
                          1 4 90
                          1 4 91
                          1 2 92
                          1 1 93
                          1 4 94
                          1 0 95
                          1 0 96
                          1 1 97
                          1 2 98
                          1 3 99
                          2 2  1
                          2 4  2
                          2 3  3
                          2 0  4
                          2 3  5
                          2 4  6
                          2 4  7
                          2 3  8
                          2 3  9
                          2 3 10
                          2 0 11
                          2 1 12
                          2 3 13
                          2 0 14
                          2 4 15
                          2 0 16
                          2 3 17
                          2 3 18
                          2 4 19
                          2 4 20
                          2 3 21
                          2 3 22
                          2 2 23
                          2 3 24
                          2 2 25
                          2 4 26
                          2 3 27
                          2 3 28
                          2 1 29
                          2 0 30
                          2 3 31
                          2 3 32
                          2 3 33
                          2 0 34
                          2 2 35
                          2 1 36
                          2 1 37
                          2 4 38
                          2 3 39
                          2 4 40
                          2 2 41
                          2 4 42
                          2 0 43
                          2 1 44
                          2 3 45
                          2 3 46
                          2 1 47
                          2 2 48
                          2 1 49
                          2 2 50
                          2 1 51
                          2 3 52
                          2 1 53
                          2 3 54
                          2 4 55
                          2 4 56
                          2 2 57
                          2 4 58
                          2 2 59
                          2 4 60
                          2 0 61
                          2 4 62
                          2 3 63
                          2 4 64
                          2 0 65
                          2 3 66
                          2 1 67
                          2 4 68
                          2 4 69
                          2 3 70
                          2 1 71
                          2 3 72
                          2 2 73
                          2 0 74
                          2 0 75
                          2 3 76
                          2 2 77
                          2 4 78
                          2 3 79
                          2 0 80
                          2 2 81
                          2 1 82
                          2 4 83
                          2 1 84
                          2 3 85
                          2 4 86
                          2 3 87
                          2 2 88
                          2 0 89
                          end
                          
                          * Turn choices to string
                          gen choiceString = strofreal(choice)
                          drop choice
                          rename choiceString choice
                          
                          * Program
                          cap program drop comboCounter
                          program define comboCounter, rclass
                              syntax, comboLength(integer) choice(varname) by(varlist)
                              
                              local loopLength = `comboLength' - 1    // E.g. for comboLength of 4, we need choice[_n+0] up to choice[_n+3]
                              local comboCode = "`choice'[_n+0]"
                              forvalues i = 1/`loopLength' {
                                  local comboCode = `"`comboCode' + "§" + `choice'[_n+`i']"'
                              }
                              
                              if "`by'" != "" {
                                  sort `by', stable    // a stable sort keeps the current (time) order within each group
                                  qui by `by': gen combo`comboLength' = `comboCode' if _n <= _N - `comboLength'        // The if-condition excludes observations at the end of each group that cannot start a full spell (as written, it also skips the last full spell, which starts at _N - comboLength + 1)
                              }
                              else {
                                  qui gen combo`comboLength' = `comboCode' if _n <= _N - `comboLength'
                              }
                              
                              tempvar one
                              gen `one' = 1
                              gegen count`comboLength' = total(`one') if !missing(combo`comboLength'), by(combo`comboLength' `by')
                              
                              qui count if count`comboLength' > 1 & count`comboLength' != .        // number of observations that start a spell occurring at least twice
                              
                              return scalar repeatedSpellCount = r(N)
                          end
                          
                          * Program: count spells of a given length that are present in every by-group (id)
                          cap program drop comboIDCounter
                          program define comboIDCounter, rclass
                              syntax, comboLength(integer) choice(varname) by(varlist)
                              
                              local loopLength = `comboLength' - 1    // E.g. for comboLength of 4, we need choice[_n+0] up to choice[_n+3]
                              local comboCode = "`choice'[_n+0]"
                              forvalues i = 1/`loopLength' {
                                  local comboCode = `"`comboCode' + "§" + `choice'[_n+`i']"'
                              }
                              
                              if "`by'" != "" {
                                  qui bysort `by': gen combo`comboLength' = `comboCode' if _n <= _N - `comboLength' + 1        // Skip observations near the end of each group that cannot start a full spell; the last full spell starts at _N - comboLength + 1
                              }
                              else {
                                  qui gen combo`comboLength' = `comboCode' if _n <= _N - `comboLength' + 1
                              }
                              
                              tempvar one scaled
                              gen `one' = 1
                              gegen count`comboLength' = total(`one') if !missing(combo`comboLength'), by(combo`comboLength' `by')
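                              qui gen double `scaled' = 1/count`comboLength'        // the rows of each (spell, id) pair sum to 1
                              
                              qui gegen presentInIdCount`comboLength' = total(`scaled'), by(combo`comboLength')
                              qui replace presentInIdCount`comboLength' = round(presentInIdCount`comboLength')        // guard the equality test below against floating-point error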
                              qui gunique `by'
                              local idCount = r(unique)
                              
                              qui gunique combo`comboLength' if presentInIdCount`comboLength' == `idCount'
                              
                              return scalar universalSpellCount = r(unique)
                              return scalar idCount = `idCount'
                          end
                              
                          * Find largest sequence
                          local keepGoing = "true"
                          local counter = 2
                          while "`keepGoing'" == "true" {
                              cap drop combo* count*
                              comboCounter, comboLength(`counter') choice(choice) by(id)
                              di "Repeated spell count for spells of length `counter': " r(repeatedSpellCount)
                              
                              if r(repeatedSpellCount) == 0 {
                                  local keepGoing "false"
                                  local previousCounter = `counter' - 1
                                  di _newline "Maximum length with repeated spells: `previousCounter'" _newline
                                  
                                  drop combo* count*
                                  comboCounter, comboLength(`previousCounter') choice(choice) by(id)
                                  list if (count`previousCounter' > 1) & (count`previousCounter' != .)
                                  
                              }
                              local counter = `counter' + 1
                          }
                          
                          * Find largest sequence observed in each ID
                          local keepGoing = "true"
                          local counter = 2
                          while "`keepGoing'" == "true" {
                              cap drop combo* count* presentInIdCount*
                              comboIDCounter, comboLength(`counter') choice(choice) by(id)
                              di "Spells present in each ID, length `counter': " r(universalSpellCount)
                              
                              if r(universalSpellCount) == 0 {
                                  local keepGoing "false"
                                  local previousCounter = `counter' - 1
                                  di _newline "Maximum length of universal spells: `previousCounter'" _newline
                                  
                                  drop combo* count* presentInIdCount*
                                  comboIDCounter, comboLength(`previousCounter') choice(choice) by(id)
                                  list if presentInIdCount`previousCounter' == r(idCount)
                                  
                              }
                              local counter = `counter' + 1
                          }
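                          
                          To see the windowing idea behind the two programs in isolation, here is a minimal sketch on a toy series. The toy data, the variable names key and n, and the window length of 3 are assumptions made for illustration only; it uses built-in commands rather than gtools and is not meant as a replacement for the programs above.

                          ```stata
                          clear
                          input byte(id time choice)
                          1 1 0
                          1 2 4
                          1 3 1
                          1 4 0
                          1 5 4
                          1 6 1
                          end

                          sort id time
                          local k = 3                                  // window length, chosen for illustration
                          * Start each key with the current choice, then append the next k-1 choices
                          gen key = string(choice)
                          forvalues i = 1/`=`k'-1' {
                              by id: replace key = key + "-" + string(choice[_n+`i'])
                          }
                          * Keep keys only where a full window fits: the last one starts at _N - k + 1
                          by id: replace key = "" if _n > _N - `k' + 1
                          * Number of occurrences of each window within an id
                          bysort id key: gen n = _N if key != ""
                          sort id time
                          list, sepby(id)
                          ```

                          In this toy series the window 0-4-1 starts at time 1 and again at time 4, so n is 2 on those two rows and 1 on the other two full windows. Repeating the same steps with k increased by one each time, until no window has n > 1, should mirror the search loop above.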



                          • #14
                            Code:
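                            * Number of observations per id, and the first window column
                            bys id (time): gen CountChoice = _N
                            gen v1 = choice
                            
                            * v2, v3, ... hold the choices 1, 2, ... steps ahead within each id
                            sum CountChoice, meanonly
                            forval i = 1/`r(min)' {
                                by id: gen v`=`i'+1' = v`i'[_n+1]
                            }
                            
                            levelsof id
                            local NumID = `r(r)'
                                
                            * Lengthen the window while some window of that length still occurs in every id
                            local j = 0
                            while `r(max)' + 1 ==  `NumID' + 1 | `j' == 0 {
                                * Running count of distinct ids within each group of identical windows
                                bys v1-v`++j' (id): gen CountID`j' = sum(id != id[_n-1])
                                sum CountID`j', meanonly
                            }
                            
                            * The loop stops at the first length j with no common window, so the answer is j-1
                            bys v1-v`=`j'-1' (CountID`=`j'-1'): gen Length = `j'-1 if CountID`=`j'-1'[_N] == `NumID'
                            egen Sequence = concat(v1-v`=`j'-1') if Length !=., p(-)
                            
                            sort id time
                            drop v* Count*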
                            Output:
                            Code:
                            . list if Length!=., sepby(id) noobs
                            
                              +-----------------------------------------+
                              | id   choice   time   Length    Sequence |
                              |-----------------------------------------|
                              |  1        3     55        5   3-3-4-4-3 |
                              |  1        2     61        5   2-4-0-4-3 |
                              |-----------------------------------------|
                              |  2        3     17        5   3-3-4-4-3 |
                              |  2        2     59        5   2-4-0-4-3 |
                              +-----------------------------------------+
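                            
                            The step that counts how many ids share a window relies on a running sum inside a by-group. A minimal sketch of that idiom on toy data follows; the variable nid and the two-column windows are assumptions for illustration, not part of the code above.

                            ```stata
                            clear
                            input byte(id v1 v2)
                            1 3 4
                            1 0 2
                            2 3 4
                            2 3 4
                            2 1 1
                            end

                            * Within each (v1, v2) group sorted by id, the running sum rises by 1 whenever a new id starts
                            bysort v1 v2 (id): gen nid = sum(id != id[_n-1])
                            * The last value in the group is the number of distinct ids sharing that window
                            by v1 v2: replace nid = nid[_N]
                            list, sepby(v1 v2)
                            ```

                            Here the window (3, 4) appears in both ids, so nid is 2 on its three rows, while (0, 2) and (1, 1) each get 1. Comparing that per-window count with the total number of ids is what flags a window as common to all of them.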



                            • #15
                              I just want to express my appreciation. The code in #13 and #14 is amazing.

