Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Get the number of unique visits by Districts

    Dear All

    Can you all help with a code to get the number of unique visits in each district. For example I have a dataset where the number of rows is the number of places visited for each district but for the same visit on the same month the number of rows is more than one since the places are different. How can I use the information on day month and year to get the number of unique visits made by a person. So for example if the year is 2015, month is 10 and dates are 16, 18, 20, 23.. the number of visit should be one taking a threshold of 5 days. that is If the data of visit for the next place is within 5 days or less it should be counted as 1 visit and if its more than 5 then 2 visits. Can you help me with a code to do that.

    example of my data set

    copy starting from the next line ------ ----------------
    Code:
    * Example generated by -dataex-. To install: ssc install    dataex
    clear
    input byte(district q1) int(q2 q3)
    1 28 12 2016
    1  9  5 2018
    2  3  3 2007
    2 26 10 2015
    2 15  9 2015
    2 19 10 2015
    2 26 10 2015
    2 25 10 2015
    2 21 10 2015
    2 18 10 2015
    2 18 10 2015
    2 23 10 2015
    2 16 10 2015
    2 19 10 2015
    2 19 10 2015
    2  9 10 2019
    3 21 11 2012
    3 22 11 2012
    3 21 11 2012
    3 30 11 2013
    3 30  7 2013
    3  1  4 2014
    3  1  4 2014
    4 25 10 2016
    4 16 10 2016
    4  8  8 2018
    4  6  7 2018
    5 20  9 2011
    5 23 12 2017
    5  7  9 2017
    5 19 12 2017
    5 17 12 2017
    5 14 10 2017
    5 22 12 2017
    5 13 10 2017
    5 22 12 2017
    5 15  9 2017
    5 21 12 2017
    5 16 12 2017
    5 20 12 2017
    6 12  3 2008
    6 31  3 2009
    6 24  3 2009
    6 14  3 2009
    6 27  3 2009
    6 30  3 2009
    6 30  3 2009
    6 12  3 2009
    6 26  3 2009
    6 25  3 2009
    6 28  3 2009
    6 26  3 2009
    6 11  3 2009
    6 25  3 2009
    6 16  3 2009
    6 25  3 2009
    6 29  3 2009
    6 13  3 2009
    6 25  3 2009
    6 31  3 2009
    6 12  3 2009
    6 26  3 2009
    6 14  3 2009
    6 17  3 2009
    6 25  3 2009
    6 14  3 2009
    6 14  3 2009
    6 14  3 2009
    6 15  3 2009
    6 21  1 2015
    6 21  1 2015
    6 21  1 2015
    6 13  6 2016
    6 13  5 2018
    6 14  5 2018
    6 13  5 2018
    6 12  5 2018
    6 11  5 2018
    7 12  6 2009
    7 13  6 2009
    7 13  6 2009
    7 12  6 2009
    7 14  6 2009
    7 11  6 2009
    7 14  6 2009
    7 11 11 2009
    7 10  1 2010
    7 20  3 2010
    7  5  7 2014
    7  5  2 2014
    7 22  6 2016
    7 22  4 2017
    7 21  4 2017
    7 16  1 2018
    7  8  1 2018
    7 10  5 2018
    7 14  1 2018
    7 21  8 2018
    7 27  5 2018
    7 13  1 2018
    end
    label values district dcode

  • #2
    It appears that no one wants a piece of this. Check for logical errors in my code.

    Code:
    *CREATE STATA DATE
    gen date = mdy(q2, q1, q3)
    format date %td
    *COUNT DIFFERENCE IN DAYS BETWEEN VISITS TO THE SAME DISTRICT
    bys district (date): gen diff= date- date[_n-1] if _n>1
    *DIFFERENCES LONGER THAN 5 DAYS ARE NOT USEFUL FOR THE TASK AT HAND
    replace diff=. if diff>5
    *ADD UP CONSECUTIVE DIFFERENCES OF LESS THAN 5 DAYS
    bys district(date):replace diff= diff[_n-1]+diff if !missing(diff) &!missing(diff[_n-1])
    *ASSIGN VALUE 0 TO LESS THAN 4 DAYS, 1 T0 5-9, 2 TO 10-14, AND SO ON
    gen tag = floor(diff/5)
    *NOW COUNT ONLY DISTINCT VALUES OF THE ABOVE EXCLUDING 0
    bys district (date): gen fob= tag!=tag[_n-1] & tag<. & tag>0
    *BRING BACK THE LENGTHS > 5 DAYS
    replace fob=1 if missing(diff)
    *ADD UP TO GET WANTED VARIABLE
    bys district(date):egen wanted= total(fob)
    Res.:

    Code:
    . l, sepby(dis)
    
         +-------------------------------------------------------------------+
         | district   q1   q2     q3        date   diff   tag   fob   wanted |
         |-------------------------------------------------------------------|
      1. |        1   28   12   2016   28dec2016      .     .     1        2 |
      2. |        1    9    5   2018   09may2018      .     .     1        2 |
         |-------------------------------------------------------------------|
      3. |        2    3    3   2007   03mar2007      .     .     1        6 |
      4. |        2   15    9   2015   15sep2015      .     .     1        6 |
      5. |        2   16   10   2015   16oct2015      .     .     1        6 |
      6. |        2   18   10   2015   18oct2015      2     0     0        6 |
      7. |        2   18   10   2015   18oct2015      2     0     0        6 |
      8. |        2   19   10   2015   19oct2015      3     0     0        6 |
      9. |        2   19   10   2015   19oct2015      3     0     0        6 |
     10. |        2   19   10   2015   19oct2015      3     0     0        6 |
     11. |        2   21   10   2015   21oct2015      5     1     1        6 |
     12. |        2   23   10   2015   23oct2015      7     1     0        6 |
     13. |        2   25   10   2015   25oct2015      9     1     0        6 |
     14. |        2   26   10   2015   26oct2015     10     2     1        6 |
     15. |        2   26   10   2015   26oct2015     10     2     0        6 |
     16. |        2    9   10   2019   09oct2019      .     .     1        6 |
         |-------------------------------------------------------------------|
     17. |        3   21   11   2012   21nov2012      .     .     1        4 |
     18. |        3   21   11   2012   21nov2012      0     0     0        4 |
     19. |        3   22   11   2012   22nov2012      1     0     0        4 |
     20. |        3   30    7   2013   30jul2013      .     .     1        4 |
     21. |        3   30   11   2013   30nov2013      .     .     1        4 |
     22. |        3    1    4   2014   01apr2014      .     .     1        4 |
     23. |        3    1    4   2014   01apr2014      0     0     0        4 |
         |-------------------------------------------------------------------|
     24. |        4   16   10   2016   16oct2016      .     .     1        4 |
     25. |        4   25   10   2016   25oct2016      .     .     1        4 |
     26. |        4    6    7   2018   06jul2018      .     .     1        4 |
     27. |        4    8    8   2018   08aug2018      .     .     1        4 |
         |-------------------------------------------------------------------|
     28. |        5   20    9   2011   20sep2011      .     .     1        6 |
     29. |        5    7    9   2017   07sep2017      .     .     1        6 |
     30. |        5   15    9   2017   15sep2017      .     .     1        6 |
     31. |        5   13   10   2017   13oct2017      .     .     1        6 |
     32. |        5   14   10   2017   14oct2017      1     0     0        6 |
     33. |        5   16   12   2017   16dec2017      .     .     1        6 |
     34. |        5   17   12   2017   17dec2017      1     0     0        6 |
     35. |        5   19   12   2017   19dec2017      3     0     0        6 |
     36. |        5   20   12   2017   20dec2017      4     0     0        6 |
     37. |        5   21   12   2017   21dec2017      5     1     1        6 |
     38. |        5   22   12   2017   22dec2017      6     1     0        6 |
     39. |        5   22   12   2017   22dec2017      6     1     0        6 |
     40. |        5   23   12   2017   23dec2017      7     1     0        6 |
         |-------------------------------------------------------------------|
     41. |        6   12    3   2008   12mar2008      .     .     1        8 |
     42. |        6   11    3   2009   11mar2009      .     .     1        8 |
     43. |        6   12    3   2009   12mar2009      1     0     0        8 |
     44. |        6   12    3   2009   12mar2009      1     0     0        8 |
     45. |        6   13    3   2009   13mar2009      2     0     0        8 |
     46. |        6   14    3   2009   14mar2009      3     0     0        8 |
     47. |        6   14    3   2009   14mar2009      3     0     0        8 |
     48. |        6   14    3   2009   14mar2009      3     0     0        8 |
     49. |        6   14    3   2009   14mar2009      3     0     0        8 |
     50. |        6   14    3   2009   14mar2009      3     0     0        8 |
     51. |        6   15    3   2009   15mar2009      4     0     0        8 |
     52. |        6   16    3   2009   16mar2009      5     1     1        8 |
     53. |        6   17    3   2009   17mar2009      6     1     0        8 |
     54. |        6   24    3   2009   24mar2009      .     .     1        8 |
     55. |        6   25    3   2009   25mar2009      1     0     0        8 |
     56. |        6   25    3   2009   25mar2009      1     0     0        8 |
     57. |        6   25    3   2009   25mar2009      1     0     0        8 |
     58. |        6   25    3   2009   25mar2009      1     0     0        8 |
     59. |        6   25    3   2009   25mar2009      1     0     0        8 |
     60. |        6   26    3   2009   26mar2009      2     0     0        8 |
     61. |        6   26    3   2009   26mar2009      2     0     0        8 |
     62. |        6   26    3   2009   26mar2009      2     0     0        8 |
     63. |        6   27    3   2009   27mar2009      3     0     0        8 |
     64. |        6   28    3   2009   28mar2009      4     0     0        8 |
     65. |        6   29    3   2009   29mar2009      5     1     1        8 |
     66. |        6   30    3   2009   30mar2009      6     1     0        8 |
     67. |        6   30    3   2009   30mar2009      6     1     0        8 |
     68. |        6   31    3   2009   31mar2009      7     1     0        8 |
     69. |        6   31    3   2009   31mar2009      7     1     0        8 |
     70. |        6   21    1   2015   21jan2015      .     .     1        8 |
     71. |        6   21    1   2015   21jan2015      0     0     0        8 |
     72. |        6   21    1   2015   21jan2015      0     0     0        8 |
     73. |        6   13    6   2016   13jun2016      .     .     1        8 |
     74. |        6   11    5   2018   11may2018      .     .     1        8 |
     75. |        6   12    5   2018   12may2018      1     0     0        8 |
     76. |        6   13    5   2018   13may2018      2     0     0        8 |
     77. |        6   13    5   2018   13may2018      2     0     0        8 |
     78. |        6   14    5   2018   14may2018      3     0     0        8 |
         |-------------------------------------------------------------------|
     79. |        7   11    6   2009   11jun2009      .     .     1       13 |
     80. |        7   12    6   2009   12jun2009      1     0     0       13 |
     81. |        7   12    6   2009   12jun2009      1     0     0       13 |
     82. |        7   13    6   2009   13jun2009      2     0     0       13 |
     83. |        7   13    6   2009   13jun2009      2     0     0       13 |
     84. |        7   14    6   2009   14jun2009      3     0     0       13 |
     85. |        7   14    6   2009   14jun2009      3     0     0       13 |
     86. |        7   11   11   2009   11nov2009      .     .     1       13 |
     87. |        7   10    1   2010   10jan2010      .     .     1       13 |
     88. |        7   20    3   2010   20mar2010      .     .     1       13 |
     89. |        7    5    2   2014   05feb2014      .     .     1       13 |
     90. |        7    5    7   2014   05jul2014      .     .     1       13 |
     91. |        7   22    6   2016   22jun2016      .     .     1       13 |
     92. |        7   21    4   2017   21apr2017      .     .     1       13 |
     93. |        7   22    4   2017   22apr2017      1     0     0       13 |
     94. |        7    8    1   2018   08jan2018      .     .     1       13 |
     95. |        7   13    1   2018   13jan2018      5     1     1       13 |
     96. |        7   14    1   2018   14jan2018      6     1     0       13 |
     97. |        7   16    1   2018   16jan2018      8     1     0       13 |
     98. |        7   10    5   2018   10may2018      .     .     1       13 |
     99. |        7   27    5   2018   27may2018      .     .     1       13 |
    100. |        7   21    8   2018   21aug2018      .     .     1       13 |
         +-------------------------------------------------------------------+
    Last edited by Andrew Musau; 21 Nov 2019, 10:50.

    Comment


    • #3
      For me, the logic itself is a very straight one. Or I misunderstand something here?
      Code:
      gen date = mdy(q2, q1, q3)
      bys district (date): egen count = total((date-date[_n-1]) > 5)
      Last edited by Romalpa Akzo; 21 Nov 2019, 12:00.

      Comment


      • #4
        Romalpa Akzo Note that the help for egen explicitly advises against use of subscripts. I haven't time to read this thread carefully but in case it is pertinent I flag the existence of panelthin on SSC.

        Comment


        • #5
          You may be right Romalpa Akzo that that's what is intended. The OP states

          ...5 days or less it should be counted as 1 visit and if its more than 5 then 2 visits.
          My code calculates the cumulative length between (multiple) visits whereas yours is the absolute length,

          Comment


          • #6
            Thanks, Nick Cox sensei, for your advice.

            1. Given the explicit caution in the help of egen against the use of subscripts (as pointed out by Nick), the code in #3 should be modified as below:
            Code:
            gen date = mdy(q2, q1, q3)
            bys district (date): gen tag = (date-date[_n-1]) > 5
            bys district (date): egen count = total(tag)
            An alternative could be:
            Code:
            *gen date = mdy(q2, q1, q3)
            bys district (date): gen count2 = sum((date-date[_n-1]) > 5)
            bys district (date): replace count2 = count2[_N]
            Notice: Despite the code in #3 seems working (at least for the given sample), this modification is necessary to avoid any unexpected output.

            2. Also thanks for your introduction on panelthin, which is found quite interesting and useful. I have tried it for this specific issue, noticing that it is not applicable (at least for the given sample) due to the repeated time values within panel.

            3. Andrew Musau, I got your point. My solution is just a guess from what is explained in #1. Hope it could give Anna a hint.

            Comment


            • #7
              Thank u so much everyone .its working just fine

              Comment

              Working...
              X