  • How to count occurrence of string variables across row

    I am trying to count the number of responses recorded across a row of variables. Here is the example data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(country pid) byte wave double(q04_05_1 q04_05_2 q04_05_3 q04_05_4)
    51 5100006 1 1 2 3  4
    51 5100006 3 . . .  .
    51 5100006 2 . . .  .
    51 5100007 1 1 . .  .
    51 5100007 2 . . .  .
    51 5100007 3 . . .  .
    51 5100014 1 1 5 6 15
    51 5100014 2 . . .  .
    51 5100015 1 1 . .  .
    51 5100017 1 5 6 .  .
    end
    label values country Country
    label def country 51 "Peru", modify
    label values q04_05_1 q04_05_1
    label def q04_05_1 1 "advertir a los ciudadanos que se queden en casa", modify
    label def q04_05_1 5 "toque de queda / estado de emergenci / restricciÓn de salidas", modify
    label values q04_05_2 q04_05_2
    label def q04_05_2 2 "restringir los viajes dentro del paÍs", modify
    label def q04_05_2 5 "toque de queda / estado de emergenci / restricciÓn de salidas", modify
    label def q04_05_2 6 "cerrar los negocios o establecimientos no esenciales", modify
    label values q04_05_3 q04_05_3
    label def q04_05_3 3 "restringir los viajes internacionales / cierre de fronteras", modify
    label def q04_05_3 6 "cerrar los negocios o establecimientos no esenciales", modify
    label values q04_05_4 q04_05_4
    label def q04_05_4 4 "cerrar las escuelas y universidades", modify
    label def q04_05_4 15 "difundir medidas de reduccion del riesgo de contagio", modify
    I would like to create a new variable, say countmeasure, which is equal to 4 for pid 5100006 in wave 1 (because there are 4 responses, irrespective of what they are).

    I thought I might be able to use something like
    Code:
    egen countmeasure = anyvalue(q04_05_*)    if    q04_05_* != "."
    but it gives me an "invalid name" error.

    Please can anyone help? Thanks.

  • #2
    Chisom:
    If you're interested in -pid- 5100006 only:
    Code:
    gen wanted=4 if pid==5100006 & wave==1
    If you're interested in all -pid-:
    Code:
    bysort pid: gen wanted=4 if wave==1
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      There are no string variables here so testing for equality with strings will fail. It's just that your code failed first for another reason.

      anyvalue() is wrong as an egen function here: It takes a single variable name as argument and looks for one or more values inside. You supplied a varlist, which is why the code failed.
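
For comparison, a minimal sketch of how anyvalue() is meant to be called: one variable name plus a values() option listing the integers of interest. The tiny dataset and the variable name has1or5 are made up for illustration and are not part of the original question.

```stata
* Hypothetical three-row example; has1or5 is an illustrative name
clear
input double(q04_05_1)
1
.
5
end

* anyvalue() takes ONE variable; values() lists the integers to keep.
* The result copies q04_05_1 where it is 1 or 5 and is missing otherwise.
egen has1or5 = anyvalue(q04_05_1), values(1 5)
list
```

This returns the matching values of a single variable. It does not count anything across variables, which is why it does not fit the problem here.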

      What you need is the number of distinct values across a range of variables. There is no official egen function which will do that for you.
      See https://www.stata-journal.com/articl...article=pr0046 and especially Section 7. Then install the code with
      Code:
      ssc install egenmore
      With your data example (thanks!) I get this
      Code:
      egen wanted = rownvals(q04_05_?)

      list, sepby(pid) nola

           +-------------------------------------------------------------------------------+
           | country       pid   wave   q04_05_1   q04_05_2   q04_05_3   q04_05_4   wanted |
           |-------------------------------------------------------------------------------|
        1. |      51   5100006      1          1          2          3          4        4 |
        2. |      51   5100006      3          .          .          .          .        0 |
        3. |      51   5100006      2          .          .          .          .        0 |
           |-------------------------------------------------------------------------------|
        4. |      51   5100007      1          1          .          .          .        1 |
        5. |      51   5100007      2          .          .          .          .        0 |
        6. |      51   5100007      3          .          .          .          .        0 |
           |-------------------------------------------------------------------------------|
        7. |      51   5100014      1          1          5          6         15        4 |
        8. |      51   5100014      2          .          .          .          .        0 |
           |-------------------------------------------------------------------------------|
        9. |      51   5100015      1          1          .          .          .        1 |
           |-------------------------------------------------------------------------------|
       10. |      51   5100017      1          5          6          .          .        2 |
           +-------------------------------------------------------------------------------+
      Note that
      Code:
      if q04_05_* != "."
      is fantasy syntax, as Stata doesn't support wildcards in true or false expressions. But checking for missings -- even numeric missings -- is not needed here, and worse, contrary to the spirit of what (I imagine) you want. In particular, you shouldn't want to ignore observations with any missing values, which is what your syntax would do if it worked at all.
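
If the goal is simply the number of answers given in each row, counting a repeated code every time it appears, the official egen function rownonmiss() may be closer to what is wanted. The sketch below is only an illustration: the variable name nanswered is made up, and the fourth row (the same code entered twice) is a hypothetical case that does not occur in the posted data.

```stata
* Hypothetical data; the last row repeats code 5 on purpose
clear
input double(q04_05_1 q04_05_2 q04_05_3 q04_05_4)
1 2 3 4
. . . .
1 . . .
5 5 . .
end

* rownonmiss() counts nonmissing values across the varlist, row by row.
* No if qualifier is needed: all-missing rows simply get 0.
egen nanswered = rownonmiss(q04_05_?)
list
```

On the data posted in #1 no row repeats a code, so counting nonmissing values and counting distinct values should agree there. The two only differ when the same code appears more than once in a row.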





      • #4
        Hi Nick,

        Thanks a lot.
        Your solution:
        Code:
         
         egen wanted = rownvals(q04_05_?)
        worked perfectly for me.



        • #5
          Here is what the results should look like


          Code:
          egen wanted = rownvals(q04_05_?)
          
          list, sepby(pid) nola
          
               +-------------------------------------------------------------------------------+
               | country       pid   wave   q04_05_1   q04_05_2   q04_05_3   q04_05_4   wanted |
               |-------------------------------------------------------------------------------|
            1. |      51   5100006      1          1          2          3          4        4 |
            2. |      51   5100006      3          .          .          .          .        0 |
            3. |      51   5100006      2          .          .          .          .        0 |
               |-------------------------------------------------------------------------------|
            4. |      51   5100007      1          1          .          .          .        1 |
            5. |      51   5100007      2          .          .          .          .        0 |
            6. |      51   5100007      3          .          .          .          .        0 |
               |-------------------------------------------------------------------------------|
            7. |      51   5100014      1          1          5          6         15        4 |
            8. |      51   5100014      2          .          .          .          .        0 |
               |-------------------------------------------------------------------------------|
            9. |      51   5100015      1          1          .          .          .        1 |
               |-------------------------------------------------------------------------------|
           10. |      51   5100017      1          5          6          .          .        2 |
               +-------------------------------------------------------------------------------+
