
  • using the presence of a variable in a list as a condition

    Suppose I have four variables: age, income, race, and county. I want to summarize age and income, and tabulate race and county. Obviously there are easier ways to do this, but in the context of the problem I'm trying to solve, I need to create a list of variable names, and summarize one list, and tabulate the other list, all within a loop that runs through all variables. Here is the idea:

    Code:
    foreach var in age income race county {
         loc sumvars age income
         loc tabvars race county
         */ sum `var' if `var' is contained in sumvars /*
         */ tab `var' if `var' is contained in tabvars /*
         }
    I've tried using "if" as both a qualifier and a command (still a little shaky on which to use in certain contexts), and I have been trying to use the inlist() command, but I'm still not getting what I want, so I must either be using it wrong or I'm barking up the wrong tree. I think my problem is just a matter of some random syntax I was never taught and can't find online, but nonetheless I'm stuck.

  • #2
    Code:
    capture summarize `var'
    will do nothing if `var' is not a variable in the dataset, and similarly for other commands applied to variables that may or may not exist.

    It's fine not to know that. It's not so fine to dismiss it as "random syntax" because you do not know that.
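
    A hedged sketch of that idea, run on the auto dataset rather than the data in #1. Because capture also suppresses the output of the command it prefixes, this version applies capture only to confirm variable and then checks _rc before summarizing. The name nosuchvar is a made-up placeholder for a variable that does not exist.

    ```stata
    sysuse auto, clear
    foreach var in price nosuchvar weight {
        // confirm variable errors out if the variable is absent;
        // capture traps that error and stores the return code in _rc
        capture confirm variable `var'
        if _rc == 0 {
            summarize `var'
        }
    }
    ```

    Only price and weight are summarized; the loop moves past nosuchvar without stopping.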



    • #3
      The pseudo-code in #1 is complicated, and doesn't make a lot of sense to me. Focusing just on your verbal description of the problem, it seems to me that all you need is:

      Code:
      local summvars age income
      local tabvars race county
      
      summ `summvars'
      tab1 `tabvars'
      Last edited by Clyde Schechter; 02 May 2022, 10:29.



      • #4
        I do not pretend to understand what you are trying to accomplish with your code, but your acknowledgement that there are simpler approaches suggests that perhaps you have oversimplified your statement of the problem. So with that in mind, the following code demonstrates a working version of the code in post #1.
        Code:
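        sysuse auto, clear
        local sumvars price weight
        local tabvars foreign rep78
        foreach var in price weight foreign rep78 {
            // intersect the one-element list var with each list;
            // the result is empty when var is not a member
            local vs : list var & sumvars
            local vt : list var & tabvars
            // summarize only if var was found in sumvars
            if "`vs'"!="" {
                display _newline "===== `vs' ====="
                summarize `vs'
            }
            // tabulate only if var was found in tabvars
            if "`vt'"!="" {
                display _newline "===== `vt' ====="
                tab `vt'
            }
        }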
        Output:
        ===== price =====
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
               price |         74    6165.257    2949.496       3291      15906
        
        ===== weight =====
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
              weight |         74    3019.459    777.1936       1760       4840
        
        ===== foreign =====
        
         Car origin |      Freq.     Percent        Cum.
        ------------+-----------------------------------
           Domestic |         52       70.27       70.27
            Foreign |         22       29.73      100.00
        ------------+-----------------------------------
              Total |         74      100.00
        
        ===== rep78 =====
        
             Repair |
        record 1978 |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  1 |          2        2.90        2.90
                  2 |          8       11.59       14.49
                  3 |         30       43.48       57.97
                  4 |         18       26.09       84.06
                  5 |         11       15.94      100.00
        ------------+-----------------------------------
              Total |         69      100.00
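
        If the output has to follow the order of the variables in the dataset, one possible variation on the loop above is to iterate over every variable with foreach var of varlist _all and test membership with the in macro list operator, which returns 1 or 0. This is only a sketch under the same assumptions as the example above (the auto dataset and the same two lists); variables that appear in neither list produce no output.

        ```stata
        sysuse auto, clear
        local sumvars price weight
        local tabvars foreign rep78
        // loop over all variables in the order they are stored in the dataset
        foreach var of varlist _all {
            // each flag is 1 if var is a member of the list, 0 otherwise
            local insum : list var in sumvars
            local intab : list var in tabvars
            if `insum' {
                summarize `var'
            }
            else if `intab' {
                tabulate `var'
            }
        }
        ```

        With the auto dataset this should give price, rep78, weight, and foreign in that order, since that is how they are stored.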



        • #5
          Originally posted by Clyde Schechter
          The pseudo-code in #1 is complicated, and doesn't make a lot of sense to me. Focusing just on your verbal description of the problem, it seems to me that all you need is:

          Code:
          local summvars age income
          local tabvars race county
          
          summ `summvars'
          tab `tabvars'
          Sorry about the complicated pseudo-code. In an attempt to simplify and shorten my problem, I left out info which might've been helpful.

          In reality I am dealing with a list of about 100 variables, and within that I have a list of about 70 that need to be summarized, 20 that need to be tabulated, and 10 that I don't want any output for. It would be incredibly helpful if the output could be generated in the order that the 100 variables appear in the dataset. So the output would contain a summary and then a summary and then a tabulation and then a summary, and so on. I started with code almost identical to yours, but it provided all the summaries and then all the tabulations.

          The reason for the contrived loop is that I thought that by looping through the variables in order, and telling Stata whether to summarize or tab depending on the variable's presence in a list, I could get the proper output for each variable and the proper ordering.

          Hopefully that makes sense and I appreciate the help.



          • #6
            I think that is going to work, William. Greatly appreciated! Well done deciphering my over-simplification.
