  • Summary statistics and missing values for a local in a loop

    Hi. I have the following local, which comprises some continuous variables and some categorical variables. I need to generate summary statistics (mean and SD), along with the number of missing observations, for each of these by a group variable “armcd”. I can do it for each variable individually as shown below, but was wondering if there is any way I can do this through a loop to generate the summary stats and the number of missing values, and export them to excel.

    Code:
    local chars racecat1 age stkcat1n sscat2n sexn hisstktian medschcatn rucacat2n dc_ref dc_refhh pcpfln dualcatn medhisdpn medhishtn medhissmn anycvdn
    bysort armcd: sum age
    bysort armcd: tab racecat1n

  • #2
    By "excel" I imagine you mean MS Excel. I don't advise on export to that application, but your question is still a bit fuzzy otherwise, because I sense that you may want to push "continuous" variables through summarize and "categorical" variables through tabulate.

    But you could do something like this


    Code:
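    * the local chars must be defined in the same do-file or session
    sort armcd
    foreach v in `chars' {
        * mean and SD within each level of armcd
        by armcd : su `v'
        * number of missing values within each level of armcd
        by armcd : count if missing(`v')
    }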


    • #3
      Nick Cox Thanks for your quick response. You are right, I do want to push "continuous" variables through summarize and "categorical" variables through tabulate. Is there any way for Stata to identify which variables are continuous and which are categorical in the loop, and use summarize or tabulate accordingly?

      And if not excel, what would you advise exporting to, and how might one do that in this loop?


      • #4
        The question is backwards. What is your criterion for distinguishing categorical variables from any other kind? Given such a criterion, there could be corresponding code.

        Categorical and continuous aren't Stata properties, except indirectly in terms of rules on factor variables. Neither storage type nor the number of distinct values nor the existence of value labels is necessarily diagnostic.
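
        To make the point about criteria concrete, here is a minimal sketch that assumes one particular rule: a variable is treated as categorical if it is a string or has a value label attached, and as continuous otherwise. That rule is purely an assumption for illustration and, as noted above, is not necessarily diagnostic for any given dataset. The local chars is assumed to be defined as in #1.

        ```stata
        * ASSUMED rule (hypothetical): string variable or value label attached => categorical
        foreach v of local chars {
            * check whether the variable is a string
            capture confirm string variable `v'
            local isstring = (_rc == 0)

            * name of the attached value label, empty if there is none
            local lbl : value label `v'

            if `isstring' | "`lbl'" != "" {
                * categorical: frequencies by group, with missing values shown as a category
                bysort armcd : tabulate `v', missing
            }
            else {
                * continuous: mean and SD by group, then the number of missing values by group
                bysort armcd : summarize `v'
                by armcd : count if missing(`v')
            }
        }
        ```

        Any other rule (for example, a threshold on the number of distinct values) could replace the condition in the if statement; the rest of the loop would stay the same.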


        "I don't advise on export to that application" meant "I don't offer advice ...." because there are so many ways to do it, and I don't use MS Excel routinely anyway. Sorry that was unclear, but "advise on" are the key words.
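
        As one of those many ways, here is a hedged sketch that collects the mean, SD, and number of missing values per group with postfile and then writes the collected results with export excel. It assumes that armcd is numeric, that the local chars is defined as in #1, and that only numeric variables should be summarized; the file name summary_by_arm.xlsx and the names of the posted variables are hypothetical.

        ```stata
        tempname h
        tempfile results

        * one row of results per variable and level of armcd
        postfile `h' str32 variable double(armcd mean sd nmissing) using `results'

        * distinct non-missing values of armcd (ASSUMED numeric)
        quietly levelsof armcd, local(arms)

        foreach v of local chars {
            * skip string variables, which summarize cannot handle usefully
            capture confirm numeric variable `v'
            if _rc continue

            foreach a of local arms {
                * count missing values first, because summarize overwrites r()
                quietly count if missing(`v') & armcd == `a'
                local nmiss = r(N)

                quietly summarize `v' if armcd == `a'
                post `h' ("`v'") (`a') (r(mean)) (r(sd)) (`nmiss')
            }
        }
        postclose `h'

        * load the collected results, export them, and return to the original data
        preserve
        use `results', clear
        export excel using "summary_by_arm.xlsx", firstrow(variables) replace
        restore
        ```

        Observations with missing armcd are left out here because levelsof ignores missing values by default.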


        • #5
          Thanks, Nick. This was very helpful!
