  • Storing mean vector from "mean" command in a loop, what to do with "no observations"

    Hi STATA-list members,

    I am trying to compute a large number of summary statistics (means) and store them in a matrix using Stata 13.0. For the first part of this process, I’m looping over two of my variables (age and birthyear) and storing the means in a vector. This works very well with the “summ” command: not every age/birthyear combination has an observation, and in those cases the scalar results from “summ” are simply stored as missing.

    The problem occurs when I run a second set using sampling weights. Since most convenient commands like tabstat/tabout/summ don’t work with “svy”, I am using the “mean” command. Now I get the error “no observations” for some age/year combinations, and it stops the loop. When I use the “cap” prefix, the loop continues, but it then stores the wrong vector: e(b) still holds the result of the last call that succeeded.

    Here is the code I am using:

    forval i=15/25 { // age
        forval j=1988/2000 { // birthyear
            cap mean inschool if age==`i' & birthyear==`j' [pweight=iweight]
            mat school_`i'_`j'= e(b)
        }
    }

    I would be grateful for any advice on how to correctly automate this. Thank you!


    Best,
    Crystal

  • #2
    Try this:

    Code:
    forval i=15/25 { // age
        forval j=1988/2000 { // birthyear
            quietly count if age == `i' & birthyear == `j'
            if r(N) > 0 {
                mean inschool if age==`i' & birthyear==`j' [pweight=iweight]
                mat school_`i'_`j'= e(b)
            }
            else {
                mat school_`i'_`j' = .
            }
        }
    }
    Note: This will specifically look for situations where there are no observations with age == `i' & birthyear == `j' and will appropriately handle them. If there is some other problem that blocks -mean- from running (say an illegal value of the weight, or all zero weights for a combination of age and birthyear), you will get an error message and it will stop.
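
    A possible refinement, offered as a sketch rather than a tested solution: -mean- also drops observations with a missing value of inschool or of the weight variable, so a cell can have r(N) > 0 from -count- and still produce "no observations". Assuming such cells should also be stored as missing, the guard can count only the observations that -mean- could use:

    Code:
    forval i=15/25 { // age
        forval j=1988/2000 { // birthyear
            // count only observations with non-missing outcome and weight
            quietly count if age == `i' & birthyear == `j' & !missing(inschool, iweight)
            if r(N) > 0 {
                mean inschool if age==`i' & birthyear==`j' [pweight=iweight]
                mat school_`i'_`j'= e(b)
            }
            else {
                mat school_`i'_`j' = .
            }
        }
    }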

    I generally prefer this targeted approach to the use of -capture-, so that if there is some problem in my data that I was not aware of, I discover it before it corrupts my results. Nonetheless, if there are several possible failure modes and it is too cumbersome to enumerate them all, a less discriminating approach is:

    Code:
    forval i=15/25 { // age
        forval j=1988/2000 { // birthyear
            capture mean inschool if age==`i' & birthyear==`j' [pweight=iweight]
            if c(rc) == 0 {
                mat school_`i'_`j'= e(b)
            }
            else {
                mat school_`i'_`j' = .
            }
        }
    }
    This will keep going regardless of the cause of failure of -mean-. If you are confident that -mean- will not fail for any reason that would call for a response other than skipping to the next iteration, then you can use this instead (at your own risk).
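
    If the goal is a single matrix rather than one 1 x 1 matrix per cell, the following is a minimal sketch. The matrix names school and b, and the row/column offsets, are my own assumptions for illustration. Because the matrix is pre-filled with missing values, a failed -mean- simply leaves its cell missing, so a stale e(b) is never copied.

    Code:
    matrix school = J(11, 13, .) // rows: ages 15-25, columns: birthyears 1988-2000
    forval i=15/25 { // age
        forval j=1988/2000 { // birthyear
            local r = `i' - 14   // row index 1..11
            local c = `j' - 1987 // column index 1..13
            capture mean inschool if age==`i' & birthyear==`j' [pweight=iweight]
            if c(rc) == 0 {
                matrix b = e(b) // copy first, then pick out the single estimate
                matrix school[`r', `c'] = b[1,1]
            }
        }
    }
    matrix list school
    The same "at your own risk" caveat applies, since -capture- hides every kind of failure here.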



    • #3
      Hi Clyde,

      Thank you so much for your detailed response. The added lines make sense to me. I am guessing c(rc) is the "capture" command return code, and is zero when there is no failure? This is incredibly useful and adds to my Stata programming knowledge.

      Thank you again.

      Crystal



      • #4
        I am guessing c(rc) is the "capture" command return code, and is zero when there is no failure?
        That is correct. And, I agree, it is incredibly useful.
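
        Building on that, a nonzero return code also tells you which error occurred, so the two approaches in #2 can be combined. This is a sketch, assuming that 2000 is the return code for "no observations" (worth confirming by running the failing -mean- once without -capture- and reading the r() code it reports). It uses _rc, which holds the same value as c(rc):

        Code:
        forval i=15/25 { // age
            forval j=1988/2000 { // birthyear
                capture mean inschool if age==`i' & birthyear==`j' [pweight=iweight]
                if _rc == 0 {
                    mat school_`i'_`j'= e(b)
                }
                else if _rc == 2000 { // assumed: "no observations", store missing
                    mat school_`i'_`j' = .
                }
                else { // any other failure: show the error message and stop
                    error _rc
                }
            }
        }
        This way, empty cells are skipped quietly, while an unexpected problem (for example, an illegal weight) still halts the loop.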
