
  • statsby in combination with cii proportions command

    Hi there
    I have a dataset in the following format, with about 50 rows:
    groupname freq grouptotal
    a 5 100
    b 12 150
    c 6 200
    d 3 90
    e 1 65
    ...........

    I would like to create new variables corresponding to exact lower and upper 95% CI for the proportion (freq/grouptotal) for each row (i.e. by groupname).

    In Stata 12, I was able to create this in a separate file by combining statsby with the immediate command cii

    statsby upper=r(ub) lower=r(lb), by(groupname) saving ('filename', replace): cii grouptotal freq

    In Stata 15, I have updated the syntax to 'cii proportions' but it doesn't work

    statsby upper=r(ub) lower=r(lb), by(groupname) saving (filename, replace): cii proportions grouptotal freq

    variable found where a number expected; perhaps you meant to use ci
    an error occurred when statsby executed cii


    It doesn't work with ci either.
    In the help menus it says only ci works with statsby.

    Please can anyone advise on a solution or an alternative method to do this?

    Many thanks
    Lorna

  • #2
    Lorna, take a look at the -xcipoibin- command here https://github.com/anddis/xcipoibin (disclaimer: I'm the author)

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 groupname byte freq int grouptotal
    "a"  5 100
    "b" 12 150
    "c"  6 200
    "d"  3  90
    "e"  1  65
    end
    
    
    . xcipoibin freq grouptotal , binomial
    
    . list
    
         +----------------------------------------------------------------+
         | groupn~e   freq   groupt~l   _pointe~e    _lowerCI    _upperCI |
         |----------------------------------------------------------------|
      1. |        a      5        100         .05   .01643188   .11283491 |
      2. |        b     12        150         .08   .04202026   .13557389 |
      3. |        c      6        200         .03   .01108746   .06415063 |
      4. |        d      3         90   .03333333    .0069276   .09433609 |
      5. |        e      1         65   .01538462   .00038943   .08276309 |
         +----------------------------------------------------------------+



    • #3
      This also yields to a loop:

      Code:
      clear 
      input str1 groupname    freq    grouptotal
      a    5    100
      b    12    150
      c    6    200
      d    3    90
      e    1    65
      end 
      
      gen p = . 
      gen l = . 
      gen u = . 
      
      quietly forval i = 1/`=_N' { 
          cii proportions `= grouptotal[`i']' `=freq[`i']' 
          replace p = r(proportion) in `i' 
          replace u = r(ub) in `i' 
          replace l = r(lb) in `i' 
      } 
      
      list 
      
           +-------------------------------------------------------------+
           | groupn~e   freq   groupt~l          p          l          u |
           |-------------------------------------------------------------|
        1. |        a      5        100        .05   .0164319   .1128349 |
        2. |        b     12        150        .08   .0420203   .1355739 |
        3. |        c      6        200        .03   .0110875   .0641506 |
        4. |        d      3         90   .0333333   .0069276   .0943361 |
        5. |        e      1         65   .0153846   .0003894   .0827631 |
           +-------------------------------------------------------------+
      See also FAQ with related ideas:

      FAQ . . . . . . . . . . . . . Accumulating results from immediate commands
      . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
      10/15 How do I accumulate the results of immediate commands?

      https://www.stata.com/support/faqs/d...iate-commands/
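
A loop-free alternative, not taken from the posts in this thread: the exact (Clopper-Pearson) limits that cii proportions reports by default can, as far as I understand, also be computed directly with the built-in functions invbinomialtail() and invbinomial(). The sketch below assumes the same variable names as above; the names lower and upper are only illustrative. Rows with freq equal to 0 or to grouptotal are edge cases that may return missing and would need separate handling.

```stata
clear
input str1 groupname freq grouptotal
a 5 100
b 12 150
c 6 200
d 3 90
e 1 65
end

* two-sided 95% interval, so 2.5% in each tail
local alpha = 0.05

* lower limit: success probability at which P(X >= freq) equals alpha/2
gen double lower = invbinomialtail(grouptotal, freq, `alpha'/2)

* upper limit: success probability at which P(X <= freq) equals alpha/2
gen double upper = invbinomial(grouptotal, freq, `alpha'/2)

gen double p = freq / grouptotal

list
```

Changing the local alpha (for example to 0.10) gives other confidence levels without touching the rest of the code.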



      • #4
        Alternatively, use version control to implement your old code that worked.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str1 group byte freq int grouptotal
        "a"  5 100
        "b" 12 150
        "c"  6 200
        "d"  3  90
        "e"  1  65
        end
        
        version 12: statsby N=r(N) p=r(mean) lower=r(lb) upper=r(ub), ///
        by(group) clear: cii grouptotal freq
        generate freq = N*p
        list group N freq p-upper
        OUTPUT:
        Code:
        . list group N freq p-upper
        
             +-----------------------------------------------------+
             | group     N   freq          p      lower      upper |
             |-----------------------------------------------------|
          1. |     a   100      5        .05   .0164319   .1128349 |
          2. |     b   150     12        .08   .0420203   .1355739 |
          3. |     c   200      6        .03   .0110875   .0641506 |
          4. |     d    90      3   .0333333   .0069276   .0943361 |
          5. |     e    65      1   .0153846   .0003894   .0827631 |
             +-----------------------------------------------------+

        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 18.5 (Windows)



        • #5
          Hello Andrea, Nick and Bruce

          Could you please provide code that works if one is looking for a 90% or 97.5% CI, for example?

          Thanks



          • #6
            That's just a matter of modifying the cii call with a level() option.
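
As a minimal sketch of that change, here is the loop from #3 with level(90) added to the cii call; the data and variable names are the ones used earlier in the thread. For a 97.5% interval, level(97.5) should work in the same way.

```stata
clear
input str1 groupname freq grouptotal
a 5 100
b 12 150
c 6 200
d 3 90
e 1 65
end

gen p = .
gen l = .
gen u = .

* same loop as in #3; only the level() option is new
quietly forval i = 1/`=_N' {
    cii proportions `=grouptotal[`i']' `=freq[`i']', level(90)
    replace p = r(proportion) in `i'
    replace l = r(lb) in `i'
    replace u = r(ub) in `i'
}

list
```

The same option can be appended to the cii call in the version 12 statsby approach from #4, and to ci proportions when working from variables.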



            • #7
              Thanks Nick



              • #8
                Dear all
                Sorry for the delay in my response. Thank you very much for your answers. All three suggestions worked perfectly and will be extremely helpful for me moving forward!
                Lorna



                • #9
                  Hello everyone,

                  I ran into a similar problem. I am new to loops, so please forgive me if I made any obvious mistakes.
                  Basically, I want to do the same as Lorna in the example above, but with two group variables.
                  I calculated my confidence intervals with:
                  Code:
                  by treatment: ci proportions ly0-ly19
                  .
                  Both the treatment and the lossyear variables (wide format) are binary variables.
                  I tried creating the new variables with
                  Code:
                   statsby, by(treatment): ci proportions ly0-ly19
                  However, I did not manage to reproduce the values I got from the ci command.
                  Including two variables in by() led to wrong results, with the proportion, se and lb all being 0.

                  Next I tried Nick's suggestion, but I struggled to implement it for the different years (lossyear 0-19).
                  Code:
                  . quietly forval i = 1/`=_N' {
                       by treatment: ci proportions `= ly0 [`i']' `= ly1 [`i']' `= ly2 [`i']' `= ly3 [`i']' `= ly4 [`i']' `= ly5 [`i']' `= ly6 [`i']' `= ly7
                  >  [`i']' `= ly8 [`i']' `= ly9 [`i']' `= ly10 [`i']' `= ly11 [`i']' `= ly12 [`i']' `= ly13 [`i']' `= ly14 [`i']' `= ly15 [`i']' `= ly1
                  > 6 [`i']' `= ly17 [`i']' `= ly18 [`i']'`= ly19 [`i']'
                       replace p = r(proportion) in `i'
                       replace u = r(ub) in `i'
                       replace l = r(lb) in `i'
                   }
                  Here I ran into the error:
                  number found where a variable expected; perhaps you meant to use cii
                  r(198);
                  .

                  I have been trying for quite a while now, but cannot get it to work.
                  My data are currently in wide format.
                  My final objective is to have a line graph with the proportions of both treatment groups over the different years with the confidence intervals plotted around.
                  Hope someone can help me.
                  Thanks!

                  Best,
                  David



                  • #10
                    #9 I don't think you'll get statsby working like that. You want a data structure that you aren't asking for and that statsby can't provide. That's the stuff of small nightmares, except that a reshape can get you where I think you want to be.

                    My solution in #3 was aimed at a quite different problem.

                    Here is a token example with 10 observations, 4 binary outcomes and 2 treatments. It should generalise to larger numbers of any of those. Use your own variable names where different.

                    Code:
                    clear
                    set obs 10
                    set seed 2803 
                    
                    forval j = 1/4 {
                        gen y`j' = runiform() > `j' * 0.2
                    }
                    
                    gen group = _n > 5 
                    
                    list, sepby(group)
                    
                    * start about here
                    * if you have an identifier, use it instead 
                    
                    gen id = _n 
                    reshape long y, i(id) j(which)
                    
                    statsby , by(group which) clear : ci proportions y   
                    
                    sort which group
                    
                    list , sepby(which)
                    
                         +------------------------------------------------------------------------------+
                         | group   which   mean   N   propor~n         se         lb         ub   level |
                         |------------------------------------------------------------------------------|
                      1. |     0       1     .8   5         .8   .1788854   .2835821   .9949492      95 |
                      2. |     1       1     .6   5         .6    .219089   .1466328   .9472551      95 |
                         |------------------------------------------------------------------------------|
                      3. |     0       2     .4   5         .4    .219089    .052745   .8533672      95 |
                      4. |     1       2     .2   5         .2   .1788854   .0050508   .7164179      95 |
                         |------------------------------------------------------------------------------|
                      5. |     0       3     .6   5         .6    .219089   .1466328   .9472551      95 |
                      6. |     1       3      0   5          0          0          0   .5218238      95 |
                         |------------------------------------------------------------------------------|
                      7. |     0       4     .2   5         .2   .1788854   .0050508   .7164179      95 |
                      8. |     1       4     .4   5         .4    .219089    .052745   .8533672      95 |
                         +------------------------------------------------------------------------------+
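
For the line graph asked about in #9, one possible sketch (an assumption about the intended plot, not something tested on the real data) is to overlay capped spikes for the intervals on connected lines for the proportions, once per group. It repeats the token data from above so that it runs on its own; the colours and legend labels are arbitrary choices.

```stata
clear
set obs 10
set seed 2803

forval j = 1/4 {
    gen y`j' = runiform() > `j' * 0.2
}

gen group = _n > 5
gen id = _n
reshape long y, i(id) j(which)

* one row per group and outcome, holding proportion, lb and ub
statsby, by(group which) clear : ci proportions y

* intervals as capped spikes, proportions as connected lines, per group
twoway (rcap lb ub which if group == 0, lcolor(blue))                          ///
       (connected proportion which if group == 0, sort lcolor(blue) mcolor(blue)) ///
       (rcap lb ub which if group == 1, lcolor(red))                           ///
       (connected proportion which if group == 1, sort lcolor(red) mcolor(red)),  ///
       legend(order(2 "group 0" 4 "group 1"))                                  ///
       ytitle("Proportion") xtitle("which")
```

With many time points, rarea in place of rcap may be easier to read, and a small horizontal offset between groups avoids overlapping spikes.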

