statsby in combination with cii proportions command

Lorna Hazell

Join Date: Jan 2018

Posts: 2
#1


24 Jan 2018, 07:48

Hi there
I have a dataset of about 50 rows in the following format:

groupname   freq   grouptotal
a              5          100
b             12          150
c              6          200
d              3           90
e              1           65
...........

I would like to create new variables corresponding to exact lower and upper 95% CI for the proportion (freq/grouptotal) for each row (i.e. by groupname).

In Stata 12, I was able to create these in a separate file by combining statsby with the immediate command cii:

statsby upper=r(ub) lower=r(lb), by(groupname) saving ('filename', replace): cii grouptotal freq

In Stata 15, I have updated the syntax to 'cii proportions', but it no longer works:

statsby upper=r(ub) lower=r(lb), by(groupname) saving (filename, replace): cii proportions grouptotal freq

variable found where a number expected; perhaps you meant to use ci
an error occurred when statsby executed cii

It doesn't work with ci either.
In the help menus it says only ci works with statsby.

Please can anyone advise on a solution or an alternative method to do this?

Many thanks
Lorna

Andrea Discacciati

Join Date: Feb 2016
Posts: 194
#2

24 Jan 2018, 08:01

Lorna, take a look at the -xcipoibin- command here https://github.com/anddis/xcipoibin (disclaimer: I'm the author)

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 groupname byte freq int grouptotal
"a"  5 100
"b" 12 150
"c"  6 200
"d"  3  90
"e"  1  65
end


. xcipoibin freq grouptotal , binomial

. list

     +----------------------------------------------------------------+
     | groupn~e   freq   groupt~l   _pointe~e    _lowerCI    _upperCI |
     |----------------------------------------------------------------|
  1. |        a      5        100         .05   .01643188   .11283491 |
  2. |        b     12        150         .08   .04202026   .13557389 |
  3. |        c      6        200         .03   .01108746   .06415063 |
  4. |        d      3         90   .03333333    .0069276   .09433609 |
  5. |        e      1         65   .01538462   .00038943   .08276309 |
     +----------------------------------------------------------------+


Nick Cox

Join Date: Mar 2014
Posts: 35698
#3

24 Jan 2018, 10:54

This also yields to a loop:

Code:

clear 
input str1 groupname    freq    grouptotal
a    5    100
b    12    150
c    6    200
d    3    90
e    1    65
end 

gen p = . 
gen l = . 
gen u = . 

quietly forval i = 1/`=_N' { 
    cii proportions `= grouptotal[`i']' `=freq[`i']' 
    replace p = r(proportion) in `i' 
    replace u = r(ub) in `i' 
    replace l = r(lb) in `i' 
} 

list 

     +-------------------------------------------------------------+
     | groupn~e   freq   groupt~l          p          l          u |
     |-------------------------------------------------------------|
  1. |        a      5        100        .05   .0164319   .1128349 |
  2. |        b     12        150        .08   .0420203   .1355739 |
  3. |        c      6        200        .03   .0110875   .0641506 |
  4. |        d      3         90   .0333333   .0069276   .0943361 |
  5. |        e      1         65   .0153846   .0003894   .0827631 |
     +-------------------------------------------------------------+

See also FAQ with related ideas:

FAQ . . . . . . . . . . . . . Accumulating results from immediate commands
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
10/15 How do I accumulate the results of immediate commands?

https://www.stata.com/support/faqs/d...iate-commands/


Bruce Weaver

Join Date: May 2014
Posts: 1133
#4

24 Jan 2018, 16:10

Alternatively, use version control to implement your old code that worked.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 group byte freq int grouptotal
"a"  5 100
"b" 12 150
"c"  6 200
"d"  3  90
"e"  1  65
end

version 12: statsby N=r(N) p=r(mean) lower=r(lb) upper=r(ub), ///
by(group) clear: cii grouptotal freq
generate freq = N*p
list group N freq p-upper

OUTPUT:

Code:

. list group N freq p-upper

     +-----------------------------------------------------+
     | group     N   freq          p      lower      upper |
     |-----------------------------------------------------|
  1. |     a   100      5        .05   .0164319   .1128349 |
  2. |     b   150     12        .08   .0420203   .1355739 |
  3. |     c   200      6        .03   .0110875   .0641506 |
  4. |     d    90      3   .0333333   .0069276   .0943361 |
  5. |     e    65      1   .0153846   .0003894   .0827631 |
     +-----------------------------------------------------+

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 19.5 (Windows)


Madu Abuchi

Join Date: Sep 2017

Posts: 143
#5

09 Feb 2018, 04:57

Hello Andrea, Nick and Bruce

Please could you provide code that works if one wants, for example, a 90% or 97.5% CI?

Thanks

Nick Cox

Join Date: Mar 2014

Posts: 35698
#6

09 Feb 2018, 05:08

That's just a matter of modifying the cii call with a level() option.
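
As a minimal sketch of what that looks like, here is the loop from #3 with a 90% interval. It assumes the same example data and variable names (freq, grouptotal) used earlier in the thread; replace level(90) with level(97.5) for a 97.5% interval.

```stata
* Sketch only: the loop from #3 with level() added to the cii call
clear
input str1 groupname freq grouptotal
a  5 100
b 12 150
c  6 200
d  3  90
e  1  65
end

gen p = .
gen l = .
gen u = .

quietly forval i = 1/`=_N' {
    // exact (Clopper-Pearson) interval is the default for cii proportions
    cii proportions `=grouptotal[`i']' `=freq[`i']', level(90)
    replace p = r(proportion) in `i'
    replace l = r(lb) in `i'
    replace u = r(ub) in `i'
}

list
```

The limits should come out narrower than the 95% limits listed in #3, since a lower confidence level gives a shorter interval.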

Madu Abuchi

Join Date: Sep 2017

Posts: 143
#7

09 Feb 2018, 05:16

Thanks Nick

Lorna Hazell

Join Date: Jan 2018

Posts: 2
#8

07 Jan 2019, 02:33

Dear all
Sorry for the delay in my response. Thank you very much for your answers. All three suggestions worked perfectly and will be extremely helpful for me moving forward!
Lorna

David Schneider

Join Date: Mar 2021

Posts: 10
#9

07 Apr 2021, 08:02

Hello everyone,

I ran into a similar problem. I am new to forval loops, so please forgive any obvious mistakes.
Basically, I want to do the same as Lorna in the example above, but with two grouping variables.
I calculated my confidence intervals with:

Code:

by treatment: ci proportions ly0-ly19

Both the treatment and the lossyear variables (wide format) are binary variables.
I tried creating the new variables with:

Code:

statsby, by(treatment): ci proportions ly0-ly19

However, this did not reproduce the values I got from the ci command.
Including two variables in by() led to wrong results, with the proportion se and lb being 0.

Next I tried Nick's suggestion, but I struggled to implement it for the different years (lossyear 0-19).

Code:

quietly forval i = 1/`=_N' {
    by treatment: ci proportions `= ly0[`i']' `= ly1[`i']' `= ly2[`i']' `= ly3[`i']' ///
        `= ly4[`i']' `= ly5[`i']' `= ly6[`i']' `= ly7[`i']' `= ly8[`i']' `= ly9[`i']' ///
        `= ly10[`i']' `= ly11[`i']' `= ly12[`i']' `= ly13[`i']' `= ly14[`i']' ///
        `= ly15[`i']' `= ly16[`i']' `= ly17[`i']' `= ly18[`i']' `= ly19[`i']'
    replace p = r(proportion) in `i'
    replace u = r(ub) in `i'
    replace l = r(lb) in `i'
}

Here I ran into the error:

number found where a variable expected; perhaps you meant to use cii
r(198);


I have tried for quite a while now, but cannot get it to work.
My data are currently in wide format.
My final objective is a line graph showing the proportions for both treatment groups over the different years, with the confidence intervals plotted around them.
I hope someone can help me.
Thanks!

Best,
David

Nick Cox

Join Date: Mar 2014
Posts: 35698

#10

07 Apr 2021, 08:31

#9 I don't think you'll get statsby working like that. You want a data structure that you aren't asking for and that statsby can't provide. That's the stuff of small nightmares, except that a reshape can get you where I think you want to be.

My solution in #3 was aimed at a quite different problem.

Here is a token example with 10 observations, 4 binary outcomes and 2 treatments. It should generalise to larger numbers of any of those. Use your own variable names where different.

Code:

clear
set obs 10
set seed 2803 

forval j = 1/4 {
    gen y`j' = runiform() > `j' * 0.2
}

gen group = _n > 5 

list, sepby(group)

* start about here
* if you have an identifier, use it instead 

gen id = _n 
reshape long y, i(id) j(which)

statsby , by(group which) clear : ci proportions y   

sort which group

list , sepby(which)

     +------------------------------------------------------------------------------+
     | group   which   mean   N   propor~n         se         lb         ub   level |
     |------------------------------------------------------------------------------|
  1. |     0       1     .8   5         .8   .1788854   .2835821   .9949492      95 |
  2. |     1       1     .6   5         .6    .219089   .1466328   .9472551      95 |
     |------------------------------------------------------------------------------|
  3. |     0       2     .4   5         .4    .219089    .052745   .8533672      95 |
  4. |     1       2     .2   5         .2   .1788854   .0050508   .7164179      95 |
     |------------------------------------------------------------------------------|
  5. |     0       3     .6   5         .6    .219089   .1466328   .9472551      95 |
  6. |     1       3      0   5          0          0          0   .5218238      95 |
     |------------------------------------------------------------------------------|
  7. |     0       4     .2   5         .2   .1788854   .0050508   .7164179      95 |
  8. |     1       4     .4   5         .4    .219089    .052745   .8533672      95 |
     +------------------------------------------------------------------------------+
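
For the line graph asked about in #9, one possible next step is to plot the statsby results directly. The sketch below rebuilds the token example so that it runs on its own, then overlays capped spikes (rcap) for the intervals on connected lines for the proportions. The colours, titles and legend labels are illustrative choices, not anything prescribed in the thread; with real data, group and which would be the treatment and year variables.

```stata
* Sketch only: rebuild the token example, then graph proportions with CIs
clear
set obs 10
set seed 2803

forval j = 1/4 {
    gen y`j' = runiform() > `j' * 0.2
}

gen group = _n > 5
gen id = _n
reshape long y, i(id) j(which)

statsby , by(group which) clear : ci proportions y

// one rcap + connected pair per group; which plays the role of year
twoway (rcap lb ub which if group == 0, lcolor(navy))                          ///
       (connected proportion which if group == 0, sort lcolor(navy) mcolor(navy)) ///
       (rcap lb ub which if group == 1, lcolor(maroon))                        ///
       (connected proportion which if group == 1, sort lcolor(maroon) mcolor(maroon)), ///
       legend(order(2 "group 0" 4 "group 1"))                                  ///
       ytitle("Proportion") xtitle("which")
```

Because both groups share the same x values, the intervals overlap on the graph; shifting one group's x variable slightly before plotting is a common way to separate them.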
