Counting Time to event

Niki Minckas

Join Date: Jul 2019

Posts: 12
#1

Counting Time to event

19 Jul 2019, 14:33

Dear all, I could really use your help on this one!
I have a dataset with data on 2000 babies (baby_id). Each baby was followed from birth for a month, twice a day, so each baby should have around 60 time points of data (time), although many of them don't. At each of those time points there is a variable for "exclusive breastfeeding" (ex_bf). I need to calculate the time to exclusive breastfeeding, meaning the moment at which the baby passes from not being exclusively breastfed (0) to being exclusively breastfed (1).

The data is stored in the long format.

id time ex_bf
1 0 .
1 1 0
1 4 1
2 2 0
3 1 0
3 5 0
3 7 .
4 0 0
4 1 0
4 2 1
4 3 1
5 2 .
6 0 0
6 1 8
6 2 9

I've been trying to treat time categorically and check that ex_bf[_n-1]==0, then [_n-2], and so on for all the previous times, but I gave up because there has to be a better way to go about it.

I would really appreciate your input since I need to submit this analysis ASAP.

Thank you!
Tags: None
Jordan Sydenham

Join Date: Jul 2019

Posts: 32
#2

19 Jul 2019, 15:17

You'll increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex.
Without looking at the data, I can tell you that if you have a time variable, the user-written program tsspell should prove helpful.
Try the following code:
ssc install tsspell
help tsspell
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35438
#3

20 Jul 2019, 03:05

I understand this as wanting the first time at which each baby is exclusively breastfed. For guidance, see

https://www.stata.com/support/faqs/d...t-occurrences/

https://www.stata.com/support/faqs/d...ing-last-date/

https://www.stata-journal.com/articl...article=dm0055

Here's some technique.

Code:

clear 
input baby_id time ex_bf
1 0 .
1 1 0
1 4 1
2 2 0
3 1 0
3 5 0
3 7 .
4 0 0
4 1 0
4 2 1
4 3 1
5 2 .
6 0 0
6 1 8
6 2 9
end 

egen first = min(cond(ex_bf == 1, time, .)), by(baby_id)

list, sepby(baby_id) 

     +--------------------------------+
     | baby_id   time   ex_bf   first |
     |--------------------------------|
  1. |       1      0       .       4 |
  2. |       1      1       0       4 |
  3. |       1      4       1       4 |
     |--------------------------------|
  4. |       2      2       0       . |
     |--------------------------------|
  5. |       3      1       0       . |
  6. |       3      5       0       . |
  7. |       3      7       .       . |
     |--------------------------------|
  8. |       4      0       0       2 |
  9. |       4      1       0       2 |
 10. |       4      2       1       2 |
 11. |       4      3       1       2 |
     |--------------------------------|
 12. |       5      2       .       . |
     |--------------------------------|
 13. |       6      0       0       . |
 14. |       6      1       8       . |
 15. |       6      2       9       . |
     +--------------------------------+
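
The variable first above is the earliest time with ex_bf == 1, whatever came before it. If the strict reading of #1 is wanted (an observed change from 0 to 1 between consecutive records for the same baby), a minimal sketch follows. The variable names to_ebf and first_switch are made up for illustration, and the sketch assumes that "previous" means the previous recorded observation, even if there is a gap in time.

```stata
clear
input baby_id time ex_bf
1 0 .
1 1 0
1 4 1
2 2 0
3 1 0
3 5 0
3 7 .
4 0 0
4 1 0
4 2 1
4 3 1
5 2 .
6 0 0
6 1 8
6 2 9
end

* Within each baby, in time order, flag records where ex_bf is 1 and the
* previous record for the same baby was 0. For the first record of a baby
* ex_bf[_n-1] is missing, so the flag is 0 there.
bysort baby_id (time): gen byte to_ebf = ex_bf == 1 & ex_bf[_n-1] == 0

* Earliest flagged time per baby; missing if no 0 -> 1 change is observed.
egen first_switch = min(cond(to_ebf, time, .)), by(baby_id)

list, sepby(baby_id)
```

In this example first_switch agrees with first. The two would differ for a baby whose earliest record already has ex_bf == 1, or whose first 1 follows a missing value.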

Comment

Niki Minckas

Join Date: Jul 2019

Posts: 12
#4

21 Jul 2019, 03:29

Thank you so much, the function in the egen command worked great!

I now have a different puzzle: I need the proportion of babies (baby_id) with ex_bf=1 among all the babies with data on ex_bf, by time.
The first time categories (time=0/1/2/3), in which each baby_id has only one observation, are fine. However, the last category is time>=4, and for that I am doing the following:

egen tag = tag(baby_id)
tab ex_bf if time>=4 & tag==1

but it gives me very low counts that don't make much sense. It is mostly the denominator that is very low.

Thanks again!
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35438
#5

21 Jul 2019, 04:26

tag() won't do what you want here. It's for selecting one observation out of all in a group when all have the same value on a key variable, so that it is enough to see one. If you look at a sample of data, you'll see that the tagged observations aren't necessarily those you want tabulated.

I don't understand what you do want tabulated. If it is just the times of first breastfeeding, consider

Code:

tab time if time == first

or more simply

Code:

tab first if tag
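
To make the point about tag() concrete, here is a small sketch on the example data from #3. The variable name late is made up for illustration. The idea is to build an indicator that is constant within baby_id first, and only then use the tag to pick one observation per baby.

```stata
clear
input baby_id time ex_bf
1 0 .
1 1 0
1 4 1
2 2 0
3 1 0
3 5 0
3 7 .
4 0 0
4 1 0
4 2 1
4 3 1
5 2 .
6 0 0
6 1 8
6 2 9
end

* tag() marks one observation per baby, chosen without regard to time or ex_bf.
egen tag = tag(baby_id)

* This count depends on which observation happened to be tagged, so it can
* miss babies who do have records at time >= 4.
count if time >= 4 & tag

* An indicator that is constant within baby: 1 if the baby has any
* non-missing ex_bf record at time >= 4, 0 otherwise.
egen late = max(time >= 4 & !missing(ex_bf)), by(baby_id)

* Now any one observation per baby is as good as another.
count if late & tag
```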
Comment
Niki Minckas

Join Date: Jul 2019

Posts: 12
#6

21 Jul 2019, 06:07

Sorry, my explanation was very poor. Let me try again:

I need to count the frequency of babies that initiated exclusive breastfeeding at each time (e.g. number of babies with exclusive breastfeeding at time 1/ total number of babies with data on variable ex_bf at time 1).

Time to being fully breastfed, n (%)

< 12 h (time 0)

12-24 h (time 1)

24-48 h (time 2)

48-72 h (time 3)

>= 72 h (time >= 4)

The code that you gave me initially worked perfectly to find the numerator. However, the denominator has to be all the babies that have data on ex_bf at each time (either 0 or 1). In the first four categories (time 0 to time 3) it is easy, because each baby has only one observation. But in the last category (time >= 4), if I just do:
tab ex_bf if time>=4
it counts observations rather than babies, and because I am aggregating over time, I have more than one observation per baby. That is why I was trying to use tag(), but you made a good point: ex_bf is not constant within baby_id.
Therefore, my question is: how can I count only one observation of ex_bf per baby_id?

Thank you very much!
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35438
#7

22 Jul 2019, 02:50

Don't throw more words at the thread. Extend your data example and show what variables you want to see in the results. Then it will be easier to show code to get them. Use CODE delimiters to show your code and data as in #3 and #5.
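
Pending a fuller data example, here is a sketch under one possible reading of #6, not a confirmed answer. It assumes that the denominator for a time category is the number of babies with at least one record of ex_bf equal to 0 or 1 in that category, that the numerator is the number of babies whose first ex_bf == 1 falls in that category, and that each baby has at most one record per time. The names timecat, denom, numer and pct are made up for illustration.

```stata
clear
input baby_id time ex_bf
1 0 .
1 1 0
1 4 1
2 2 0
3 1 0
3 5 0
3 7 .
4 0 0
4 1 0
4 2 1
4 3 1
5 2 .
6 0 0
6 1 8
6 2 9
end

* Table categories: 0, 1, 2, 3, and 4 standing for time >= 4.
* min() ignores missing arguments, hence the guard on time.
gen byte timecat = min(time, 4) if !missing(time)

* Denominator: one tagged record per baby and category, considering only
* records where ex_bf is 0 or 1. Untagged records get 0.
egen denom = tag(baby_id timecat) if inlist(ex_bf, 0, 1)

* Numerator: the record at which the baby first has ex_bf == 1.
egen first = min(cond(ex_bf == 1, time, .)), by(baby_id)
gen byte numer = time == first

* Sum both indicators within category and compute the percentage.
collapse (sum) numer denom, by(timecat)
gen pct = 100 * numer / denom
list, noobs
```

Whether babies who reached exclusive breastfeeding earlier should stay in the denominators of later categories is a substantive choice that this sketch does not settle.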
Comment
