AUC response feature

Anders Grejs

Join Date: Dec 2015

Posts: 13
#1

AUC response feature

28 Dec 2015, 06:12

Dear Forum

I am having problems choosing the right method for calculating the AUC for my repetitive blood samples in two intervention groups and afterwards testing the difference between the two interventions.

I have a dataset with 160 records.
These patients have undergone intervention A or B.
I have blood sample Results from "admission to hospital", "from reaching their target intervention in the ward", after 6, 12, 18, 24, 30, 36, 42, 48, 72 and 96 hours after "reaching their target intervention in the ward" (of cause a few of the samples are accidentally missing). The blood samples have been drawn up to +/- 2 hours from protocolled time.
I have my SampleTime as double but I have also calculated timestamps with reference to "reaching their target intervention in the ward" which of cause results in some negative time values.

I have tried the PK functions (pkexamine and pksumm) but keep on getting errors like:
-time variables takes on repeated values
-.....fit() is greater than the number of obs. with __000002==6 & __000001
insufficient observations

and I have tried:
integ Result(interventionA) time
integ Result(interventionB) time

twoway (line Result(interventionA) time)(line Result(interventionB) time)(bar ub(interventionA) lb(interventionA) time, black)(bar ub(interventionB) lb(interventionB) time, black), ///
title("xxxx")

This solution gives me a negative AUC in interventionA (as I am using the calculated timestamp as explained in the top)

and I have tried:
sort record_id time
generate area=(Result+Result[_n+1])*(tid1[_n+1]-tid1)/2 if record_id==record_id[_n+1]
collapse (sum) area = area (first) intervention=intervention, by(record_id)
ttest area, by (intervention)

The two last ones run fine but I am in doubt if either of these are the right method???

Could someone help me step by step.
In advance thx a lot.

BR
Anders
Tags: None
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

28 Dec 2015, 07:02

Hello Anders,

Welcome to the Stata Forum. There has been some discussion on a similar topic here:http://www.statalist.org/forums/foru...eated-measures

Hopefully that helps.

Best,

Marcos

Best regards,

Marcos
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29963
#3

28 Dec 2015, 08:48

Even though you're using a time variable that takes on negative values, the result of -integ- should still come out positive if the variable Result itself is always positive and if the data are sorted correctly. Also, as you have several different patients, you need to do your integration -by- patient id. Try this:

Code:

by patient_id time, sort: integ result time, gen(auc)

You should get correct results for each patient with that, assuming that intervention (A vs B) is a separate variable and that there is a single Result variable used to hold the results for each group. (If you have a ResultA variable and a ResultB variable, one of them always being missing for any patient due to group assignment, then you need to do this separately for each--but that's not a very good layout for data analysis anyway, and I would advise you to -reshape long- instead.)
Comment

Roman Mostazir

Join Date: Apr 2014
Posts: 872

29 Dec 2015, 13:03

Hi Andres,

As Clyde suggested, you need integral function to achieve AUC. Please note this concept is different than the discussion thread refferred by Marcos (Reple#2). In fact the terminologies confuse many who are not in this field and no doubt Marcos is just being victimized of confusing jargons. However, I will quickly show how you can achieve this with your effort using 'pkexamine'. Note Clyde's code will do the same except, I show you with trapezoidal rule (reason explained below) while Clyde's one is with cubic spline. But I think Clyde's code needs correction in sorting. I don't think it should be sorted by both id and time. I think it should only be sorted only with id as you essentially want AUC for each id as a fuction of time.

I am not sure if you have baseline and follow up meassure, but I will elaborate so that you can fit in your purpose.

Senario: A hormone was meassured between two groups (n=58) Intervention and TAU (Treatment as usual) at baseline (before intervention) and post-intervention at 6-month and 12-months. Hormonal response was measured at half an hour interval at each visit (0m-30m-60m-90m-120m). The hypothetical data looks like below:

Code:

//Hypothetical data:

 list id time rand base six twelve in 163/198,clean

       id   time           rand    base     six   twelve  
169.   29      0   Intervention   549.5     866    631.5  
171.   29     30   Intervention     750    1010      882  
172.   29     60   Intervention    1040    1090      849  
173.   29     90   Intervention    1200    1030     1040  
174.   29    120   Intervention    1430    1070      809  
175.   30      0   Intervention   248.5       .        .  
177.   30     30   Intervention     341       .        .  
178.   30     60   Intervention     359       .        .  
179.   30     90   Intervention     374       .        .  
180.   30    120   Intervention     433       .        .  
181.   31      0            TAU     248   370.5      287  
183.   31     30            TAU     421     655      428  
184.   31     60            TAU     511     673      499  
185.   31     90            TAU     616     624      431  
186.   31    120            TAU     638     754      430  

      
===========================xx====================

// Calculate AUC with 'pkexamine' and save as variable :

foreach var of varlist base six twelve {
    gen `var'auc=.
        forval i=1/58 {
            capture pkexamine time `var' if id==`i', t  //trapezoidal rule
            capture replace `var'auc=(`r(auc)') if id==`i'
     }
}

Now the dataset has three variables with total AUC for each participant at each visit. Note 'capture' is used because Stata stops calculation if there is missing values, therefore, 'capture' forces the ongoing calculation. Also please note that the default for AUC calculation is cubic spline whereas this was done with 'trapezodal rule'. The literature in this field suggests if you do not have good number of data points (as here in this case), trapezoidal rule is less biased than cubic spline. You need to check in your field which is preferred. If you don't use 't', cubic spline by default will be used. Now the dataset looks like this with the new variable having total AUC for each patient:

Code:

       id   time           rand    base    baseauc     six     sixauc   time   twelve   twelveauc  
169.   29      0   Intervention   549.5   120326.3     866     120450      0    631.5   103781.3  
171.   29     30   Intervention     750   120326.3    1010     120450     30      882   103781.3  
172.   29     60   Intervention    1040   120326.3    1090     120450     60      849   103781.3  
173.   29     90   Intervention    1200   120326.3    1030     120450     90     1040   103781.3  
174.   29    120   Intervention    1430   120326.3    1070     120450    120      809   103781.3  
181.   31      0            TAU     248    60202.5   370.5   76151.25      0      287    51247.5  
183.   31     30            TAU     421    60202.5     655   76151.25     30      428    51247.5  
184.   31     60            TAU     511    60202.5     673   76151.25     60      499    51247.5  
185.   31     90            TAU     616    60202.5     624   76151.25     90      431    51247.5  
186.   31    120            TAU     638    60202.5     754   76151.25    120      430    51247.5

You can see the overall AUC for each patient is repeated as we only saved the overall AUC for each visit. You basically need one row per id and plot them. See below:

Code:

egen pickone = tag(id) //pice one row per id
drop if pickone!=1
drop pickone base six twelve time  // we don't need them for plotting
save test2.dta,replace // save the dataset with AUC only

use test2.dta,clear

ren (baseauc sixauc twelveauc) (auc1 auc6 auc12)
reshape long auc, i(id) j(time) //reshape in long format for plotting

//Plot them:

lgraph auc time rand, ///
xlab(1 "Baseline" 6 "6-month" 12 "12-month") ///
ytitle("Mean AUC") ///
xtitle("Visit")

1 Photo

Roman

Comment

Anders Grejs

Join Date: Dec 2015

Posts: 13
#5

30 Dec 2015, 07:03

Thank you all for the very warm welcome to this forum. All of your comments have been useful but I have not yet reached my final goal.

Clyde Schechter : My data are LONG format.
I have attached an anonymised data example with invented data but the true variables to visualise. (Grp is the intervention)

However a modified command of the provided code in #2:

by record_id, sort: integ Result time, gen(auc)

was succesfull (I left out sorting for time as proposed by Roman Mostazir).
This time I did´t get any negative AUC´s.

Could this be modified further to fit the "trapezoidal rule" as proposed by *Roman and could you help me graph the data as the major problem is that the SampleTime/time are not always exactly the same between each record_id (drawing blood samples had to fit the "rounds", so the time can vary for up to 4 hours).
If we should be very strict the protocolled sampling should have been "Arrival, 6, 12, 18, 24, 30, 36, 42, 48, 72 and 96 hours. In addition some patients unfortunately "leave" the study before all blood samples have been collected.

Roman Mostazir :
I tried to reshape to wide format but unfortunately the AUC was only generated as missing.
I have written down my exact command (inclusive the reshaping)
///reshaping
sort record_id time Result
by record_id: gen dxno = _n
drop SampleTime T0
order record_id dxno time Grp Result
reshape wide Result time, i(record_id) j(dxno)
order record_id Grp Result*
list record_id Grp Result*, clean

///pkeamine
foreach var of varlist Result* {
gen `var'auc=.
forvalues i = 1/150 {
capture pkexamine time `var' if record_id==`i', t
capture replace `var'auc=(`r(auc)') if record_id==`i'
}
}

The graph provided was exactly what I would like as well as the test for equivalent AUC in the two intervention groups.

Again thanks a lot for all your help.

Best regards
Anders

1 Photo
Comment
Roman Mostazir

Join Date: Apr 2014

Posts: 872
#6

30 Dec 2015, 13:24

Originally posted by Anders Grejs View Post

by record_id, sort: integ Result time, gen(auc)

was succesfull (I left out sorting for time as proposed by Roman Mostazir).

Correction: you do need 'time' to be sorted but not with the 'by' option. Sort 'time' outside the integral function. See the example below.

Originally posted by Anders Grejs View Post

Could this be modified further to fit the "trapezoidal rule" as proposed by *Roman

Correction !! I did not suggest trapezoidal rule. I just warned you about existing literature which are suggestive of bias of cubic splines or other methods compared to the trapezoidal reule when you have few measurements per patient and showed you the alternative as Clyde already showed you one method. You need to search for literature in your field to find out what is suggestive for you to implement. However, you can just use the 't' option after Clyde's codes to get the trapezoidal rule implemented. Example given.

Originally posted by Anders Grejs View Post

If we should be very strict the protocolled sampling should have been "Arrival, 6, 12, 18, 24, 30, 36, 42, 48, 72 and 96 hours. In addition some patients unfortunately "leave" the study before all blood samples have been collected.

Again you need to do some existing literature search to see how this is handled. One common practice is averaging initial two time points so that everyone starts from the same point. There are other methods as well. Check out your field of literature. On the other issue, if samples are dropped out at a certain meassure, by sorting the time before calculating the AUC, you get the maximum AUC available for that person. Now it is up to the research question, and statistical methods on how you want to handle the missing points to avoid bias in your estimation. Your study protocol should tell you how to handle missing values. But it is not possible to cover up all suggestions here. If you are doing an ITT analysis, comparing two groups is just fine as they are.

Originally posted by Anders Grejs View Post

The graph provided was exactly what I would like as well as the test for equivalent AUC in the two intervention groups.

Example:

Code:

Hypothetical data: list id group time result in 70/85, clean id group time result 70. 10 1 6 337 71. 11 1 0 585 72. 11 1 1 81 73. 11 1 2 54 74. 11 1 3 97 75. 11 1 4 284 76. 11 1 5 224 77. 11 1 6 356 78. 12 2 0 474 79. 12 2 1 351 80. 12 2 2 415 81. 12 2 3 146 82. 12 2 4 380 83. 12 2 5 525 84. 12 2 6 196 85. 13 2 0 107 ==================xxxxxxxxxx================= // Calculate AUC applying both rules: sort id time //sort time outside by id , sort: integ result time, gen(auc) t //trapezoidal by id , sort: integ result time, gen(auc2) //cubic spline That looks like: id group time result auc auc2 70. 10 1 6 337 1591.5 1537.167 71. 11 1 0 585 0 0 72. 11 1 1 81 333 289.4614 73. 11 1 2 54 400.5 336.1545 74. 11 1 3 97 476 401.6707 75. 11 1 4 284 666.5 599.4128 76. 11 1 5 224 920.5 860.178 77. 11 1 6 356 1210.5 1129.625 78. 12 2 0 474 0 0 79. 12 2 1 351 412.5 390.5495 80. 12 2 2 415 795.5 789.8019 81. 12 2 3 146 1076 1063.743 82. 12 2 4 380 1339 1294.227 83. 12 2 5 525 1791.5 1779.849 84. 12 2 6 196 2152 2181.125 85. 13 2 0 107 0 0 keep if time==6 //you only need the total AUC for each individual and can drop the other measures unless you have other purpose i.e. measuring AUC at different time (say at 3 or 4) for between group individuals. Better to save as a different dataset before dropping. // Plot: lgraph auc group, xlab(1 "Group-1" 2 "Group-2") ytitle("Mean AUC") /// xtitle("Group")

By the way, could you please have a go to read the FAQ section on how to properly post in this forum using the Code blocks and how best to provide data example. That only assures you get a good response. Your provided dataset example is hardly legible and code examples could have been better.

1 Photo

Last edited by Roman Mostazir; 30 Dec 2015, 13:27.

Roman
Comment

Anders Grejs

Join Date: Dec 2015
Posts: 13

02 Jan 2016, 01:41

Hi Marcos, Clyde and Roman

Thank you all for your quick and detailed replies.

As my protocol is "Intention to treat" and the previous written literature in the field has been based on the Trapezoidal rule my final solution was:

Code:

keep id Grp time Result
// Calculate AUC applying the trapezoidal rule:

sort id time //sort time outside the integ

by id , sort: integ Result time, gen(AUC) t //trapezoidal rule

lgraph AUC time Grp //just to se the plot with all AUC(which though did´t make that much sense)

by id: egen max_AUC=max(AUC) 
keep if AUC==max_AUC //dropping all but the maximum AUC

ttest AUC, by(Grp)

//Plot
lgraph AUC Grp, xlab(1 "Grp 1" 2 "Grp 2") ytitle("Mean AUC")  ///
xtitle("Grp")

I must say that I admire you all for spending so much time on sharing your expertise

Happy New Year
Anders

Announcement