  • -predperf- available on SSC: simple accuracy and precision indicators for one or two continuous-value predictors

    A new package called predperf, for predictive performance, is available from SSC, thanks to Kit Baum. It computes simple indicators of the accuracy and precision of a continuous-value predictive model, when a series of predictions is available along with corresponding 'true' observations. It can also compare the predictive performance of two distinct predictors.

    Background: In many disciplines involving modeling, such as pharmacometrics, we regularly need to validate a model built to predict the values of a continuous variable, by comparing its predictions with 'true' observations, or to compare two competing models on the basis of their respective performances. A seminal paper by Lewis Sheiner and Stuart Beal, still widely cited today, laid the conceptual foundations for evaluating the predictive performance of a model for a continuous outcome variable (Sheiner LB, Beal SL. Some suggestions for measuring predictive performance. J Pharmacokinet Biopharm. 1981;9:503-12. doi: 10.1007/BF01060893). Let us just quote them:
    The performance of a prediction or measurement method is often evaluated by computing the correlation coefficient and/or the regression of predictions on true (reference) values. These provide, however, only a poor description of predictive performance. The mean squared prediction error (precision) and the mean prediction error (bias) provide better descriptions of predictive performance. These quantities are easily computed, and can be used to compare prediction methods to absolute standards or to one another.
    General idea: The accuracy of predictions is reflected in their mean error (ME, bias) relative to observations, while their precision is reflected in the root mean square error (RMSE) of predictions with respect to observations. Sheiner and Beal showed how to estimate simple confidence intervals around these two indices, and how to test simple hypotheses. This unpretentious program implements these calculations in Stata. It expands Roy Wada's -rmse- module (2009). In addition, it provides an illustrative graph of the calculated indices over a scatterplot of observations versus predictions. It offers the option of calculating absolute bias and imprecision (ME and RMSE, respectively), their relative counterparts expressed as percentage ratios (MPE, RMSPE), or their logarithmic counterparts based on the logarithm of the compared values (MLE, RMSLE).
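
    For readers who want the core definitions spelled out, here is a minimal Stata sketch computing the two absolute indices by hand. This is not the package code; the sign convention (observation minus prediction) is inferred from the graph line y = a + ME described below, and no degrees-of-freedom correction is applied:
    Code:
    . generate double pe = y - a       // prediction error (assumed sign convention)
    . summarize pe
    . display "ME = " r(mean)          // mean error = bias
    . generate double pe2 = pe^2
    . summarize pe2
    . display "RMSE = " sqrt(r(mean))  // root mean square error = imprecision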

    Syntax: Given a series of 'true' observations y along with one or two predictors a and b, the syntax of predperf is as follows:
    • for one predictor:
    predperf y a [if] [in] [weight] [, metrics(#) rel log level(#) df(#) sig floor(#) nograph]
    • for two predictors:
    predperf y a b [if] [in] [weight] [, metrics(#) rel log level(#) df(# #) sig floor(#) nograph]

    The following options are available:
    • metrics(#) indicates the type of performance indices to compute: 1 = arithmetic (default), 2 = relative, 3 = logarithmic
    • rel is equivalent to metrics(2) and requests relative performance indicators (MPE, RMSPE)
    • log is equivalent to metrics(3) and requests logarithmic performance indicators (MLE, RMSLE)
    • level(#) specifies the confidence level for the estimation of the indicators, if different from the default set in Stata (usually 95)
    • df(# [#]) specifies degrees of freedom to correct the computation of RMSE (or RMSPE or RMSLE); the default value of 0 is appropriate when the predictions have been obtained from a model developed independently, but it should be set to the number of estimated parameters for a model fitted on the y values themselves (see the sketch after this list)
    • sig displays significance levels associated with the performance indices; it tests for the departure of ME or MPE from 0 (or of MLE from 1), and for the difference of RMSE, RMSPE, or RMSLE from the raw dispersion of the observations (i.e. their SD, percentage SD, or logarithmic SD, respectively)
    • floor(#) adds a constant to observations and predictions for the computation of logarithmic performance indicators, usually the lower limit of quantification of the measurement method for the observations; default is 0
    • nograph suppresses the graph illustrating the predictive performance
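
    As an aside on df(): when the predictor is fitted on the very observations being evaluated, the RMSE should be corrected for the number of estimated parameters. A hypothetical sketch (the regression and variable names here are illustrative, not from the package documentation):
    Code:
    . regress y x1 x2            // model fitted on the same y values
    . predict double yhat        // in-sample predictions
    . predperf y yhat, df(3)     // 3 estimated parameters: 2 slopes + intercept
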
    Example: A dataset containing the data used as an example in Sheiner and Beal's article is provided as an ancillary data file with the -predperf- module. It contains a series of 20 'true' observations y along with corresponding values for two predictors a and b. Let us first install the module and download the example dataset:
    Code:
    . net install predperf.pkg
    checking predperf consistency and verifying not already installed...
    installing into C:\Users\...\ado\plus\...
    installation complete.
    
    . net get predperf.pkg
    checking predperf consistency and verifying not already installed...
    copying into current directory...
          copying  sheinerbeal.dta
    ancillary files successfully copied.
    
    . use sheinerbeal
    We can first examine the predictive performance of predictor a compared to observations y, expressed as absolute bias (ME) and imprecision (RMSE). Both values come with confidence intervals (estimated under the usual assumptions of independence and normality):
    Code:
    . predperf y a
    
    Absolute predictive performance for the prediction of y
    (ME = bias = mean difference, RMSE = imprecision, both in units of y)
    
       Predictor |        ME  [95% conf. interval] |       RMSE  [95% conf. interval]
    -------------+---------------------------------+--------------------------------
               a |     .0275  -.3789085   .4339084 |  .8468264   .3632941   1.141161
    
    Number of observations: 20
    A graph is produced, showing a scatterplot of the 'true' y values versus the values of predictor a. A green, long-dashed identity line is superimposed (function y = a), together with parallel lines at distances of ±1·RMSE and ±2·RMSE. A red, continuous line is added, at a distance of ME from the identity line (function y = a + ME).
    [Graph: predperf1.png]

    We then want to evaluate the predictive performance of predictor a compared to observations y, expressed now as relative, i.e. percentage-based, bias (MPE) and imprecision (RMSPE). In addition, we want to test whether MPE differs significantly from zero and RMSPE from the raw CV of the set of observations, which appears not to be the case:
    Code:
    .  predperf y a, rel sig
    
    Relative predictive performance for the prediction of y
    (MPE = relative bias, RMSPE = relative imprecision, both in percentage ratio)
    
       Predictor |       MPE  [95% conf. interval] |      RMSPE  [95% conf. interval]
    -------------+---------------------------------+--------------------------------
               a |  .0538046  -.0450894   .1526985 |  .2128671   .1027779   .2829514
                 |           P(|MPE|>0) = 0.2690   |    P(|SDP - RMSPE|>0) = 0.1685
    
    Number of observations: 20
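
    For reference, the relative error underlying MPE and RMSPE appears to be the prediction error scaled by the predicted value, as suggested by the graph line y = a * (1 + MPE) described below. A minimal hand computation under that assumed definition (not taken from the package code):
    Code:
    . generate double pe_rel = (y - a)/a     // relative error (assumed denominator)
    . summarize pe_rel
    . display "MPE = " r(mean)               // relative bias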
    A slightly different type of graph is produced to illustrate the relative indicators of predictive performance. It still shows the same scatterplot of 'true' y versus predictor a values, with a green, long-dashed identity line superimposed. But the line is now flanked by lines arranged as spokes, at relative distances of ±1·RMSPE and ±2·RMSPE from the identity line. A red, continuous line is drawn at a relative distance of MPE from the identity line (function y = a * (1 + MPE)).
    [Graph: predperf2.png]

    Finally, we may want to compare the predictive performances of predictors a and b versus observations y, expressed as logarithmic bias (MLE) and imprecision (RMSLE), while testing whether they differ significantly. A lower limit of quantification of 0.1, set by the measurement method, is used for both predictors and the observations to replace zero or very small values, should any exist. We see that in logarithmic metrics, the imprecision of both predictors appears significantly smaller than the raw geometric SD of the set of observations. In addition, the precision of a is significantly better than the precision of b. The mean biases of both predictors (expressed as geometric ratios of a/y and b/y) do not differ significantly from 1; nor do they differ from one another.
    Code:
    . predperf y a b, log sig floor(0.1)
    
    Relative predictive performance for the prediction of y
    (MLE = relative bias = geometric mean ratio, and RMSLE = relative imprecision)
    
       Predictor |       MLE  [95% conf. interval] |      RMSLE  [95% conf. interval]
    -------------+---------------------------------+--------------------------------
               a |  1.032843   .9447445   1.129156 |  .1884656   .1126916    .241535
                 |           P(|MLE|>0) = 0.4574   |    P(|SDL - RMSLE|>0) = 0.0507
               b |  .8721978   .6918375   1.099578 |  .5014651   .2880287   .6480539
                 |           P(|MLE|>0) = 0.2317   |    P(|SDL - RMSLE|>0) = 0.0404
    -------------+---------------------------------+--------------------------------
    ratio | diff |  .8444633   .6497418   1.097541 |   .464702   .2275931   .6165203
                 |        P(|MLE diff|) = 0.1929   |       P(|RMSLE diff|) = 0.0126
    
    (Note: no bias <=> MLE=1; difference of bias is expressed as ratio of MLE)
    Number of observations: 20
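
    For reference, a minimal hand computation of the logarithmic indices, assuming from the stated definitions that MLE is the geometric mean of the (a + floor)/(y + floor) ratios and that RMSLE is the root mean square of the log errors (an assumed reading, not the package code):
    Code:
    . generate double le = ln(a + 0.1) - ln(y + 0.1)   // log error after the floor shift
    . summarize le
    . display "MLE = " exp(r(mean))                    // geometric mean ratio of a to y
    . generate double le2 = le^2
    . summarize le2
    . display "RMSLE = " sqrt(r(mean))                 // imprecision on the log scale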
    A third type of graph is produced, rather similar to the first type shown above, except that it uses logarithmic scales for both the X- and Y-axes, and log-transformed values for the line functions. As we now compare two predictors, a pair of graphs is displayed, sharing the 'true' y values along the Y-axis.
    [Graph: predperf3.png]

    I hope this module finds its way to people who will find it useful. Questions and suggestions are welcome.
    Last edited by Thierry Buclin; 25 Nov 2024, 10:17.
    Thierry Buclin, MD, clinical pharmacologist
