Fitting a curve

pierre martin

Join Date: Nov 2017

Posts: 63
#1

11 Apr 2018, 06:43

Hi everyone,

I am sorry to ask such a basic question, but I am studying visit-to-visit blood pressure variability (measures of variability for a set of n BP measurements over time, x1, x2, ..., xn) and I would like to calculate the variation independent of mean (VIM). To do that, I am supposed to fit a curve of the form y = k x^p through a plot of SD of systolic blood pressure (y-axis) against mean systolic blood pressure (x-axis), for all individuals in my cohort.

The parameter p is estimated from the data and k is a constant which can be chosen such that the values of VIM are on the same scale as values of SD. For example, if M is the average value of mean systolic blood pressure in the cohort, then k = M^p and the value of VIM for any individual is given by VIM systolic blood pressure = k SD / mean(x)^p.

I first drew a scatter plot of SD of systolic blood pressure (y-axis) against mean systolic blood pressure (x-axis), for all individuals in my cohort. But I am having trouble fitting the curve described above.

Would you have any recommendations? I tried the -twoway qfit- command, but I am not sure that it is what I need. How can I find the coefficients k and p?

Thank you very much in advance for your help. I have no statistical background and I would like to be sure of the result.

Best,

Pierre, MD, PhD
Rich Dunbar

Join Date: Dec 2018

Posts: 6
#2

01 May 2019, 11:28

Did you ever find a solution to this? If so, please share. I am also trying to calculate VIM for another application. The missing value is p, which comes from fitting y = k x^p. I believe this is an equation that is fitted by nonlinear techniques, where y = SD SBP and x = mean SBP. Substituting k = M^p would give y = M^p x^p. If you haven't already tried it, you could use the nl command to estimate p.

Nick Cox

Join Date: Mar 2014
Posts: 35697

01 May 2019, 12:06

I guess #1 never got answered because there is no data example. Also, it is hard to hack a way through. First x is time -- it seems -- but then x is the mean systolic blood pressure. Then k in the first equation should just be what is predicted for y when x = 1, but somehow it is also (or instead?) a power of the overall mean. I bailed out. I am confused even if Pierre knows what he wants.

But if SD is a power function of mean, then take logs of both variables and try a regression.

Here are some quite different data for various big rivers: their mainstream lengths and their basin areas. A power function length = A area^b seems plausible, hence log length = log A + b log area = a + b log area.

I would not use nonlinear least squares as variability isn't constant. Whether that is true for blood pressure, I don't know, but I wouldn't be surprised. For rivers the power b is nicely close to 0.5, which a dimensional analysis would suggest. (The facts that rivers wiggle and don't start on the basin perimeter and that basins are irregular shapes seemingly are secondary.)

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int length float area
6299   6150
2620    309
4416   1855
 880   51.8
2840    610
1400    114
 680    131
1200    160
1400    880
2333    640
1450    100
1950    670
 662   60.9
 360   61.8
2860    815
 518   22.9
2200    504
1350   72.1
1870    422
 930   86.8
1110    148
 744   64.4
1110    220
2510    980
 650     86
1500    287
 650   50.8
1726    360
3180    960
2300    410
 872    238
 600   37.8
1151   75.8
3513    647
1290    256
1360    188
1080    116
4400   2430
1350    170
1600    440
1110    120
4240   1448
1530    260
   .     75
 858    133
4500    810
 925     29
5985   3344
1064     57
3490    910
4160 1112.7
6670   2715
5570   2500
 909    112
1860    102
   .     46
2740    945
4500   2600
1810    322
 691     75
1200    120
1360    225
 810     99
1000     65
2870    670
 960    125
 729    130
1400    178
 610     73
3060    325
 560   80.1
 860    135
2800    640
 780   78.6
1430    441
 825     81
   .    350
2760   1050
3060   1185
 454   50.3
 733   72.5
2210    219
 720     91
 623   43.2
2430    237
   .    240
1014    198
3350   1350
1600    394
 724     46
2129    464
5520   1940
4670    980
5550   2580
3000    855
4370   3700
2660   1400
end

. describe length area

              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------------------------------------------------------------------
length          int     %8.0g                 mainstream length, km
area            float   %9.0g                 area, 000 sq. km

. gen log_l = log(length)
(4 missing values generated)

. gen log_a = log(area)

. regress log_l log_a

      Source |       SS           df       MS      Number of obs   =        93
-------------+----------------------------------   F(1, 91)        =    520.05
       Model |  40.0711785         1  40.0711785   Prob > F        =    0.0000
    Residual |  7.01180919        91  .077052848   R-squared       =    0.8511
-------------+----------------------------------   Adj R-squared   =    0.8494
       Total |  47.0829876        92  .511771605   Root MSE        =    .27758

------------------------------------------------------------------------------
       log_l |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       log_a |   .5045832   .0221264    22.80   0.000     .4606318    .5485346
       _cons |     4.5441   .1277142    35.58   0.000     4.290411    4.797788
------------------------------------------------------------------------------

. scatter log_l log_a || lfit log_l log_a

.

The proof of the pudding is in the eating, or rather the viewing:

[Figure: rivers.png. Scatter plot of log_l against log_a with the fitted regression line.]
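
For anyone who wants the power function back on the original scale: the intercept of the log-log regression is a = log A, so A is recovered by exponentiating it. The lines below are a minimal sketch, assuming they are run in the same session as the code above so that length, area, log_l and log_a already exist. The variable name fitted_length is made up for illustration.

```stata
* Sketch only: assumes length, area, log_l and log_a exist as created above
regress log_l log_a

* Back-transform the intercept: length = A * area^b with A = exp(a)
display "A = " exp(_b[_cons]) "   b = " _b[log_a]

* Fitted curve on the original scale (hypothetical variable name)
generate double fitted_length = exp(_b[_cons]) * area^(_b[log_a])
```

With the estimates shown above, A = exp(4.5441), which is about 94, and b is about 0.50. Exponentiating a fitted log value gives something closer to a typical (median-like) length than to a mean length, so this curve is best read as a summary of the relationship on the log scale.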


Ronnie Celis Amaya

Join Date: Feb 2021

Posts: 5
#4

21 Feb 2021, 14:01

Hi everyone,

I'm a medical student and I currently have a problem similar to Dr. Pierre's. For a study of visit-to-visit (long-term) systolic blood pressure variability, I must find the VIM. I first ran a regression with SD as the y variable and mean pressure as the x variable for all individuals in the cohort. The model fitted was ln(SD) = j + p*ln(mean X), where the parameters "j" (beta 0) and "p" are estimated from the regression.
With the estimated parameters, the VIM was then calculated as follows:

I also tried other ways of calculating the VIM, taking into account the following link https://stats.stackexchange.com/ques...f-the-mean-vim, but I am not sure whether any of them is a correct calculation of this index.

I would like to know whether I have done the procedure correctly. Thank you very much for your help.
Ronnie Celis Amaya

Join Date: Feb 2021

Posts: 5
#5

21 Feb 2021, 14:04

. generate ln_sdsbp= ln(sdsbp)
(3 missing values generated)

. generate ln_meansbp= ln(meansbp)

. regress ln_sdsbp ln_meansbp

      Source |       SS           df       MS      Number of obs   =   129,484
-------------+----------------------------------   F(1, 129482)    =   2658.10
       Model |  348.931825         1  348.931825   Prob > F        =    0.0000
    Residual |  16997.2223   129,482  .131270928   R-squared       =    0.0201
-------------+----------------------------------   Adj R-squared   =    0.0201
       Total |  17346.1541   129,483  .133964722   Root MSE        =    .36231

------------------------------------------------------------------------------
    ln_sdsbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  ln_meansbp |   .7015822   .0136079    51.56   0.000     .6749108    .7282535
       _cons |  -1.001897   .0660918   -15.16   0.000    -1.131435   -.8723578
------------------------------------------------------------------------------

. generate VIM = 100*ln_sdsbp/(ln_meansbp^ .7015822)
(3 missing values generated)

. generate VIM_second = sdsbp/(-1.001897 * meansbp^.7015822)
(1 missing value generated)

. generate VIM_third = 100*sdsbp/(meansbp^.7015822)
(1 missing value generated)

. generate VIM_fourth = sdsbp/(meansbp^.7015822)
(1 missing value generated)
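
Going by the definition quoted in #1 (VIM = k SD / mean^p, with k = M^p and M the cohort average of mean SBP), one way the calculation could be sketched is shown below. This is only an illustration of that definition, not a check of the variables generated above. The data are simulated so that the block runs on its own, and the names p_hat, k_vim, vim and ln_vim are made up for the example.

```stata
* Sketch with simulated data; every number here is made up for illustration
clear
set seed 2021
set obs 500
generate double meansbp = 100 + 60*runiform()
generate double sdsbp   = exp(-1 + 0.7*ln(meansbp) + rnormal(0, 0.3))

* Estimate p from the log-log regression, as suggested in #3
generate double ln_sdsbp   = ln(sdsbp)
generate double ln_meansbp = ln(meansbp)
regress ln_sdsbp ln_meansbp
scalar p_hat = _b[ln_meansbp]

* k = M^p, with M the cohort average of mean SBP (definition quoted in #1)
summarize meansbp if e(sample), meanonly
scalar k_vim = r(mean)^scalar(p_hat)

* VIM = k * SD / mean^p, using SD and mean on the original (not log) scale
generate double vim = scalar(k_vim) * sdsbp / meansbp^scalar(p_hat)

* Check: log VIM should be uncorrelated with log mean SBP
generate double ln_vim = ln(vim)
correlate ln_vim ln_meansbp
```

In this sketch ln(vim) is the regression residual plus a constant, so its correlation with ln_meansbp is zero up to rounding. That is the sense in which the index is independent of the mean. With real data, sdsbp and meansbp would come from the cohort, not from the simulation lines at the top.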