  • Fitting a curve

    Hi everyone,

    I am sorry to ask such a basic question, but I am studying visit-to-visit blood pressure variability (measures of variability for a set of n BP measurements over time, x1, x2, ..., xn) and I would like to calculate the variation independent of the mean (VIM). To do that, I am supposed to fit a curve of the form y = k x^p through a plot of SD systolic blood pressure (y-axis) against mean systolic blood pressure (x-axis), for all individuals in my cohort.

    The parameter p is estimated from the data, and k is a constant that can be chosen so that the values of VIM are on the same scale as the values of SD. For example, if M is the average value of mean systolic blood pressure in the cohort, then k = M^p and the VIM of systolic blood pressure for any individual is given by VIM = k * SD / mean(x)^p.
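
    Written out, the definition above can be sketched as follows. The symbols s_i (an individual's SD) and \bar{x}_i (that individual's mean systolic blood pressure) are notation introduced here for illustration only:

    ```latex
    \[
    \mathrm{VIM}_i = \frac{k\, s_i}{\bar{x}_i^{\,p}}, \qquad k = M^{p}
    \quad\Longrightarrow\quad
    \mathrm{VIM}_i = s_i \left(\frac{M}{\bar{x}_i}\right)^{p}
    \]
    ```

    With this choice of k, an individual whose mean equals the cohort average M has a VIM equal to their SD, which is the sense in which VIM is on the same scale as SD.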

    I first made a scatter plot of SD systolic blood pressure (y-axis) against mean systolic blood pressure (x-axis) for all individuals in my cohort, but I am having trouble fitting the curve described above.

    Would you have any recommendations? I tried the -twoway qfit- command, but I am not sure that it is what I need. How could I find the coefficients k and p?

    Thank you very much in advance for your help. I have no statistical background and I would like to be sure of the result.

    Best,

    Pierre, MD, PhD

  • #2
    Did you ever find a solution to this? If so, please share. I am also trying to calculate VIM for another application. The missing value is p, which comes from fitting y = k x^p. I believe this is an equation that is fitted by nonlinear techniques, where y is SD SBP and x is mean SBP. Substituting k = M^p would give y = M^p x^p. If you haven't already tried it, you could use the nl command to solve for p.
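
    A minimal sketch of the nl suggestion, assuming each individual's SD and mean SBP are stored in variables named sdsbp and meansbp (names borrowed from a later post in this thread). The starting values of 1 are arbitrary assumptions, not recommendations:

    ```stata
    * Fit sdsbp = k * meansbp^p by nonlinear least squares.
    * {k=1} and {p=1} declare the two parameters and give nl
    * a starting value for each (assumed here, adjust as needed).
    nl (sdsbp = {k=1} * meansbp^{p=1})
    ```

    Here k is estimated freely alongside p rather than being fixed at M^p.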


    • #3
      I guess #1 never got answered because there is no data example. Also, it is hard to hack a way through. First x is time -- it seems -- but then x is the mean systolic blood pressure. Then k in the first equation should just be what is predicted for y when x = 1, but somehow it is also (or instead?) a power of the overall mean. I bailed out. I am confused even if Pierre knows what he wants.

      But if SD is a power function of mean, then take logs of both variables and try a regression.

      Here are some quite different data for various big rivers: their mainstream lengths and their basin areas. A power function length = A area^b seems plausible, hence log length = log A + b log area = a + b log area.

      I would not use nonlinear least squares as variability isn't constant. Whether that is true for blood pressure, I don't know, but I wouldn't be surprised. For rivers the power b is nicely close to 0.5, which a dimensional analysis would suggest. (The facts that rivers wiggle and don't start on the basin perimeter and that basins are irregular shapes seemingly are secondary.)

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int length float area
      6299   6150
      2620    309
      4416   1855
       880   51.8
      2840    610
      1400    114
       680    131
      1200    160
      1400    880
      2333    640
      1450    100
      1950    670
       662   60.9
       360   61.8
      2860    815
       518   22.9
      2200    504
      1350   72.1
      1870    422
       930   86.8
      1110    148
       744   64.4
      1110    220
      2510    980
       650     86
      1500    287
       650   50.8
      1726    360
      3180    960
      2300    410
       872    238
       600   37.8
      1151   75.8
      3513    647
      1290    256
      1360    188
      1080    116
      4400   2430
      1350    170
      1600    440
      1110    120
      4240   1448
      1530    260
         .     75
       858    133
      4500    810
       925     29
      5985   3344
      1064     57
      3490    910
      4160 1112.7
      6670   2715
      5570   2500
       909    112
      1860    102
         .     46
      2740    945
      4500   2600
      1810    322
       691     75
      1200    120
      1360    225
       810     99
      1000     65
      2870    670
       960    125
       729    130
      1400    178
       610     73
      3060    325
       560   80.1
       860    135
      2800    640
       780   78.6
      1430    441
       825     81
         .    350
      2760   1050
      3060   1185
       454   50.3
       733   72.5
      2210    219
       720     91
       623   43.2
      2430    237
         .    240
      1014    198
      3350   1350
      1600    394
       724     46
      2129    464
      5520   1940
      4670    980
      5550   2580
      3000    855
      4370   3700
      2660   1400
      end
      
      . describe length area
      
                    storage   display    value
      variable name   type    format     label      variable label
      --------------------------------------------------------------------------------------------------------------------------------------------------
      length          int     %8.0g                 mainstream length, km
      area            float   %9.0g                 area, 000 sq. km
      
      . gen log_l = log(length)
      (4 missing values generated)
      
      . gen log_a = log(area)
      
      . regress log_l log_a
      
            Source |       SS           df       MS      Number of obs   =        93
      -------------+----------------------------------   F(1, 91)        =    520.05
             Model |  40.0711785         1  40.0711785   Prob > F        =    0.0000
          Residual |  7.01180919        91  .077052848   R-squared       =    0.8511
      -------------+----------------------------------   Adj R-squared   =    0.8494
             Total |  47.0829876        92  .511771605   Root MSE        =    .27758
      
      ------------------------------------------------------------------------------
             log_l |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             log_a |   .5045832   .0221264    22.80   0.000     .4606318    .5485346
             _cons |     4.5441   .1277142    35.58   0.000     4.290411    4.797788
      ------------------------------------------------------------------------------
      
      . scatter log_l log_a || lfit log_l log_a
      
      .
      The proof of the pudding is in the eating, or rather the viewing:
      [Figure rivers.png: scatter plot of log_l against log_a with the fitted regression line]
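
      To read the regression above as a power function length = A area^b, a short sketch, assuming it is run directly after the regress command. The scalar names b_hat and A_hat are introduced here for illustration:

      ```stata
      * Recover length = A * area^b from log length = a + b * log area
      * The slope on the log scale is the power b
      scalar b_hat = _b[log_a]
      * The intercept on the log scale is log A, so exponentiate it
      scalar A_hat = exp(_b[_cons])
      display "b = " scalar(b_hat) "   A = " scalar(A_hat)
      ```

      With the estimates shown in the table, b is .5045832 and A is exp(4.5441), roughly 94.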



      • #4
        Hi everyone,

        I am a medical student and currently have a problem similar to Dr. Pierre's. For a study of visit-to-visit (long-term) variability in systolic blood pressure, I must find the VIM. I first set up a regression with SD on the y-axis and mean pressure on the x-axis for all individuals in the cohort. Then ln(SD) = j + p*ln(mean X) was fitted, where the parameters "j" (beta 0) and "p" are estimated from the regression.
        With the estimated parameters, the VIM was then calculated as follows:
        [Image: WhatsApp Image 2021-02-21 at 1.22.40 PM.jpeg]


        I also tried different ways of calculating the VIM, taking into account the following link https://stats.stackexchange.com/ques...f-the-mean-vim, but I am not sure whether any of them is a correct calculation of this index.

        I would like to know if I have done the procedure correctly, thank you very much for your help.


        • #5
          . generate ln_sdsbp= ln(sdsbp)
          (3 missing values generated)

          . generate ln_meansbp= ln(meansbp)

          . regress ln_sdsbp ln_meansbp

                Source |       SS           df       MS      Number of obs   =   129,484
          -------------+----------------------------------   F(1, 129482)    =   2658.10
                 Model |  348.931825         1  348.931825   Prob > F        =    0.0000
              Residual |  16997.2223   129,482  .131270928   R-squared       =    0.0201
          -------------+----------------------------------   Adj R-squared   =    0.0201
                 Total |  17346.1541   129,483  .133964722   Root MSE        =    .36231

          ------------------------------------------------------------------------------
              ln_sdsbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
            ln_meansbp |   .7015822   .0136079    51.56   0.000     .6749108    .7282535
                 _cons |  -1.001897   .0660918   -15.16   0.000    -1.131435   -.8723578
          ------------------------------------------------------------------------------

          . generate VIM = 100*ln_sdsbp/(ln_meansbp^ .7015822)
          (3 missing values generated)

          . generate VIM_second = sdsbp/(-1.001897 * meansbp^.7015822)
          (1 missing value generated)

          . generate VIM_third = 100*sdsbp/(meansbp^.7015822)
          (1 missing value generated)

          . generate VIM_fourth = sdsbp/(meansbp^.7015822)
          (1 missing value generated)
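
          For comparison with the definition in #1 (k = M^p, VIM = k * SD / mean^p), a minimal sketch, assuming the regression above has just been run and that sdsbp and meansbp hold each individual's SD and mean SBP. The names vim_p, vim_M and VIM_rescaled are introduced here for illustration, and taking M over the estimation sample is an assumption (the full cohort could be used instead):

          ```stata
          * p is the slope of the log-log regression
          scalar vim_p = _b[ln_meansbp]

          * M is the average of the individual means over the estimation sample
          summarize meansbp if e(sample), meanonly
          scalar vim_M = r(mean)

          * VIM = k * SD / mean^p with k = M^p, written as SD * (M/mean)^p
          generate VIM_rescaled = sdsbp * (scalar(vim_M) / meansbp)^scalar(vim_p)
          ```

          Under that definition, VIM_third and VIM_fourth above differ from this only in the constant k (100 and 1 respectively). VIM is built from the logged variables, and VIM_second divides by the log-scale intercept, so neither of those two matches the formula in #1.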
