How to get prediction limits using margins command?

Keowmani Thamron

Join Date: Jul 2016

Posts: 26
#1

How to get prediction limits using margins command?

30 Nov 2016, 00:31

Hi all,

After regress y x, the command margins, at(x = a) will give me .95 confidence limits. That is fine if I want to estimate a mean response at x = a. But, if I want to get a prediction at x = a, what is the command?

p/s: Currently, I have to calculate the spred manually to get the prediction limits. But, it get tedious when I need to correct the prediction limits for multiple predictions.

Thank you in advance for any advice!
Tags: None
Richard Williams

Join Date: Apr 2014

Posts: 4852
#2

30 Nov 2016, 08:57

It is not clear to me what you want. Perhaps you could post the commands and output for what you are getting and then explain what you want instead.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Comment
Keowmani Thamron

Join Date: Jul 2016

Posts: 26
#3

30 Nov 2016, 10:10

Attached is the output. The response is y and the predictor is x. I would like to get .95 prediction limits of y when x = 100. However, when using post-estimation command margins, at(x = 100), the result given is .95 confidence interval. I have to calculate the standard deviation of prediction s{pred} and manually to get the lower and upper limits of prediction. Can margins be used to get the prediction limits?

Thanks!

Attached Files

example.docx (38.5 KB, 1 view)
Comment
Richard Williams

Join Date: Apr 2014

Posts: 4852
#4

30 Nov 2016, 10:28

It is better to post output using code tags; see pt 12 of the FAQ. People are understandably reluctant to open attachments (which could contain viruses) and may not have the needed software to read the attachment anyway.

Throwing caution to the wind, I did open the attachment. It says the adjusted prediction is 419.3861. It gives you the 95% CI. I still don't understand why you are not happy with that. It also isn't clear to me why you used some of the numbers you did, e.g. 2383.71562. Perhaps if you post your output using code tags, someone else will understand your problem better than I am.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Comment
Keowmani Thamron

Join Date: Jul 2016

Posts: 26
#5

30 Nov 2016, 11:14

regress y x

margins, at(x = 100)

The commands above will do if I want to estimate the mean response, Y_h, when x = 100. However, I don't want that. Instead, I want to predict a new observation, Y_h(new), when x = 100.

Now, the 1 - alpha confidence limits for Y_h is: Y_h +- t(1 - alpha/2; n - 2)*s{Y_h}. For alpha = .05, the .95 confidence limits are exactly as given by the above commands.

Next, the 1 - alpha prediction limits for Y_h(new) is: Y_h(new) +- t(1 - alpha/2; n -2)*s{pred}. Where s²{pred} = MSE + s²{Y_h}.

Can margins provide the latter?

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input int(x y) 80 399 30 121 50 221 90 376 70 361 60 224 120 546 80 352 100 353 50 157 40 160 70 252 90 389 20 113 110 435 100 420 30 212 50 268 90 377 110 421 30 273 90 468 40 244 80 342 70 323 end
Comment
Richard Williams

Join Date: Apr 2014

Posts: 4852
#6

30 Nov 2016, 23:06

Is this closer to what you are thinking about? It doesn't make much sense with only one x, but it might make sense if there were more than one. You could use gen statements to compute the upper and lower bounds of the prediction for each case.

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input int(x y) 80 399 30 121 50 221 90 376 70 361 60 224 120 546 80 352 100 353 50 157 40 160 70 252 90 389 20 113 110 435 100 420 30 212 50 268 90 377 110 421 30 273 90 468 40 244 80 342 70 323 end clonevar x2 = x reg y x2 replace x2 = 100 predict yhat predict stdp, stdp list

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Comment
Keowmani Thamron

Join Date: Jul 2016

Posts: 26
#7

01 Dec 2016, 00:12

predict stdp, stdp will give s{Y_h}, which is exactly the same as that given by the post-estimation command margins, at(x = 100). To get s{pred}, the command is predict spred, stdf.
Comment
Richard Williams

Join Date: Apr 2014

Posts: 4852
#8

01 Dec 2016, 07:26

OK. So does that give you what you want? I don't see how margins will give it to you.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1094
#9

01 Dec 2016, 10:28

Keowmani, thank you for raising this question. It was something I was also wondering about, but hadn't got around to asking.

For those who may not be clear on what the issue is, perhaps Appendix 2 in my linear regression slides will clarify things--scroll down to page 30 in this PDF.
Slide 177 shows a dialog box from the SPSS chart editor, with None, Mean and Individual as the options for a confidence interval on the regression line.

Slide 178 shows a scatter-plot (made with SPSS) showing both the mean and individual confidence intervals. (The "individual" CI is what Keowmani is calling the prediction interval).

Slide 179 shows a plot of the same data made via margins & marginsplot. That confidence interval pretty clearly matches the mean CI from SPSS.

The remainder of the slides in Appendix 2 show how the formulae for the mean and individual CIs differ. The only difference is that the formula for the individual CI (or prediction interval) has 1 added to the leverage under the square root sign. That is what makes the individual CI wider than the mean CI.

By the way, this tutorial shows that JMP also has easy methods for producing both the mean and individual CIs.

I think the main point of Keowmani's first post was that margins & marginsplot do not appear to have the same Individual CI (or prediction interval) option that SPSS has. That was my conclusion too (using v13.1).

HTH.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)
Comment

Bruce Weaver

Join Date: May 2014
Posts: 1094

#10

01 Dec 2016, 10:40

Following up on #6 and #7, here are some examples using both SPSS and Stata that I hope will clarify things.

Code:

* --- Start of SPSS example ----.

DATA LIST LIST / x y (2F5.0).
BEGIN DATA
 80 399
 30 121
 50 221
 90 376
 70 361
 60 224
120 546
 80 352
100 353
 50 157
 40 160
 70 252
 90 389
 20 113
110 435
100 420
 30 212
 50 268
 90 377
110 421
 30 273
 90 468
 40 244
 80 342
 70 323
END DATA.
REGRESSION
  /CRITERIA=PIN(.05) POUT(.10) CIN(95)
  /DEPENDENT y
  /METHOD=ENTER x
  /SAVE PRED SEPRED MCIN ICIN.
* REGRESSION output not shown .
LIST.

    x     y       PRE_1       SEP_1      LMCI_1      UMCI_1      LICI_1      UICI_1
 
   80   399   347.98202    10.36280   326.54494   369.41910   244.73335   451.23069
   30   121   169.47192    16.96974   134.36734   204.57650    62.54638   276.39746
   50   221   240.87596    11.97934   216.09481   265.65710   136.88151   344.87041
   90   376   383.68404    11.97934   358.90290   408.46519   279.68959   487.67849
   70   361   312.28000     9.76466   292.08026   332.47974   209.28112   415.27888
   60   224   276.57798    10.36280   255.14090   298.01506   173.32931   379.82665
  120   546   490.79010    19.90786   449.60756   531.97264   381.71792   599.86229
   80   352   347.98202    10.36280   326.54494   369.41910   244.73335   451.23069
  100   353   419.38606    14.27233   389.86150   448.91062   314.16040   524.61172
   50   157   240.87596    11.97934   216.09481   265.65710   136.88151   344.87041
   40   160   205.17394    14.27233   175.64938   234.69850    99.94828   310.39960
   70   252   312.28000     9.76466   292.08026   332.47974   209.28112   415.27888
   90   389   383.68404    11.97934   358.90290   408.46519   279.68959   487.67849
   20   113   133.76990    19.90786    92.58736   174.95244    24.69771   242.84208
  110   435   455.08808    16.96974   419.98350   490.19266   348.16254   562.01362
  100   420   419.38606    14.27233   389.86150   448.91062   314.16040   524.61172
   30   212   169.47192    16.96974   134.36734   204.57650    62.54638   276.39746
   50   268   240.87596    11.97934   216.09481   265.65710   136.88151   344.87041
   90   377   383.68404    11.97934   358.90290   408.46519   279.68959   487.67849
  110   421   455.08808    16.96974   419.98350   490.19266   348.16254   562.01362
   30   273   169.47192    16.96974   134.36734   204.57650    62.54638   276.39746
   90   468   383.68404    11.97934   358.90290   408.46519   279.68959   487.67849
   40   244   205.17394    14.27233   175.64938   234.69850    99.94828   310.39960
   80   342   347.98202    10.36280   326.54494   369.41910   244.73335   451.23069
   70   323   312.28000     9.76466   292.08026   332.47974   209.28112   415.27888
 
Number of cases read:  25    Number of cases listed:  25

* PRE_1  = Unstandardized Predicted Value
* SEP_1  = Standard Error of Predicted Value
* LMCI_1 = 95% L CI for y mean
* UMCI_1 = 95% U CI for y mean
* LICI_1 = 95% L CI for y individual
* UICI_1 = 95% U CI for y individual.

* ----- End of SPSS example --------.

And now a Stata example using the same data (with only the needed output included).

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(x y)
 80 399
 30 121
 50 221
 90 376
 70 361
 60 224
120 546
 80 352
100 353
 50 157
 40 160
 70 252
 90 389
 20 113
110 435
100 420
 30 212
 50 268
 90 377
110 421
 30 273
 90 468
 40 244
 80 342
 70 323
end

regress y x
matrix table = r(table)
scalar tcrit = table[8,1]
display "Critical t = " tcrit

predict yhat
predict stdp, stdp
generate lower = yhat - tcrit*stdp
generate upper = yhat + tcrit*stdp
list

     +-------------------------------------------------------+
     |   x     y       yhat       stdp      lower      upper |
     |-------------------------------------------------------|
  1. |  80   399    347.982    10.3628    326.545   369.4191 |
  2. |  30   121   169.4719   16.96974   134.3673   204.5765 |
  3. |  50   221    240.876   11.97934   216.0948   265.6571 |
  4. |  90   376   383.6841   11.97934   358.9029   408.4652 |
  5. |  70   361     312.28   9.764662   292.0803   332.4797 |
     |-------------------------------------------------------|
  6. |  60   224    276.578    10.3628   255.1409    298.015 |
  7. | 120   546   490.7901   19.90786   449.6075   531.9727 |
  8. |  80   352    347.982    10.3628    326.545   369.4191 |
  9. | 100   353    419.386   14.27233   389.8615   448.9106 |
 10. |  50   157    240.876   11.97934   216.0948   265.6571 |
     |-------------------------------------------------------|
 11. |  40   160   205.1739   14.27233   175.6494   234.6985 |
 12. |  70   252     312.28   9.764662   292.0803   332.4797 |
 13. |  90   389   383.6841   11.97934   358.9029   408.4652 |
 14. |  20   113   133.7699   19.90786   92.58736   174.9524 |
 15. | 110   435   455.0881   16.96974   419.9835   490.1927 |
     |-------------------------------------------------------|
 16. | 100   420    419.386   14.27233   389.8615   448.9106 |
 17. |  30   212   169.4719   16.96974   134.3673   204.5765 |
 18. |  50   268    240.876   11.97934   216.0948   265.6571 |
 19. |  90   377   383.6841   11.97934   358.9029   408.4652 |
 20. | 110   421   455.0881   16.96974   419.9835   490.1927 |
     |-------------------------------------------------------|
 21. |  30   273   169.4719   16.96974   134.3673   204.5765 |
 22. |  90   468   383.6841   11.97934   358.9029   408.4652 |
 23. |  40   244   205.1739   14.27233   175.6494   234.6985 |
 24. |  80   342    347.982    10.3628    326.545   369.4191 |
 25. |  70   323     312.28   9.764662   292.0803   332.4797 |
     +-------------------------------------------------------+

margins, at(x=(30 90 40 80 70)) vsquish // Last 5 values of x

Adjusted predictions                              Number of obs   =         25
Model VCE    : OLS

Expression   : Linear prediction, predict()
1._at        : x               =          30
2._at        : x               =          90
3._at        : x               =          40
4._at        : x               =          80
5._at        : x               =          70

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   169.4719   16.96974     9.99   0.000     134.3673    204.5765
          2  |    383.684   11.97934    32.03   0.000     358.9029    408.4652
          3  |   205.1739   14.27233    14.38   0.000     175.6494    234.6985
          4  |    347.982    10.3628    33.58   0.000     326.5449    369.4191
          5  |     312.28   9.764662    31.98   0.000     292.0803    332.4797
------------------------------------------------------------------------------

// margins is generating the same SE and CI as predict.
// This is what SPSS calls the mean CI.
// Keowmani and I are asking how to get the individual CI,
// or prediction interval.

These examples show that predict stdp, stdp is producing the same standard error as margins, and that this approach is not giving the prediction interval Keowmani is looking for.

Cheers,
Bruce

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)

Comment

Richard Williams

Join Date: Apr 2014
Posts: 4852

#11

01 Dec 2016, 16:28

Ok. So if I tweak your last few lines, I get

Code:

. predict stdp, stdp

. predict stdf, stdf

. generate lowerp = yhat - tcrit*stdp

. generate upperp = yhat + tcrit*stdp

. generate lowerf = yhat - tcrit*stdf

. generate upperf = yhat + tcrit*stdf

. list

     +----------------------------------------------------------------------------------------+
     |   x     y       yhat       stdp       stdf     lowerp     upperp     lowerf     upperf |
     |----------------------------------------------------------------------------------------|
  1. |  80   399    347.982    10.3628   49.91095    326.545   369.4191   244.7334   451.2307 |
  2. |  30   121   169.4719   16.96974   51.68837   134.3673   204.5765   62.54638   276.3975 |
  3. |  50   221    240.876   11.97934   50.27147   216.0948   265.6571   136.8815   344.8704 |
  4. |  90   376   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
  5. |  70   361     312.28   9.764662    49.7902   292.0803   332.4797   209.2811   415.2789 |
     |----------------------------------------------------------------------------------------|
  6. |  60   224    276.578    10.3628   49.91095   255.1409    298.015   173.3293   379.8267 |
  7. | 120   546   490.7901   19.90786   52.72607   449.6075   531.9727   381.7179   599.8623 |
  8. |  80   352    347.982    10.3628   49.91095    326.545   369.4191   244.7334   451.2307 |
  9. | 100   353    419.386   14.27233   50.86664   389.8615   448.9106   314.1604   524.6117 |
 10. |  50   157    240.876   11.97934   50.27147   216.0948   265.6571   136.8815   344.8704 |
     |----------------------------------------------------------------------------------------|
 11. |  40   160   205.1739   14.27233   50.86664   175.6494   234.6985   99.94828   310.3996 |
 12. |  70   252     312.28   9.764662    49.7902   292.0803   332.4797   209.2811   415.2789 |
 13. |  90   389   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
 14. |  20   113   133.7699   19.90786   52.72607   92.58736   174.9524   24.69771   242.8421 |
 15. | 110   435   455.0881   16.96974   51.68837   419.9835   490.1927   348.1625   562.0136 |
     |----------------------------------------------------------------------------------------|
 16. | 100   420    419.386   14.27233   50.86664   389.8615   448.9106   314.1604   524.6117 |
 17. |  30   212   169.4719   16.96974   51.68837   134.3673   204.5765   62.54638   276.3975 |
 18. |  50   268    240.876   11.97934   50.27147   216.0948   265.6571   136.8815   344.8704 |
 19. |  90   377   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
 20. | 110   421   455.0881   16.96974   51.68837   419.9835   490.1927   348.1625   562.0136 |
     |----------------------------------------------------------------------------------------|
 21. |  30   273   169.4719   16.96974   51.68837   134.3673   204.5765   62.54638   276.3975 |
 22. |  90   468   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
 23. |  40   244   205.1739   14.27233   50.86664   175.6494   234.6985   99.94828   310.3996 |
 24. |  80   342    347.982    10.3628   49.91095    326.545   369.4191   244.7334   451.2307 |
 25. |  70   323     312.28   9.764662    49.7902   292.0803   332.4797   209.2811   415.2789 |
     +----------------------------------------------------------------------------------------+

.

Which is the same output that SPSS produced (with stdf, the standard error of the forecast, included). So does this solve the problem? If you wanted it for x = some specific value, you could do something like replace x = 50 before computing stdp and stdf. If you wanted to plug in different values of x, you could do something cutesie like

replace x = 50 in 1
replace x = 60 in 2
replace x = 70 in 3
.
.
.
list in 1/3

margins doesn't seem to want to make it easy to get exactly what you want but it seems like predict commands can do it for you.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam

Comment

Richard Williams

Join Date: Apr 2014
Posts: 4852

#12

01 Dec 2016, 16:36

To explicitly show what I just said about plugging in different x values,

Code:

. replace x = 30 in 1
(1 real change made)

. replace x = 90 in 2
(1 real change made)

. replace x = 40 in 3
(1 real change made)

. replace x = 80 in 4
(1 real change made)

. replace x = 70 in 5
(0 real changes made)

. predict yhat in 1/5
(option xb assumed; fitted values)
(20 missing values generated)

. predict stdp in 1/5, stdp
(20 missing values generated)

. predict stdf in 1/5, stdf
(20 missing values generated)

. generate lowerp in 1/5 = yhat - tcrit*stdp
(20 missing values generated)

. generate upperp in 1/5 = yhat + tcrit*stdp
(20 missing values generated)

. generate lowerf in 1/5 = yhat - tcrit*stdf
(20 missing values generated)

. generate upperf in 1/5 = yhat + tcrit*stdf
(20 missing values generated)

. list in 1/5

     +---------------------------------------------------------------------------------------+
     |  x     y       yhat       stdp       stdf     lowerp     upperp     lowerf     upperf |
     |---------------------------------------------------------------------------------------|
  1. | 30   399   169.4719   16.96974   51.68837   134.3673   204.5765   62.54638   276.3975 |
  2. | 90   121   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
  3. | 40   221   205.1739   14.27233   50.86664   175.6494   234.6985   99.94828   310.3996 |
  4. | 80   376    347.982    10.3628   49.91095    326.545   369.4191   244.7334   451.2307 |
  5. | 70   361     312.28   9.764662    49.7902   292.0803   332.4797   209.2811   415.2789 |
     +---------------------------------------------------------------------------------------+

So again, not as easy as it might be, but is that what is wanted?

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam

Comment

Bruce Weaver

Join Date: May 2014

Posts: 1094
#13

01 Dec 2016, 17:08

(Responding to #11 & #12)

Hi Richard. Yes, using stdf (standard error of the forecast) cracks it, I think. I didn't know about stdf--and by the way, I can't find it when I type help predict and then do a CTL-F search for it in the viewer. (Is this a version issue, perhaps? I'm using v13.1 for Windows.) I can find it in the PDF documentation, though. Thanks for pointing it out and providing the example.

Agreed that it's not quite as easy as it could be. Perhaps it will become an option for margins someday. ;-)

Cheers,
Bruce

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)
Comment

Richard Williams

Join Date: Apr 2014
Posts: 4852

#14

01 Dec 2016, 17:29

Or, to make it a little tidier,

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(x y)
 80 399
 30 121
 50 221
 90 376
 70 361
 60 224
120 546
 80 352
100 353
 50 157
 40 160
 70 252
 90 389
 20 113
110 435
100 420
 30 212
 50 268
 90 377
110 421
 30 273
 90 468
 40 244
 80 342
 70 323
end

clonevar x2 = x
regress y x2
matrix table = r(table)
scalar tcrit = table[8,1]
display "Critical t = " tcrit
local n = 0


foreach j in 30 90 40 80 70 {
    local n = `n' + 1
    quietly replace x2 = `j' in `n' 
}
quietly predict yhat in 1/`n'
quietly predict stdp in 1/`n', stdp
quietly generate lowerp in 1/`n' = yhat - tcrit*stdp
quietly generate upperp in 1/`n' = yhat + tcrit*stdp

quietly predict stdf in 1/`n', stdf
quietly generate lowerf in 1/`n' = yhat - tcrit*stdf
quietly generate upperf in 1/`n' = yhat + tcrit*stdf

list in 1/`n'


     +--------------------------------------------------------------------------------------------+
     |  x     y   x2       yhat       stdp     lowerp     upperp       stdf     lowerf     upperf |
     |--------------------------------------------------------------------------------------------|
  1. | 80   399   30   169.4719   16.96974   134.3673   204.5765   51.68837   62.54638   276.3975 |
  2. | 30   121   90   383.6841   11.97934   358.9029   408.4652   50.27147   279.6896   487.6785 |
  3. | 50   221   40   205.1739   14.27233   175.6494   234.6985   50.86664   99.94828   310.3996 |
  4. | 90   376   80    347.982    10.3628    326.545   369.4191   49.91095   244.7334   451.2307 |
  5. | 70   361   70     312.28   9.764662   292.0803   332.4797    49.7902   209.2811   415.2789 |
     +--------------------------------------------------------------------------------------------+

If this is something you were doing a lot you could probably create a do or ado file where you passed values to it.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam

Comment

Richard Williams

Join Date: Apr 2014

Posts: 4852
#15

01 Dec 2016, 17:37

help predict won't show it. That just gives you generic options. You need

help regress_postestimation

and then click on predict. That will show you the predict options that apply specically to regress.

Incidentally, Bruce, you need to yell at whoever buys your software and tell them you can't be expected to work under such barbaric conditions. The help in 14.2 is much better than 13.1, especially for things like margins.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Comment

Announcement