Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to get prediction limits using margins command?

    Hi all,

    After regress y x, the command margins, at(x = a) will give me .95 confidence limits. That is fine if I want to estimate a mean response at x = a. But, if I want to get a prediction at x = a, what is the command?

    p/s: Currently, I have to calculate the spred manually to get the prediction limits. But, it get tedious when I need to correct the prediction limits for multiple predictions.

    Thank you in advance for any advice!

  • #2
    It is not clear to me what you want. Perhaps you could post the commands and output for what you are getting and then explain what you want instead.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    Stata Version: 17.0 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam

    Comment


    • #3
      Attached is the output. The response is y and the predictor is x. I would like to get .95 prediction limits of y when x = 100. However, when using post-estimation command margins, at(x = 100), the result given is .95 confidence interval. I have to calculate the standard deviation of prediction s{pred} and manually to get the lower and upper limits of prediction. Can margins be used to get the prediction limits?

      Thanks!

      Attached Files

      Comment


      • #4
        It is better to post output using code tags; see pt 12 of the FAQ. People are understandably reluctant to open attachments (which could contain viruses) and may not have the needed software to read the attachment anyway.

        Throwing caution to the wind, I did open the attachment. It says the adjusted prediction is 419.3861. It gives you the 95% CI. I still don't understand why you are not happy with that. It also isn't clear to me why you used some of the numbers you did, e.g. 2383.71562. Perhaps if you post your output using code tags, someone else will understand your problem better than I am.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam

        Comment


        • #5
          regress y x

          margins, at(x = 100)

          The commands above will do if I want to estimate the mean response, Yh, when x = 100. However, I don't want that. Instead, I want to predict a new observation, Yh(new), when x = 100.

          Now, the 1 - alpha confidence limits for Yh is: Yh +- t(1 - alpha/2; n - 2)*s{Yh}. For alpha = .05, the .95 confidence limits are exactly as given by the above commands.

          Next, the 1 - alpha prediction limits for Yh(new) is: Yh(new) +- t(1 - alpha/2; n -2)*s{pred}. Where s2{pred} = MSE + s2{Yh}.

          Can margins provide the latter?

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input int(x y)
           80 399
           30 121
           50 221
           90 376
           70 361
           60 224
          120 546
           80 352
          100 353
           50 157
           40 160
           70 252
           90 389
           20 113
          110 435
          100 420
           30 212
           50 268
           90 377
          110 421
           30 273
           90 468
           40 244
           80 342
           70 323
          end

          Comment


          • #6
            Is this closer to what you are thinking about? It doesn't make much sense with only one x, but it might make sense if there were more than one. You could use gen statements to compute the upper and lower bounds of the prediction for each case.

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int(x y)
             80 399
             30 121
             50 221
             90 376
             70 361
             60 224
            120 546
             80 352
            100 353
             50 157
             40 160
             70 252
             90 389
             20 113
            110 435
            100 420
             30 212
             50 268
             90 377
            110 421
             30 273
             90 468
             40 244
             80 342
             70 323
            end
            clonevar x2 = x
            reg y x2
            replace x2 = 100
            predict yhat
            predict stdp, stdp
            list
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            Stata Version: 17.0 MP (2 processor)

            EMAIL: [email protected]
            WWW: https://www3.nd.edu/~rwilliam

            Comment


            • #7
              predict stdp, stdp will give s{Yh}, which is exactly the same as that given by the post-estimation command margins, at(x = 100). To get s{pred}, the command is predict spred, stdf.

              Comment


              • #8
                OK. So does that give you what you want? I don't see how margins will give it to you.
                -------------------------------------------
                Richard Williams, Notre Dame Dept of Sociology
                Stata Version: 17.0 MP (2 processor)

                EMAIL: [email protected]
                WWW: https://www3.nd.edu/~rwilliam

                Comment


                • #9
                  Keowmani, thank you for raising this question. It was something I was also wondering about, but hadn't got around to asking.

                  For those who may not be clear on what the issue is, perhaps Appendix 2 in my linear regression slides will clarify things--scroll down to page 30 in this PDF.
                  • Slide 177 shows a dialog box from the SPSS chart editor, with None, Mean and Individual as the options for a confidence interval on the regression line.
                  • Slide 178 shows a scatter-plot (made with SPSS) showing both the mean and individual confidence intervals. (The "individual" CI is what Keowmani is calling the prediction interval).
                  • Slide 179 shows a plot of the same data made via margins & marginsplot. That confidence interval pretty clearly matches the mean CI from SPSS.
                  • The remainder of the slides in Appendix 2 show how the formulae for the mean and individual CIs differ. The only difference is that the formula for the individual CI (or prediction interval) has 1 added to the leverage under the square root sign. That is what makes the individual CI wider than the mean CI.
                  • By the way, this tutorial shows that JMP also has easy methods for producing both the mean and individual CIs.
                  I think the main point of Keowmani's first post was that margins & marginsplot do not appear to have the same Individual CI (or prediction interval) option that SPSS has. That was my conclusion too (using v13.1).

                  HTH.
                  --
                  Bruce Weaver
                  Email: [email protected]
                  Version: Stata/MP 18.5 (Windows)

                  Comment


                  • #10
                    Following up on #6 and #7, here are some examples using both SPSS and Stata that I hope will clarify things.

                    Code:
                    * --- Start of SPSS example ----.
                    
                    DATA LIST LIST / x y (2F5.0).
                    BEGIN DATA
                     80 399
                     30 121
                     50 221
                     90 376
                     70 361
                     60 224
                    120 546
                     80 352
                    100 353
                     50 157
                     40 160
                     70 252
                     90 389
                     20 113
                    110 435
                    100 420
                     30 212
                     50 268
                     90 377
                    110 421
                     30 273
                     90 468
                     40 244
                     80 342
                     70 323
                    END DATA.
                    REGRESSION
                      /CRITERIA=PIN(.05) POUT(.10) CIN(95)
                      /DEPENDENT y
                      /METHOD=ENTER x
                      /SAVE PRED SEPRED MCIN ICIN.
                    * REGRESSION output not shown .
                    LIST.
                    
                        x     y       PRE_1       SEP_1      LMCI_1      UMCI_1      LICI_1      UICI_1
                     
                       80   399   347.98202    10.36280   326.54494   369.41910   244.73335   451.23069
                       30   121   169.47192    16.96974   134.36734   204.57650    62.54638   276.39746
                       50   221   240.87596    11.97934   216.09481   265.65710   136.88151   344.87041
                       90   376   383.68404    11.97934   358.90290   408.46519   279.68959   487.67849
                       70   361   312.28000     9.76466   292.08026   332.47974   209.28112   415.27888
                       60   224   276.57798    10.36280   255.14090   298.01506   173.32931   379.82665
                      120   546   490.79010    19.90786   449.60756   531.97264   381.71792   599.86229
                       80   352   347.98202    10.36280   326.54494   369.41910   244.73335   451.23069
                      100   353   419.38606    14.27233   389.86150   448.91062   314.16040   524.61172
                       50   157   240.87596    11.97934   216.09481   265.65710   136.88151   344.87041
                       40   160   205.17394    14.27233   175.64938   234.69850    99.94828   310.39960
                       70   252   312.28000     9.76466   292.08026   332.47974   209.28112   415.27888
                       90   389   383.68404    11.97934   358.90290   408.46519   279.68959   487.67849
                       20   113   133.76990    19.90786    92.58736   174.95244    24.69771   242.84208
                      110   435   455.08808    16.96974   419.98350   490.19266   348.16254   562.01362
                      100   420   419.38606    14.27233   389.86150   448.91062   314.16040   524.61172
                       30   212   169.47192    16.96974   134.36734   204.57650    62.54638   276.39746
                       50   268   240.87596    11.97934   216.09481   265.65710   136.88151   344.87041
                       90   377   383.68404    11.97934   358.90290   408.46519   279.68959   487.67849
                      110   421   455.08808    16.96974   419.98350   490.19266   348.16254   562.01362
                       30   273   169.47192    16.96974   134.36734   204.57650    62.54638   276.39746
                       90   468   383.68404    11.97934   358.90290   408.46519   279.68959   487.67849
                       40   244   205.17394    14.27233   175.64938   234.69850    99.94828   310.39960
                       80   342   347.98202    10.36280   326.54494   369.41910   244.73335   451.23069
                       70   323   312.28000     9.76466   292.08026   332.47974   209.28112   415.27888
                     
                    Number of cases read:  25    Number of cases listed:  25
                    
                    * PRE_1  = Unstandardized Predicted Value
                    * SEP_1  = Standard Error of Predicted Value
                    * LMCI_1 = 95% L CI for y mean
                    * UMCI_1 = 95% U CI for y mean
                    * LICI_1 = 95% L CI for y individual
                    * UICI_1 = 95% U CI for y individual.
                    
                    * ----- End of SPSS example --------.
                    And now a Stata example using the same data (with only the needed output included).

                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input int(x y)
                     80 399
                     30 121
                     50 221
                     90 376
                     70 361
                     60 224
                    120 546
                     80 352
                    100 353
                     50 157
                     40 160
                     70 252
                     90 389
                     20 113
                    110 435
                    100 420
                     30 212
                     50 268
                     90 377
                    110 421
                     30 273
                     90 468
                     40 244
                     80 342
                     70 323
                    end
                    
                    regress y x
                    matrix table = r(table)
                    scalar tcrit = table[8,1]
                    display "Critical t = " tcrit
                    
                    predict yhat
                    predict stdp, stdp
                    generate lower = yhat - tcrit*stdp
                    generate upper = yhat + tcrit*stdp
                    list
                    
                         +-------------------------------------------------------+
                         |   x     y       yhat       stdp      lower      upper |
                         |-------------------------------------------------------|
                      1. |  80   399    347.982    10.3628    326.545   369.4191 |
                      2. |  30   121   169.4719   16.96974   134.3673   204.5765 |
                      3. |  50   221    240.876   11.97934   216.0948   265.6571 |
                      4. |  90   376   383.6841   11.97934   358.9029   408.4652 |
                      5. |  70   361     312.28   9.764662   292.0803   332.4797 |
                         |-------------------------------------------------------|
                      6. |  60   224    276.578    10.3628   255.1409    298.015 |
                      7. | 120   546   490.7901   19.90786   449.6075   531.9727 |
                      8. |  80   352    347.982    10.3628    326.545   369.4191 |
                      9. | 100   353    419.386   14.27233   389.8615   448.9106 |
                     10. |  50   157    240.876   11.97934   216.0948   265.6571 |
                         |-------------------------------------------------------|
                     11. |  40   160   205.1739   14.27233   175.6494   234.6985 |
                     12. |  70   252     312.28   9.764662   292.0803   332.4797 |
                     13. |  90   389   383.6841   11.97934   358.9029   408.4652 |
                     14. |  20   113   133.7699   19.90786   92.58736   174.9524 |
                     15. | 110   435   455.0881   16.96974   419.9835   490.1927 |
                         |-------------------------------------------------------|
                     16. | 100   420    419.386   14.27233   389.8615   448.9106 |
                     17. |  30   212   169.4719   16.96974   134.3673   204.5765 |
                     18. |  50   268    240.876   11.97934   216.0948   265.6571 |
                     19. |  90   377   383.6841   11.97934   358.9029   408.4652 |
                     20. | 110   421   455.0881   16.96974   419.9835   490.1927 |
                         |-------------------------------------------------------|
                     21. |  30   273   169.4719   16.96974   134.3673   204.5765 |
                     22. |  90   468   383.6841   11.97934   358.9029   408.4652 |
                     23. |  40   244   205.1739   14.27233   175.6494   234.6985 |
                     24. |  80   342    347.982    10.3628    326.545   369.4191 |
                     25. |  70   323     312.28   9.764662   292.0803   332.4797 |
                         +-------------------------------------------------------+
                    
                    margins, at(x=(30 90 40 80 70)) vsquish // Last 5 values of x
                    
                    Adjusted predictions                              Number of obs   =         25
                    Model VCE    : OLS
                    
                    Expression   : Linear prediction, predict()
                    1._at        : x               =          30
                    2._at        : x               =          90
                    3._at        : x               =          40
                    4._at        : x               =          80
                    5._at        : x               =          70
                    
                    ------------------------------------------------------------------------------
                                 |            Delta-method
                                 |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             _at |
                              1  |   169.4719   16.96974     9.99   0.000     134.3673    204.5765
                              2  |    383.684   11.97934    32.03   0.000     358.9029    408.4652
                              3  |   205.1739   14.27233    14.38   0.000     175.6494    234.6985
                              4  |    347.982    10.3628    33.58   0.000     326.5449    369.4191
                              5  |     312.28   9.764662    31.98   0.000     292.0803    332.4797
                    ------------------------------------------------------------------------------
                    
                    // margins is generating the same SE and CI as predict.
                    // This is what SPSS calls the mean CI.
                    // Keowmani and I are asking how to get the individual CI,
                    // or prediction interval.
                    These examples show that predict stdp, stdp is producing the same standard error as margins, and that this approach is not giving the prediction interval Keowmani is looking for.

                    Cheers,
                    Bruce
                    --
                    Bruce Weaver
                    Email: [email protected]
                    Version: Stata/MP 18.5 (Windows)

                    Comment


                    • #11
                      Ok. So if I tweak your last few lines, I get

                      Code:
                      . predict stdp, stdp
                      
                      . predict stdf, stdf
                      
                      . generate lowerp = yhat - tcrit*stdp
                      
                      . generate upperp = yhat + tcrit*stdp
                      
                      . generate lowerf = yhat - tcrit*stdf
                      
                      . generate upperf = yhat + tcrit*stdf
                      
                      . list
                      
                           +----------------------------------------------------------------------------------------+
                           |   x     y       yhat       stdp       stdf     lowerp     upperp     lowerf     upperf |
                           |----------------------------------------------------------------------------------------|
                        1. |  80   399    347.982    10.3628   49.91095    326.545   369.4191   244.7334   451.2307 |
                        2. |  30   121   169.4719   16.96974   51.68837   134.3673   204.5765   62.54638   276.3975 |
                        3. |  50   221    240.876   11.97934   50.27147   216.0948   265.6571   136.8815   344.8704 |
                        4. |  90   376   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
                        5. |  70   361     312.28   9.764662    49.7902   292.0803   332.4797   209.2811   415.2789 |
                           |----------------------------------------------------------------------------------------|
                        6. |  60   224    276.578    10.3628   49.91095   255.1409    298.015   173.3293   379.8267 |
                        7. | 120   546   490.7901   19.90786   52.72607   449.6075   531.9727   381.7179   599.8623 |
                        8. |  80   352    347.982    10.3628   49.91095    326.545   369.4191   244.7334   451.2307 |
                        9. | 100   353    419.386   14.27233   50.86664   389.8615   448.9106   314.1604   524.6117 |
                       10. |  50   157    240.876   11.97934   50.27147   216.0948   265.6571   136.8815   344.8704 |
                           |----------------------------------------------------------------------------------------|
                       11. |  40   160   205.1739   14.27233   50.86664   175.6494   234.6985   99.94828   310.3996 |
                       12. |  70   252     312.28   9.764662    49.7902   292.0803   332.4797   209.2811   415.2789 |
                       13. |  90   389   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
                       14. |  20   113   133.7699   19.90786   52.72607   92.58736   174.9524   24.69771   242.8421 |
                       15. | 110   435   455.0881   16.96974   51.68837   419.9835   490.1927   348.1625   562.0136 |
                           |----------------------------------------------------------------------------------------|
                       16. | 100   420    419.386   14.27233   50.86664   389.8615   448.9106   314.1604   524.6117 |
                       17. |  30   212   169.4719   16.96974   51.68837   134.3673   204.5765   62.54638   276.3975 |
                       18. |  50   268    240.876   11.97934   50.27147   216.0948   265.6571   136.8815   344.8704 |
                       19. |  90   377   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
                       20. | 110   421   455.0881   16.96974   51.68837   419.9835   490.1927   348.1625   562.0136 |
                           |----------------------------------------------------------------------------------------|
                       21. |  30   273   169.4719   16.96974   51.68837   134.3673   204.5765   62.54638   276.3975 |
                       22. |  90   468   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
                       23. |  40   244   205.1739   14.27233   50.86664   175.6494   234.6985   99.94828   310.3996 |
                       24. |  80   342    347.982    10.3628   49.91095    326.545   369.4191   244.7334   451.2307 |
                       25. |  70   323     312.28   9.764662    49.7902   292.0803   332.4797   209.2811   415.2789 |
                           +----------------------------------------------------------------------------------------+
                      
                      .
                      Which is the same output that SPSS produced (with stdf, the standard error of the forecast, included). So does this solve the problem? If you wanted it for x = some specific value, you could do something like replace x = 50 before computing stdp and stdf. If you wanted to plug in different values of x, you could do something cutesie like

                      replace x = 50 in 1
                      replace x = 60 in 2
                      replace x = 70 in 3
                      .
                      .
                      .
                      list in 1/3

                      margins doesn't seem to want to make it easy to get exactly what you want but it seems like predict commands can do it for you.
                      -------------------------------------------
                      Richard Williams, Notre Dame Dept of Sociology
                      Stata Version: 17.0 MP (2 processor)

                      EMAIL: [email protected]
                      WWW: https://www3.nd.edu/~rwilliam

                      Comment


                      • #12
                        To explicitly show what I just said about plugging in different x values,

                        Code:
                        . replace x = 30 in 1
                        (1 real change made)
                        
                        . replace x = 90 in 2
                        (1 real change made)
                        
                        . replace x = 40 in 3
                        (1 real change made)
                        
                        . replace x = 80 in 4
                        (1 real change made)
                        
                        . replace x = 70 in 5
                        (0 real changes made)
                        
                        . predict yhat in 1/5
                        (option xb assumed; fitted values)
                        (20 missing values generated)
                        
                        . predict stdp in 1/5, stdp
                        (20 missing values generated)
                        
                        . predict stdf in 1/5, stdf
                        (20 missing values generated)
                        
                        . generate lowerp in 1/5 = yhat - tcrit*stdp
                        (20 missing values generated)
                        
                        . generate upperp in 1/5 = yhat + tcrit*stdp
                        (20 missing values generated)
                        
                        . generate lowerf in 1/5 = yhat - tcrit*stdf
                        (20 missing values generated)
                        
                        . generate upperf in 1/5 = yhat + tcrit*stdf
                        (20 missing values generated)
                        
                        . list in 1/5
                        
                             +---------------------------------------------------------------------------------------+
                             |  x     y       yhat       stdp       stdf     lowerp     upperp     lowerf     upperf |
                             |---------------------------------------------------------------------------------------|
                          1. | 30   399   169.4719   16.96974   51.68837   134.3673   204.5765   62.54638   276.3975 |
                          2. | 90   121   383.6841   11.97934   50.27147   358.9029   408.4652   279.6896   487.6785 |
                          3. | 40   221   205.1739   14.27233   50.86664   175.6494   234.6985   99.94828   310.3996 |
                          4. | 80   376    347.982    10.3628   49.91095    326.545   369.4191   244.7334   451.2307 |
                          5. | 70   361     312.28   9.764662    49.7902   292.0803   332.4797   209.2811   415.2789 |
                             +---------------------------------------------------------------------------------------+
                        So again, not as easy as it might be, but is that what is wanted?
                        -------------------------------------------
                        Richard Williams, Notre Dame Dept of Sociology
                        Stata Version: 17.0 MP (2 processor)

                        EMAIL: [email protected]
                        WWW: https://www3.nd.edu/~rwilliam

                        Comment


                        • #13
                          (Responding to #11 & #12)

                          Hi Richard. Yes, using stdf (standard error of the forecast) cracks it, I think. I didn't know about stdf--and by the way, I can't find it when I type help predict and then do a CTL-F search for it in the viewer. (Is this a version issue, perhaps? I'm using v13.1 for Windows.) I can find it in the PDF documentation, though. Thanks for pointing it out and providing the example.

                          Agreed that it's not quite as easy as it could be. Perhaps it will become an option for margins someday. ;-)

                          Cheers,
                          Bruce
                          --
                          Bruce Weaver
                          Email: [email protected]
                          Version: Stata/MP 18.5 (Windows)

                          Comment


                          • #14
                            Or, to make it a little tidier,

                            Code:
                            * Example generated by -dataex-. To install: ssc install dataex
                            clear
                            input int(x y)
                             80 399
                             30 121
                             50 221
                             90 376
                             70 361
                             60 224
                            120 546
                             80 352
                            100 353
                             50 157
                             40 160
                             70 252
                             90 389
                             20 113
                            110 435
                            100 420
                             30 212
                             50 268
                             90 377
                            110 421
                             30 273
                             90 468
                             40 244
                             80 342
                             70 323
                            end
                            
                            clonevar x2 = x
                            regress y x2
                            matrix table = r(table)
                            scalar tcrit = table[8,1]
                            display "Critical t = " tcrit
                            local n = 0
                            
                            
                            foreach j in 30 90 40 80 70 {
                                local n = `n' + 1
                                quietly replace x2 = `j' in `n' 
                            }
                            quietly predict yhat in 1/`n'
                            quietly predict stdp in 1/`n', stdp
                            quietly generate lowerp in 1/`n' = yhat - tcrit*stdp
                            quietly generate upperp in 1/`n' = yhat + tcrit*stdp
                            
                            quietly predict stdf in 1/`n', stdf
                            quietly generate lowerf in 1/`n' = yhat - tcrit*stdf
                            quietly generate upperf in 1/`n' = yhat + tcrit*stdf
                            
                            list in 1/`n'
                            
                            
                                 +--------------------------------------------------------------------------------------------+
                                 |  x     y   x2       yhat       stdp     lowerp     upperp       stdf     lowerf     upperf |
                                 |--------------------------------------------------------------------------------------------|
                              1. | 80   399   30   169.4719   16.96974   134.3673   204.5765   51.68837   62.54638   276.3975 |
                              2. | 30   121   90   383.6841   11.97934   358.9029   408.4652   50.27147   279.6896   487.6785 |
                              3. | 50   221   40   205.1739   14.27233   175.6494   234.6985   50.86664   99.94828   310.3996 |
                              4. | 90   376   80    347.982    10.3628    326.545   369.4191   49.91095   244.7334   451.2307 |
                              5. | 70   361   70     312.28   9.764662   292.0803   332.4797    49.7902   209.2811   415.2789 |
                                 +--------------------------------------------------------------------------------------------+
                            If this is something you were doing a lot you could probably create a do or ado file where you passed values to it.
                            -------------------------------------------
                            Richard Williams, Notre Dame Dept of Sociology
                            Stata Version: 17.0 MP (2 processor)

                            EMAIL: [email protected]
                            WWW: https://www3.nd.edu/~rwilliam

                            Comment


                            • #15
                              help predict won't show it. That just gives you generic options. You need

                              help regress_postestimation

                              and then click on predict. That will show you the predict options that apply specically to regress.

                              Incidentally, Bruce, you need to yell at whoever buys your software and tell them you can't be expected to work under such barbaric conditions. The help in 14.2 is much better than 13.1, especially for things like margins.
                              -------------------------------------------
                              Richard Williams, Notre Dame Dept of Sociology
                              Stata Version: 17.0 MP (2 processor)

                              EMAIL: [email protected]
                              WWW: https://www3.nd.edu/~rwilliam

                              Comment

                              Working...
                              X