  • Out of Sample/Scope Prediction with CI

    I need help getting Examples 2 and 3 to work
    (page 7 of "regress postestimation — Postestimation tools for regress", https://www.stata.com/manuals/rregre...estimation.pdf).

    After running the regression and ANOVA, I need to add a few new values of the independent variable and then obtain the predicted value and its prediction interval for each.

    I used the following program to insert an observation into the dataset and prompt the user for an out-of-sample prediction:
    Code:
    capture program drop OutOfSamplePredict
    
    
    program OutOfSamplePredict
    
    * ask the user for the variables and the out-of-sample value
        display "This will create an out-of-sample prediction..."
            display ""
        display "Please enter your dependent variable (the Y axis):" _request(dependant)
            display "Thank you, to verify:"
            display "The dependent variable selected (the Y axis) is: $dependant"
        display "Please enter the independent variable used for your regression (the X axis):" _request(INDependant_var)
            display ""
            display "Thank you, to verify:"
            display "The independent variable you entered is: $INDependant_var"
            display ""
        display "Please enter the value of the independent variable at which you want an out-of-sample prediction:" _request(INDependant_value)
            display ""
        display "Thank you, to verify:"
        display "The value you want a prediction for is: $INDependant_value"
        
    *run the regression Quietly
        quietly regress $dependant $INDependant_var
        
    * create a new observation
        quietly insobs 1
        
    * put the user's out-of-sample value into the independent variable (last observation)
        replace $INDependant_var = $INDependant_value in l
    
    * store the predicted value in a local macro
    
        local mypred = _b[_cons] + _b[$INDependant_var]*$INDependant_value
    
        
        display "your linear prediction equation : " _b[_cons] " + " _b[$INDependant_var] " * " $INDependant_value 
            display ""
        display "prediction value is = `mypred'"
            display ""
        display "writing your values to the dataset..."
            
            
        predict predict, xb
        predict residuals, residuals
        predict CooksDistance, cooksd
        predict StandardResiduals, rstandard
        predict Leverage, leverage
        
    * insert the prediction value based on the regression equation
        replace predict = `mypred' in l
    
        
        
    * generate leverage for the predicted value
    /*
        predict temp_leverage in l, leverage      /* because predict can only make new variables create a temp variable  */
        replace Leverage = temp_leverage in l     /* replace the only created value into the replacement variable */
        drop temp_leverage                        /* drop the temporary variable */
    */
    
    * generate Cooks Distance for the predicted value
    /*
        predict temp_cooks in l, cooksd      /* because predict can only make new variables create a temp variable  */
        replace CooksDistance = temp_cooks in l     /* replace the only created value into the replacement variable */
        drop temp_cooks                        /* drop the temporary variable */
    */    
    * generate Standard Error for the predicted value
    
        predict SE_Forecast in l, stdf      
        predict SE_Predict in l, stdp
        predict SE_Residual in l, stdr
        
    * generate the Confidence interval for the out of scope prediction
        local 2SD = SE_Forecast[_N] * 2
        local UCL = predict[_N] + `2SD'
        local LCL = predict[_N] - `2SD'
        
        display "The Upper bound for your Confidence Interval is: `UCL'" 
            display ""
        display "The LOWER bound for your Confidence Interval is: `LCL'" 
    
    
        
    
    
    
    end


    I followed the examples, but I got a different answer from our instructor, who used R. This is their code and output:

    > new.amount <- data.frame(pectin=c(1.5))
    > predict(model,newdata=new.amount,interval="prediction")
          fit      lwr      upr
    1 62.5725 53.19595 71.94905


    Dataset

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float firmness byte pectin
     46.9 0
     50.2 0
     51.3 0
    56.48 1
    59.34 1
    62.97 1
    67.91 2
    70.78 2
    73.67 2
    68.13 3
    70.85 3
    72.34 3
    end

  • #2
    Never mind, I found and adapted a solution:

    Code:
    capture program drop OutOfSamplePredict
    
    
    program OutOfSamplePredict
    
    * ask the user for the variables and the out-of-sample value
        display "This will create an out-of-sample prediction..."
            display ""
        display "Please enter your dependent variable (the Y axis):" _request(dependant)
            display "Thank you, to verify:"
            display "The dependent variable selected (the Y axis) is: $dependant"
        display "Please enter the independent variable used for your regression (the X axis):" _request(INDependant_var)
            display ""
            display "Thank you, to verify:"
            display "The independent variable you entered is: $INDependant_var"
            display ""
        display "Please enter the value of the independent variable at which you want an out-of-sample prediction:" _request(INDependant_value)
            display ""
        display "Thank you, to verify:"
        display "The value you want a prediction for is: $INDependant_value"
        
    *run the regression 
         regress $dependant $INDependant_var
        
    * create a new observation
        quietly insobs 1
        
    * put the user's out-of-sample value into the independent variable (last observation)
        replace $INDependant_var = $INDependant_value in l
    
    * store the predicted value in a local macro
    
        local myY_Hat = _b[_cons] + _b[$INDependant_var]*$INDependant_value
    
        
        display "your linear prediction equation : " _b[_cons] " + " _b[$INDependant_var] " * " $INDependant_value 
            display ""
        display "prediction value is = `myY_Hat'"
            display ""
        display "writing your values to the dataset..."
            
            
        predict Y_hat, xb
        predict residuals, residuals
        predict CooksDistance, cooksd
        predict StandardResiduals, rstandard
        predict Leverage, leverage
        
    * insert the prediction value based on the regression equation
        replace Y_hat = `myY_Hat' in l
    
        
        
    * generate leverage for the predicted value
    /*
        predict temp_leverage in l, leverage      /* because predict can only make NEW variables create a temp variable  */
        replace Leverage = temp_leverage in l     /* replace the only created value into the variable where you want predict to ADD an entry */
        drop temp_leverage                        /* drop the temporary variable */
    */
    
    
    
        
    * generate the Confidence Interval and Prediction Interval for the out of scope prediction
        *help from https://www.ics.uci.edu/~jutts/110-201-09/StataForCIandPI.doc
        
        * generate Standard Errors for the predicted value
    
            predict PI_SE_Forecast in l, stdf      
            predict CI_SE_Predict in l, stdp
            predict SE_Residual in l, stdr
        *Generate the t-multiplier for a 95% interval and the appropriate degrees of freedom from the regression. 
            local tmult=invttail(e(df_r),.025)
        *Generate the lower and upper end points for each interval
            generate lowerCI = Y_hat - `tmult'*CI_SE_Predict
            generate upperCI = Y_hat + `tmult'*CI_SE_Predict
            generate lowerPI = Y_hat - `tmult'*PI_SE_Forecast
            generate upperPI = Y_hat + `tmult'*PI_SE_Forecast
    
        
        
        display "The Prediction Interval for the point you entered is:" 
            list Y_hat lowerPI upperPI in l, clean
            display "This is the range within which a single new observation at this value is expected to fall 95% of the time."
            display ""
            display ""
        display "The Confidence Interval for the point you entered is: " 
            list Y_hat lowerCI upperCI in l, clean
            display "This is the range expected to contain the mean response at this value with 95% confidence."
            display ""
    
    
        
    
    
    
    end
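
    For anyone comparing the two programs: the first one used a fixed multiplier of 2 on the forecast standard error, while the second uses the t multiplier invttail(e(df_r),.025), which is about 2.228 with the 10 residual degrees of freedom here. That difference alone would make the first interval narrower than R's. Below is a minimal non-interactive sketch of the same calculation for pectin = 1.5 on the posted data. The names yhat, se_f, tmult, fit, lwr and upr are my own choices for illustration, and pectin is read in as float here so it can hold 1.5 without a type change. If I have this right, the three values displayed should agree with the R output quoted in the first post.

    ```stata
    * Sketch only: 95% prediction interval at pectin = 1.5
    clear
    input float firmness float pectin
     46.9 0
     50.2 0
     51.3 0
    56.48 1
    59.34 1
    62.97 1
    67.91 2
    70.78 2
    73.67 2
    68.13 3
    70.85 3
    72.34 3
    end

    regress firmness pectin

    * append one empty observation and set the predictor value of interest
    insobs 1
    replace pectin = 1.5 in l

    * xb and stdf can be computed out of sample, so the new row is filled in
    predict double yhat, xb
    predict double se_f, stdf

    * t multiplier with the residual degrees of freedom (not a fixed 2)
    scalar tmult = invttail(e(df_r), 0.025)

    * prediction interval for the appended observation (the last row)
    scalar fit = yhat[_N]
    scalar lwr = yhat[_N] - tmult*se_f[_N]
    scalar upr = yhat[_N] + tmult*se_f[_N]

    display "fit = " fit
    display "lwr = " lwr
    display "upr = " upr
    ```

    Swapping stdf for stdp in the second predict line would give the confidence interval for the mean response instead of the prediction interval for a single new observation.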
