  • Testing linearity with a plot of standardized residuals

    Hi forum!
    I have a question for you. I tried to assess the linearity assumption of my multiple linear regression model by checking the "structure" of the standardized residuals against the values of my predictors, but I'm not sure this is the best way to do it. I attached an example of what I've done.
    Code:
    . regress consumilog disoccup inattivitàlog dem_impreselog retrib_medialog componenti_f
    > amsqr
    
          Source |       SS       df       MS              Number of obs =     107
    -------------+------------------------------           F(  5,   101) =  161.28
           Model |  5.44754193     5  1.08950839           Prob > F      =  0.0000
        Residual |  .682297905   101  .006755425           R-squared     =  0.8887
    -------------+------------------------------           Adj R-squared =  0.8832
           Total |  6.12983984   106  .057828678           Root MSE      =  .08219
    
    -----------------------------------------------------------------------------------
           consumilog |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
             disoccup |  -.0139411   .0028199    -4.94   0.000     -.019535   -.0083473
        inattivitàlog |  -.6843887   .0814612    -8.40   0.000    -.8459859   -.5227915
       dem_impreselog |    .015953   .0126544     1.26   0.210      -.00915     .041056
      retrib_medialog |   .3489728   .1838609     1.90   0.061    -.0157578    .7137034
    componenti_famsqr |   .0385899   .0133636     2.89   0.005     .0120802    .0650997
                _cons |   6.341054   1.937408     3.27   0.001     2.497758    10.18435
    -----------------------------------------------------------------------------------
    Code:
    predict consumires, rstandard
    Code:
     scatter consumires disoccup
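
    A minimal sketch of how the same check could be made easier to read, assuming the regression above is still the active estimation result. The lowess smooth and the zero reference line are additions for illustration, not part of the original post; a smooth that bends away from zero would hint at nonlinearity in that predictor.
    Code:
    * standardized residuals against one predictor, with a lowess smooth
    * and a horizontal reference line at zero
    twoway (scatter consumires disoccup) (lowess consumires disoccup), yline(0) legend(off)

    * official residual-versus-predictor plot; note that it uses raw,
    * not standardized, residuals
    rvpplot disoccup, yline(0)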

  • #2
    Note that this plot is (almost) wired in as rvfplot -- and it is available through rvfplot2, as from

    SJ-10-1 gr0009_1 . . . . Software update for model diagnostic graph commands
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
    (help anovaplot, indexplot, modeldiag, ofrtplot, ovfplot,
    qfrplot, racplot, rdplot, regplot, rhetplot, rvfplot2,
    rvlrplot, rvpplot2 if installed)
    Q1/10 SJ 10(1):164
    provides new command rbinplot for plotting means or medians
    of residuals by bins; provides new options for smoothing
    using restricted cubic splines; updates anova examples

    SJ-4-4 gr0009 . . . . . . . . . . Speaking Stata: Graphing model diagnostics
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
    (help anovaplot, indexplot, modeldiag, ofrtplot, ovfplot,
    qfrplot, racplot, rdplot, regplot, rhetplot, rvfplot2,
    rvlrplot, rvpplot2 if installed)
    Q4/04 SJ 4(4):449--475
    plotting diagnostic information calculated from residuals
    and fitted values from regression models with continuous
    responses



    Yes, it is a good idea, and we need to see yours to comment further.

    Detail: I used the name rdplot in 2004 for a particular plot; the same name was later used, for a different purpose, by a different command in a very popular package. No one that I know of has yet complained about the name clash, but watch out.
    Last edited by Nick Cox; 18 Apr 2020, 02:15.



    • #3
      Gabriella:
      you can also give the issue an analytical try via -estat ovtest- and, probably more interesting, -linktest-.
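
      A short sketch of the two commands mentioned above, assuming they are run directly after the regression from the first post. -estat ovtest- is the Ramsey RESET test based on powers of the fitted values; -linktest- refits the model on the prediction (_hat) and its square (_hatsq), and a significant _hatsq suggests misspecification.
      Code:
      regress consumilog disoccup inattivitàlog dem_impreselog retrib_medialog componenti_famsqr

      * Ramsey RESET test; H0: the model has no omitted variables
      estat ovtest

      * link test; look at the p-value of _hatsq
      linktest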
      Last edited by Carlo Lazzaro; 18 Apr 2020, 02:44.
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        And what about the command acprplot (augmented component-plus-residual plot)?
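
        For reference, a minimal sketch of that command, assuming the regression from the first post is the active estimation result. The choice of disoccup as the predictor is only an example.
        Code:
        * augmented component-plus-residual plot for one predictor,
        * with a lowess smooth to help spot curvature
        acprplot disoccup, lowess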



        • #5
          Gabriella:
          personally, I find visual inspection more useful for detecting heteroskedasticity (which may well be a symptom of model misspecification).
          When it comes to model misspecification (such as investigating the existence of a squared relationship between predictor(s) and regressand), I prefer an analytical approach.
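
          A hedged sketch of both ideas, using the variables from the first post. Which predictor gets a squared term is an assumption made here for illustration; disoccup is picked only as an example.
          Code:
          * visual and analytical checks for heteroskedasticity on the original model
          regress consumilog disoccup inattivitàlog dem_impreselog retrib_medialog componenti_famsqr
          rvfplot, yline(0)
          estat hettest

          * analytical check for a squared relationship: add the squared term
          * with factor-variable notation and test its coefficient
          regress consumilog c.disoccup##c.disoccup inattivitàlog dem_impreselog retrib_medialog componenti_famsqr
          testparm c.disoccup#c.disoccup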
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Hi everyone, I hope you are all well.

            I am trying to test the OLS regression assumption of no heteroskedasticity in Stata 16.1 before running this fixed effects regression model:

            xtreg DiffMeanHourlyPercent Year2019 Year2020, fe

            Which test would be appropriate for me to use, given that my independent variables are year dummies? I have not seen anything on the internet that specifically recommends a test to detect heteroskedasticity when there are only dummy independent variables present in a fixed effects regression model.

            Kind regards,

            Uyi Erhabor

            (Stata 16.1 SE)



            • #7
              Uyi:
              some comments about your post:
              - you should -reshape- your data into -long- format, so that you have a -year- variable to use as your -timevar-;
              - under the -fe- specification, all the time-invariant predictors will be wiped out;
              - visual inspection of the epsilon residuals is recommended after (not before) -xtreg-, as no analytical test is included among Stata's built-in commands.
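
              The steps above can be sketched as follows. The data layout is an assumption: the sketch supposes a wide dataset with one row per panel unit, a hypothetical identifier employer_id, and hypothetical columns DiffMeanHourlyPercent2018, DiffMeanHourlyPercent2019 and DiffMeanHourlyPercent2020. Adjust the names to the actual data.
              Code:
              * wide to long: one row per unit and year (names are hypothetical)
              reshape long DiffMeanHourlyPercent, i(employer_id) j(year)

              * declare the panel structure, with year as the time variable
              xtset employer_id year

              * fixed-effects model with year indicators
              xtreg DiffMeanHourlyPercent i.year, fe

              * the epsilon residuals exist only after estimation
              predict e_res, e

              * with only year dummies, compare the residual spread across years
              graph box e_res, over(year)
              tabstat e_res, by(year) statistics(sd n)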


              Kind regards,
              Carlo
              (Stata 19.0)



              • #8
                Hi Carlo,

                Thank you for your reply! How come the inspection should be carried out after the xtreg command?

                Kind regards,

                Uyi Erhabor

                (Stata 16.1 SE)



                • #9
                  Hi everyone,

                  Please could someone explain whether this scatterplot exhibits heteroskedasticity? If so, could you explain why?

                  Kind regards,

                  Uyi Erhabor

                  (Stata 16.1 SE)
                  Attached Files



                  • #10
                    Uyi:
                    Code:
                    use "https://www.stata-press.com/data/r16/nlswork.dta"
                    
                    . xtreg ln_wage i.nev_mar i.collgrad, fe
                    note: 1.collgrad omitted because of collinearity
                    
                    Fixed-effects (within) regression               Number of obs     =     28,518
                    Group variable: idcode                          Number of groups  =      4,711
                    
                    R-sq:                                           Obs per group:
                         within  = 0.0263                                         min =          1
                         between = 0.0011                                         avg =        6.1
                         overall = 0.0032                                         max =         15
                    
                                                                    F(1,23806)        =     641.83
                    corr(u_i, Xb)  = -0.1414                        Prob > F          =     0.0000
                    
                    ------------------------------------------------------------------------------
                         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                       1.nev_mar |  -.1929812   .0076174   -25.33   0.000    -.2079118   -.1780507
                      1.collgrad |          0  (omitted)
                           _cons |   1.719339   .0025617   671.17   0.000     1.714318    1.724361
                    -------------+----------------------------------------------------------------
                         sigma_u |  .42825389
                         sigma_e |   .3159974
                             rho |  .64747634   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------
                    F test that all u_i=0: F(4710, 23806) = 8.76                 Prob > F = 0.0000
                    
                    . predict e_res, e
                    (16 missing values generated)
                    
                    . qnorm e_res
                    
                    .
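
                    Note that -qnorm- compares the residual quantiles with those of a normal distribution. To look at the spread of the residuals specifically, one possible continuation of the example above is sketched below. It assumes the -xtreg- results and e_res from the log are still in memory; the clustered standard errors at the end are one common response to heteroskedasticity, shown here as an option rather than as a recommendation from this post.
                    Code:
                    * compare the spread of the epsilon residuals across the two
                    * levels of the dummy predictor
                    graph box e_res, over(nev_mar)
                    tabstat e_res, by(nev_mar) statistics(sd n)

                    * optional: refit with standard errors clustered on the panel identifier
                    xtreg ln_wage i.nev_mar i.collgrad, fe vce(cluster idcode)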
                    Kind regards,
                    Carlo
                    (Stata 19.0)

