  • QQ plot age residuals

    Hi guys,

    I'm trying to construct a QQ-plot for age and residuals using the following code:
    Code:
    gen residuals=HRQoL-HRQoL_predicted
    qqplot age residuals
    This gives me a normal-looking QQ plot for a positively skewed population, BUT there is something weird about the plot: the reference line (45 degrees normally?) is nowhere close to the QQ plot itself. It is a horizontal line that lies just above the x-axis. Does anybody know how to solve this problem?

    Thanks for the help!
    Florian

  • #2
    I really do not know what you expect from a qqplot of residuals and age. Look at the following example:

    Code:
    webuse auto
    qui reg price mpg weight
    predict r, residual
    In many cases, the residuals are approximately normally distributed with mean 0. A variable like "age" will have a non-zero mean and in most cases will not be normally distributed. The dots coincide with the reference line only when the two variables have the same distribution, and the trivial case of that is a qqplot of a variable against itself. For example:

    Code:
    qqplot r r
    [Image: qqplot1.png]


    However, all bets are off if you attempt a qqplot of the residual against a variable like "length" in the auto dataset. Also, the reference line is the line y = x, so when the two variables are on very different scales it can look almost flat. For example, if you plot the 0-mean residual on the y-axis, the reference line will look horizontal near 0 (so check which variable is on the y-axis).

    Code:
    qqplot length r
    [Image: qqplot2.png]




    Code:
    qqplot r length

    [Image: qqplot3.png]







    • #3
      That plot is for comparing the distributions of two variables. The example in the user's manual compares the distribution of weights of automobiles made in the U.S. with that of automobiles manufactured elsewhere.

      Is there a reason why you would expect age to be distributed identically with residuals of some regression model of health-related quality of life survey scores?
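
For reference, a minimal sketch of that kind of two-sample comparison using the auto dataset shipped with Stata. The variable names weightd and weightf are illustrative choices here, and the details may differ from the manual's own listing:

```stata
* Sketch: compare the weight distributions of domestic and foreign cars
sysuse auto, clear

* Split weight into two variables, one per group (names are illustrative)
generate weightd = weight if !foreign
generate weightf = weight if foreign

* Quantiles of weightd (y-axis) against quantiles of weightf (x-axis)
qqplot weightd weightf
```

Both variables measure the same quantity in the same units, so the y = x reference line is a meaningful benchmark. That is not the case for age against regression residuals.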



      • #4
        Hello,

        Thank you very much for your answers! I'm sorry if I was not clear with my question. Maybe with the plot shown (below) it makes more sense. I do not expect age to be distributed identically with residuals (I know it is skewed to the right, for example). I'm just confused that the reference line in my plot looks nothing like the one in Andrew's plots. Also, when I do the QQ plot the other way around (residuals on the x-axis and age on the y-axis), no normal plot is shown.

        [Image: Knipsel1.PNG]




        • #5
          I suspect that there is nothing wrong with the plot above. You will see this if you ask Stata to summarize the two variables:

          Code:
          sum age
          sum residuals
          The ranges of age and residuals do not overlap at any point. It appears that the mean of residuals is close to zero and that its values never stray far from zero. I can reproduce the behavior in your graph by generating a variable whose mean is close to zero and whose range is very narrow.


          Code:
          webuse auto
          qui reg price mpg weight
          gen r_2= rnormal(.01, 0.001)
          qqplot length r_2

          Code:
          . sum r_2

              Variable |  Obs       Mean   Std. Dev.        Min        Max
          -------------+---------------------------------------------------
                   r_2 |   74   .0100943    .0010237   .0077693   .0130819

          . sum length

              Variable |  Obs       Mean   Std. Dev.        Min        Max
          -------------+---------------------------------------------------
                length |   74   187.9324    22.26634        142        233
          [Image: qq_1.png]




          Now, let us generate a new variable r_3 with the same mean as r_2 but a much larger standard deviation, so that some of its values fall in the range 142 - 233:

          Code:
          gen r_3= rnormal(.01, 200)
          Code:
          . sum r_3

              Variable |  Obs        Mean   Std. Dev.         Min        Max
          -------------+-----------------------------------------------------
                   r_3 |   74   -7.938616    201.4918   -547.6382   463.7131

          . count if r_3> 142
            18
          Code:
          qqplot length r_3
          [Image: qq_2.png]
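
The range comparison used in this post can also be scripted. The following is only a sketch, using the residuals from the auto regression as a stand-in for the poster's data. The local macro names len_min and len_max are illustrative:

```stata
* Sketch: compare the ranges of two variables before reading a qqplot
sysuse auto, clear
quietly regress price mpg weight
predict double r, residuals

* Store the range of length from the results left behind by summarize
quietly summarize length
local len_min = r(min)
local len_max = r(max)

* Show both ranges side by side
quietly summarize r
display "length: `len_min' to `len_max'"
display "r:      " r(min) " to " r(max)

* Number of residuals that fall inside the range of length
count if inrange(r, `len_min', `len_max')
```

If the count is zero, the two sets of quantiles never share a value, and the points cannot touch the y = x reference line.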

          Last edited by Andrew Musau; 12 Dec 2015, 08:01.



          • #6
            Originally posted by Florian Maissan
            Also when i do the QQ plot the other way around (residuals on x axis and age on y axis) no normal plot is shown.
            No normal plot will be shown regardless of what you do with qqplot. Are you looking for qnorm?
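
A minimal sketch of the difference, again using the auto dataset and a residual variable named r as in the earlier posts (stand-ins, not the original poster's data):

```stata
* Sketch: qnorm versus qqplot on regression residuals
sysuse auto, clear
quietly regress price mpg weight
predict double r, residuals

* qnorm: quantiles of r against the quantiles of a normal distribution
qnorm r

* qqplot: quantiles of one variable against the quantiles of another
qqplot length r
```

With qnorm, points close to the reference line suggest approximately normal residuals. With qqplot, the line only says whether the two variables share a distribution.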

