  • qqplotg now available from SSC

    Thanks to Kit Baum for his ever efficient support, qqplotg is now available from SSC. Stata 8.2 is needed.

    This is a variant on the official command qqplot and so needs a little explanation.

    I seem to have been writing small or even large variants on qqplot since at least 1998. Some of those variants may have been posted on the old Statalist, but if so they are now hard to find and retrieve.

    For the record:

    qqplot2 on SSC is a 1998 beast from me for Stata 5. It is not of current interest unless it is called in a do-file or command you use, which is unlikely but not impossible, so we will leave it in peace on SSC.

    qqplot3 is a variant on SSC by Ariel Linden whose leading feature is support for weights, which is fine.

    This one isn't called qqplot4 because it doesn't in any sense supersede qqplot3, but it does incorporate the ideas of supporting comparison of groups and of being able to generate new variables. So "g" seems to fit generalized, group, and generate, which is good enough for me.

    Recently I mentioned the idea of plotting difference versus mean, for quantiles, https://www.statalist.org/forums/for...v-smirnov-test,

    and indeed that thread reminded me of my earlier work. So I blew the dust off an unpublished 2011 program, which lay behind my support for Ariel, and added a help file.

    Here are some example applications.

    Code:
    sysuse auto, clear
    
    qqplotg mpg, group(foreign) title(miles per gallon) name(QQ1, replace)
    qqplotg mpg, group(foreign) flip title(miles per gallon) name(QQ2,replace)
    qqplotg mpg, group(foreign) flip diffvsmean title(miles per gallon) name(QQ3, replace)
    
    gen recmpg = 100/mpg
    qqplotg recmpg, group(foreign) diffvsmean title(gallons per 100 miles) name(QQ4, replace)
    gen lnmpg = ln(mpg)
    qqplotg recmpg, group(foreign) diffvsmean title(ln miles per gallon) name(QQ5, replace)
    QQ1 is a basic plot comparing domestic and foreign cars. The reference line is the line of equality. The points do not follow a line parallel to it, which is the pattern assumed by (notably) a t test, which focuses on the difference between means.
    [Figure: QQ1.png]

    Naturally, domestic and foreign cars aren't paired, but what you see are corresponding quantiles calculated by interpolation within the larger set. Think of the trickery as a simple extension of plotting minimum vs minimum, maximum vs maximum, median vs median, and so forth.
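
    A rough way to see the idea using official commands only: the sketch below tabulates a few matching quantiles of mpg for each group, which is the pairing of like with like described above. It is an illustration, not a statement of how qqplotg computes or interpolates its quantiles.

    ```stata
    sysuse auto, clear

    * minimum, quartiles and maximum of mpg, separately for domestic and foreign cars
    tabstat mpg, by(foreign) statistics(min p25 p50 p75 max) nototal
    ```

    Each column of that table gives one pair of points (domestic, foreign) that a quantile-quantile plot would show.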


    QQ2 uses the flip option to swap the categories. So, you don't need to recode a group variable if the comparison strikes you as the wrong way round.
    [Figure: QQ2.png]


    QQ3 plots differences versus means for paired quantiles. The combination of shift and tilt is even clearer.
    [Figure: QQ3.png]


    So, we should be willing to look at the data on a transformed scale -- which doesn't mean (ultimately) transforming the variable. It could mean using (e.g.) generalized linear models with a suitable link function.
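
    A minimal sketch of that last point, using official glm. The default Gaussian family and the use of foreign as a 0/1 predictor are assumptions made for illustration only; the point is that the response stays as mpg and only the link changes.

    ```stata
    sysuse auto, clear

    * mean of mpg modelled on a log scale; mpg itself is not transformed
    glm mpg foreign, link(log)

    * same idea with a reciprocal link, written as a power link with power -1
    glm mpg foreign, link(power -1)
    ```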

    Reciprocals are a natural candidate on dimensional grounds. QQ4 shows the result. Logarithms may appeal otherwise and are strongly competitive. QQ5 shows that result.

    (Otherwise put: as the range of the outcome here is only about 3-fold, that's not enough for reciprocal and logarithm to behave very differently.)
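
    The range can be checked directly with official commands. The variable names recip_mpg and log_mpg below are made up for this sketch, so that nothing clashes with recmpg and lnmpg from the examples above.

    ```stata
    sysuse auto, clear

    * ratio of largest to smallest mpg
    summarize mpg, meanonly
    display r(max) / r(min)

    * how closely the two transformed scales track each other
    generate double recip_mpg = 100 / mpg
    generate double log_mpg = ln(mpg)
    correlate recip_mpg log_mpg
    ```
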
    [Figure: QQ4.png]

    [Figure: QQ5.png]

    Last edited by Nick Cox; 12 Dec 2022, 10:04.

  • #2
    It is not surprising that QQ4 and QQ5 look similar because they are identical: the command for QQ5 in #1 plotted recmpg, not lnmpg.

    Here is the mistake fixed. I reversed the y axis scale because logarithm is an increasing function while reciprocal is a decreasing function for comparable positive arguments.

    Code:
    * lnmpg already exists if you ran the code in #1, so drop it first
    capture drop lnmpg
    gen lnmpg = ln(mpg)
    qqplotg lnmpg, group(foreign) diffvsmean title(ln miles per gallon) ysc(reverse) name(QQ5, replace)
    [Figure: QQ5.png]



    • #3
      Kit Baum has kindly uploaded a revised help file fixing the problem flagged in #2. If you installed qqplotg, please replace using


      Code:
      ssc install qqplotg, replace



      • #4
        Thanks as always to Kit Baum, qqplotg has been revised on SSC.

        The main improvements are

        * scope for transformation on the fly

        * support for plots of difference versus (cumulative) probability

        * smoothing of difference versus probability and difference versus mean plots

        * support for a by() option.

        qqplot has a long history as an official command going back to 1984. At the same time, its functionality hasn't changed much for a long time. The help for qqplot (official command) explains how to set up comparison between two unpaired groups, for example, but as explained in #1 qqplotg (this command) supports that directly. This is in my experience the most common application.
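
        As a sketch of the difference, here are the two routes side by side for the auto data. The variable names mpg_domestic and mpg_foreign are made up for illustration; the qqplotg call uses only the group() option shown in #1.

        ```stata
        sysuse auto, clear

        * official qqplot wants two variables, so split mpg by group first
        generate mpg_domestic = mpg if foreign == 0
        generate mpg_foreign = mpg if foreign == 1
        qqplot mpg_domestic mpg_foreign

        * qqplotg takes the group variable directly
        qqplotg mpg, group(foreign)
        ```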

        The examples include some based on a classic dataset on ozone concentrations. Here are some details and the data for anyone wanting to play.

        Code:
        1.  maximum daily ozone concentrations May to September 1974
        3.  Chambers, J.M., Cleveland, W.S., Kleiner, B. and Tukey, P.A. 1983. Graphical methods for data analysis.
            Belmont, CA: Wadsworth, p.346
        4.  their sources: Stamford, CT Dept of Environmental Protection; Boyce Thompson Institute
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input int(stamford yonkers) byte(month day) str2 dayofwk
         66  47 5  1 "W" 
         52  37 5  2 "Th"
          .  27 5  3 "F" 
          .  37 5  4 "Sa"
          .  38 5  5 "Su"
          .   . 5  6 "M" 
         49  45 5  7 "Tu"
         64  52 5  8 "W" 
         68  51 5  9 "Th"
         26  22 5 10 "F" 
         86  27 5 11 "Sa"
         52  25 5 12 "Su"
         43   . 5 13 "M" 
         75  55 5 14 "Tu"
         87  72 5 15 "W" 
        188 132 5 16 "Th"
        118   . 5 17 "F" 
        103 106 5 18 "Sa"
         82  42 5 19 "Su"
         71  45 5 20 "M" 
        103  80 5 21 "Tu"
        240 107 5 22 "W" 
         31  21 5 23 "Th"
         40  50 5 24 "F" 
         47  31 5 25 "Sa"
         51  37 5 26 "Su"
         31  19 5 27 "M" 
         47  33 5 28 "Tu"
         14  22 5 29 "W" 
          .  67 5 30 "Th"
         71  45 5 31 "F" 
         61  36 6  1 "Sa"
         47  24 6  2 "Su"
          .  52 6  3 "M" 
        196  88 6  4 "Tu"
        131 111 6  5 "W" 
        173 117 6  6 "Th"
         37  31 6  7 "F" 
         47  37 6  8 "Sa"
        215  93 6  9 "Su"
        230 106 6 10 "M" 
          .  49 6 11 "Tu"
         69  64 6 12 "W" 
         98  83 6 13 "Th"
        125  97 6 14 "F" 
         94  79 6 15 "Sa"
         72  36 6 16 "Su"
         72  51 6 17 "M" 
        125  75 6 18 "Tu"
        143 104 6 19 "W" 
        192 107 6 20 "Th"
          .  56 6 21 "F" 
        122  68 6 22 "Sa"
         32  19 6 23 "Su"
        114  67 6 24 "M" 
         32  20 6 25 "Tu"
         23  35 6 26 "W" 
         71  30 6 27 "Th"
         38  31 6 28 "F" 
        136  81 6 29 "Sa"
        169 119 6 30 "Su"
        152  76 7  1 "M" 
        201 108 7  2 "Tu"
        134  85 7  3 "W" 
        206  96 7  4 "Th"
         92  48 7  5 "F" 
        101  60 7  6 "Sa"
        119  54 7  7 "Su"
        124  71 7  8 "M" 
        133   . 7  9 "Tu"
         83  50 7 10 "W" 
          .  27 7 11 "Th"
         60  37 7 12 "F" 
        124  47 7 13 "Sa"
        142  71 7 14 "Su"
        124  46 7 15 "M" 
         64  41 7 16 "Tu"
         75  49 7 17 "W" 
        103  59 7 18 "Th"
          .  53 7 19 "F" 
         46  25 7 20 "Sa"
         68  45 7 21 "Su"
          .  78 7 22 "M" 
         87  40 7 23 "Tu"
         27  13 7 24 "W" 
          .  25 7 25 "Th"
         73  46 7 26 "F" 
         59  62 7 27 "Sa"
        119  80 7 28 "Su"
         64  39 7 29 "M" 
          .  70 7 30 "Tu"
        111  74 7 31 "W" 
         80  66 8  1 "Th"
         68  82 8  2 "F" 
         24  47 8  3 "Sa"
         24  28 8  4 "Su"
         82  44 8  5 "M" 
        100  55 8  6 "Tu"
         55  34 8  7 "W" 
         91  60 8  8 "Th"
         87  70 8  9 "F" 
         64  41 8 10 "Sa"
          .  67 8 11 "Su"
          . 127 8 12 "M" 
        170  96 8 13 "Tu"
          .  56 8 14 "W" 
         86  54 8 15 "Th"
        202 100 8 16 "F" 
         71  44 8 17 "Sa"
         85  44 8 18 "Su"
        122  75 8 19 "M" 
        155  86 8 20 "Tu"
         80  70 8 21 "W" 
         71  53 8 22 "Th"
         28  36 8 23 "F" 
        212 117 8 24 "Sa"
         80  43 8 25 "Su"
         24  27 8 26 "M" 
         80  77 8 27 "Tu"
        169  75 8 28 "W" 
        174  87 8 29 "Th"
        141  47 8 30 "F" 
        202 114 8 31 "Sa"
        113  66 9  1 "Su"
         38  18 9  2 "M" 
         38  25 9  3 "Tu"
         28  14 9  4 "W" 
         52  27 9  5 "Th"
         14   9 9  6 "F" 
         38  16 9  7 "Sa"
         94  67 9  8 "Su"
         89  74 9  9 "M" 
         99  74 9 10 "Tu"
        150  75 9 11 "W" 
        146  74 9 12 "Th"
        113  42 9 13 "F" 
         38   . 9 14 "Sa"
         66  38 9 15 "Su"
         38  23 9 16 "M" 
         80  50 9 17 "Tu"
         80  34 9 18 "W" 
         99  58 9 19 "Th"
         71  35 9 20 "F" 
         42  24 9 21 "Sa"
         52  27 9 22 "Su"
         33  17 9 23 "M" 
         38  21 9 24 "Tu"
         24  14 9 25 "W" 
         61  32 9 26 "Th"
        108  51 9 27 "F" 
         38  15 9 28 "Sa"
         28  21 9 29 "Su"
          .  18 9 30 "M" 
        end
