Thanks as usual to Kit Baum, a new package designplot is now available from SSC. Stata 8.2 is required.
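For anyone who has not installed from SSC before, the usual route is sketched below. These are the standard official commands, nothing specific to designplot.

```stata
* install the package from SSC, then open its help file
ssc install designplot
help designplot
```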
The name of the program may mean little or nothing to people. What's a design plot? The problem bites backwards more than forwards. Sometimes simple plots don't really need names in your papers and presentations: you just write or say "plotting something versus something else". It's almost an accident if a plot has a standard name (histogram, scatter plot, box plot) and often standard names are less than standard (what's a dot plot, and do you call it something else?). But a programmer writing a program must give it a name, and I chose designplot because "design plot" is a name in the literature. However, the design plots in the literature don't bear very much resemblance to the results of designplot.
But let's curtail that dogged discussion (there's more in the help for those so inclined).
Here's an example straight away:
Code:
sysuse auto
set scheme s1color
designplot mpg foreign rep78
The main ideas are:
1. You name a response and at least one predictor.
2. The graph shows results from summarize for the response given the distinct levels of the predictors and their cross-combinations.
3. The default is just the mean, but one or more results can be shown.
4. If you name (say) two predictors, you get the zero-way breakdown (no breakdown at all), both one-way breakdowns for each predictor and the two-way breakdown for both predictors combined. (You are asked to swallow the non-standard term "zero-way" as a modest extension of standard terminology.)
5. You can get less than in #4 by restricting the graph, e.g., to just the one-way breakdowns, or to at most the one-way breakdowns.
6. graph dot is used by default, but you can invoke graph hbar (which often works well) or graph bar (which less often works well).
7. You can save the results graphed as a new dataset. This may help in tabulation or in preparing a new graph.
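To make #4 concrete, here is a rough sketch of the same breakdowns of means done as tables with official commands, using the auto data from the example above. It is not how designplot works internally, just a way to see which numbers belong to which breakdown. Note that rep78 has missing values, which tabulate omits here.

```stata
sysuse auto, clear

* zero-way: no breakdown at all
summarize mpg, meanonly
display "overall mean of mpg = " %5.2f r(mean)

* one-way: each predictor separately
tabulate foreign, summarize(mpg) means
tabulate rep78, summarize(mpg) means

* two-way: both predictors combined
tabulate foreign rep78, summarize(mpg) means
```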
This works somewhat like the existing (and apparently rather neglected) grmeanby command and also a lot like graph dot used directly. But there are different twists. Otherwise the command would be pointless.
#7 is new relative to either. So is the scope for multiscale breakdowns. grmeanby is restricted to means or medians (although any competent user-programmer could clone it quickly to do otherwise).
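For comparison, the two older routes look roughly like this with the same variables. Both are official commands. The particular option choices are only for illustration.

```stata
sysuse auto, clear

* grmeanby: means of mpg for each level of each predictor, taken one at a time
grmeanby foreign rep78, summarize(mpg)

* the same with medians instead of means
grmeanby foreign rep78, summarize(mpg) median

* graph dot used directly: means for the cross-combinations only
graph dot (mean) mpg, over(rep78) over(foreign)
```

Neither call puts the zero-way, one-way and two-way results on a single graph, which is the gap designplot is aimed at.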
Here is another simple example. We will look at means and medians, sort within groups on means, add variable labels and restrict scope to zero- and one-way breakdowns.
Code:
designplot mpg foreign rep78, stat(median mean) variablelabels maxway(1) entryopts(sort(2) descending)
I would want to use the Graph Editor to tweak that, notably to tweak "Repair Record 1978" to two lines to take up less space, but that's always the sort of detail you want to improve.
Here is a variant on a common problem often tackled with tables. People are often interested in seeing various univariate breakdowns of frequencies for categorical variables. (To get percents, save the results as a dataset, do a simple calculation and call up graph again.)
Code:
designplot mpg foreign rep78 if !missing(foreign,rep78), stat(count) recast(hbar) blabel(total) yla(none) t1title("frequencies") variablelabels ytitle("") ysc(r(0 72))
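On the parenthetical about percents: one way to sketch the "save, calculate, graph again" route is with contract, an official command that reduces the data to frequencies and can add percents. This is a stand-in that does not use designplot's own facility for saving results, and it handles just one variable. The names freq and pct are arbitrary.

```stata
sysuse auto, clear
preserve

* one observation per level of rep78, with its frequency and percent
contract rep78, freq(freq) percent(pct) nomiss

* plot the percents as they are, labelling each bar with its value
graph hbar (asis) pct, over(rep78) blabel(bar, format(%3.1f)) ytitle("percent")

restore
```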
One more example, assuming you're still reading. Looking at (one version of) the Titanic data, the focus is on variation in the fraction survived in response to age, sex, class and their interactions. The code is in the help file.
This kind of graph can be useful for description or exploration and perhaps even give you ideas about whether your models need interaction terms.