  • -fractileplot- revised on SSC

    Thanks as always to Kit Baum, the package fractileplot has been revised on SSC. The package contains a single new command of the same name, together with the previous version of that command.

    fractileplot was first (indeed last) announced on Statalist in 2006. https://www.stata.com/statalist/arch.../msg00395.html

    The term fractile was introduced into the English statistical literature by Anders Hald in 1952 and was a translation of the Danish word fraktil. Some readers here may be able to cast light on the history of that word in Danish both generally and in its statistical sense.

    fractile corresponds to quantile, which in my reading is much more common in statistical literature. The two terms now have the same threefold meaning, referring to (1) particular summary measures based on ordered values, (2) the bins that such measures delimit, and (3) all the values in a batch, when ordered by magnitude. Both have occasionally been confused with the corresponding cumulative probabilities.

    The existence of a different name is convenient if only to distinguish this command from various commands naming quantiles directly or indirectly.

    The essence of fractileplot is smoothing an outcome with respect to various predictors treated simultaneously, each measured on the scale of its distribution function (in essence, its ranks rescaled to the unit interval). Its intent is, and is only, as a descriptive or exploratory tool to expose dependence structure or its lack, as a prelude usually to more formal modelling.

    This command has not been heavily used, or perhaps even used by anyone apart from myself. Even so, the original version of fractileplot is also included in the package as fractileplot1, an inversion of the common practice whereby (e.g.) foo2 supersedes foo. This is just in case anyone wants to reproduce results obtained with the original command, or exceptionally someone is using Stata 8 or 9 and so cannot use the revised fractileplot, which requires at least Stata 10.

    The main change in the command is that smoothing is done with lpoly, which was added to Stata in Stata 10. That command is much more versatile than lowess, which was the previous workhorse, at least by default.

    The help file has been rewritten. I document a kinship with a particular kind of plot used by A.R. Wallace (Darwin's contemporary) in 1889, R.H. Lock in 1906 and R.A. Fisher (no less) in all editions of his Statistical Methods for Research Workers from 1925.

    Here is a token example to give flavour. The example underlines that (with a little caution as always) using binary predictors as well as counted or measured predictors has its own logic. Here, and generally, a binary predictor has mean ranks that always differ by 0.5 when mapped to the unit interval. Smoothing w.r.t. cumulative probability does not stand in the way of later modelling in terms of the original scale of a predictor or some transformed scale.
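
    The 0.5 gap for a binary predictor can be checked with a short derivation. This is only a sketch, assuming that tied values receive mean ranks and that ranks are rescaled as (rank - c)/n for some constant c (for example c = 0.5). The symbols n_0, n_1 and c are introduced here for illustration and are not taken from the help file. With n_0 zeros and n_1 ones among n = n_0 + n_1 observations, the zeros share ranks 1 to n_0 and the ones share ranks n_0 + 1 to n, so

    \[
    \bar r_0 = \frac{n_0 + 1}{2}, \qquad
    \bar r_1 = n_0 + \frac{n_1 + 1}{2}, \qquad
    \bar r_1 - \bar r_0 = \frac{n_0 + n_1}{2} = \frac{n}{2}
    \]

    \[
    \frac{\bar r_1 - c}{n} - \frac{\bar r_0 - c}{n} = \frac{1}{2}
    \]

    In the auto data, foreign has 52 zeros and 22 ones (n = 74), so the mean ranks are 26.5 and 63.5, which differ by 37 = 74/2. A rescaling that divided by n + 1 instead would give a gap of n/(2(n + 1)), slightly under 0.5.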

    Much more is said in the help.

    Code:
    sysuse auto, clear
    fractileplot mpg weight foreign

    [Image: fractileplot.png]