Thanks to Kit Baum, a new version of rangestat (with Nick Cox and Roberto Ferrer) is now available on SSC. Stata 11 is required.
To update to the new version, type in Stata's Command window
Code:
adoupdate rangestat
For a first install, type
Code:
ssc install rangestat
Once installed/updated, type
Code:
help rangestat
to get more information.
rangestat calculates statistics for each observation using all observations that are in range of the current observation, as determined by a numeric key variable and the low and high bounds defined for the current observation. With rangestat, you can do calculations using a degenerate interval (where low == high), a rolling window, a recursive window (where the first period is fixed), a reversed recursive window (where the last period is fixed), or observation-specific windows, each independently specified using interval bound variables.
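As a rough sketch of these window types, the following uses the Grunfeld panel that can be loaded with webuse. The variables first and last are made up for illustration, and the calls assume the interval(keyvar low high) syntax from the help file, with bounds given either as numeric offsets from the key variable or as variables.

```stata
webuse grunfeld, clear

* rolling window: the current year and the 4 years before it, within company
rangestat (mean) invest, interval(year -4 0) by(company)

* recursive window: from each company's first year up to the current year
bysort company: egen first = min(year)
rangestat (sum) invest, interval(year first year) by(company)

* reversed recursive window: from the current year to the company's last year
bysort company: egen last = max(year)
rangestat (max) invest, interval(year year last) by(company)

* degenerate interval (low == high): all firms observed in the same year
rangestat (count) invest, interval(year 0 0)
```

Each call should add a result variable named after the source variable and the statistic (invest_mean, invest_sum, and so on), which is why a different statistic is used in each call here.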
This version fixes a problem in the previous version when an interval bound was computed by adding an offset to the value of the key variable and the computed bound could not be stored in a variable of the key variable's data type. This was most likely to bite when the key variable was a byte. The overflow would be treated as a missing bound and affected observations were excluded from the sample. Many thanks to Clyde Schechter for bringing this to our attention.
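To see the kind of overflow involved, here is a small illustration using only built-in commands. It is not rangestat's internal code, just a sketch of what happens when a computed bound does not fit the storage type; the variable names are made up.

```stata
clear
set obs 1
generate byte key = 100              // the largest value a byte can hold
generate byte high_byte = key + 5    // 105 does not fit: stored as missing
generate double high_dbl = key + 5   // a double holds 105 without trouble
list
```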
Missing interval bounds are now allowed and handled using the rules that Stata uses for its inrange() function: if the lower bound is missing, observations will match up to and including the value of the upper bound; if the upper bound is missing, observations will match from the value of the lower bound on up. If both low and high bounds are missing, all observations will match.
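The inrange() rules can be checked directly from the Command window. The values below are arbitrary and only meant to show how a missing bound behaves:

```stata
display inrange(5, ., 10)    // 1: lower bound missing, 5 <= 10
display inrange(15, ., 10)   // 0: 15 is above the upper bound
display inrange(5, 3, .)     // 1: upper bound missing, 5 >= 3
display inrange(5, ., .)     // 1: both bounds missing
display inrange(., ., .)     // 0: a missing value is never in range
```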
Also new, you can skip calculating statistics for any observation by giving it invalid bounds (low > high). The observation stays in the sample, so it can still be selected when calculating statistics for other observations.
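A minimal sketch of this, again with the Grunfeld data: results are requested only for 1950 onwards, but the earlier years remain available to the 3-year windows of later observations. The bound variables low and high are hypothetical names chosen for this example.

```stata
webuse grunfeld, clear

* a 3-year window ending at the current year
generate low  = year - 2
generate high = year

* invalid bounds (low > high) for the observations to skip
replace low  = 1 if year < 1950
replace high = 0 if year < 1950

rangestat (mean) invest, interval(year low high) by(company)

* the 1950 mean still draws on 1948 and 1949, even though
* no results were calculated for those two years
list company year invest invest_mean if company == 1 & year >= 1948
```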
There are new built-in statistics:
- skewness
- kurtosis
- correlation, first and second variables
- covariance, first and second variables
- ordinary least squares regression with a constant
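Something along these lines should exercise the new statistics. This is a sketch that assumes the statistic names skewness, kurtosis, corr, cov and reg, and that for reg the dependent variable comes first, as with regress. The help file is the authority on the exact names of the result variables, so describe is used at the end to show what was created.

```stata
webuse grunfeld, clear

* skewness and kurtosis of invest over a 10-year rolling window
rangestat (skewness) invest (kurtosis) invest, interval(year -9 0) by(company)

* correlation and covariance between the first and second variables
rangestat (corr) invest mvalue, interval(year -9 0) by(company)
rangestat (cov) invest mvalue, interval(year -9 0) by(company)

* OLS with a constant: invest regressed on mvalue and kstock
rangestat (reg) invest mvalue kstock, interval(year -9 0) by(company)

describe, simple
```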
The help file features a new way to present examples that can be easily tried via a click-to-run link. Each example is presented in a code block and is run as a do-file. The code that runs is exactly what is shown in the help file. Long lines are split using the /// line continuation indicator.
Finally, rangestat now scans for cases where multiple observations use the same interval bounds (they each get the same results since the observations in range are the same) and will only calculate results for one observation and carry over results to repeat observations with the same bounds. Naturally, this optimization cannot be done if rangestat is to exclude the value from the current observation (the excludeself option is specified).
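To illustrate with the Grunfeld data (a sketch, not a benchmark): without by(), all firms observed in the same year share the same bounds under interval(year 0 0), so they all receive the same result. With excludeself, each observation needs its own calculation.

```stata
webuse grunfeld, clear

* every observation in a given year has the same bounds,
* so all firms in that year get the same mean
rangestat (mean) invest, interval(year 0 0)

* excludeself: each firm's total leaves out its own value, so results
* differ across observations even though the bounds are the same
rangestat (sum) invest, interval(year 0 0) excludeself

list company year invest invest_mean invest_sum if year == 1935
```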