Thanks for Kit Baum, a new program called rangejoin is now available from SSC. Stata 11 is required.
rangejoin forms pairwise combinations between observations in memory and observations in a using dataset when the value of a key variable in the using dataset is within the range specified by observations in the data in memory.
To install, type in Stata's command window:
Once installed, type
to get more information.
rangejoin leverages the power of the recently announced rangestat (from SSC, with Roberto Ferrer and Nick Cox) to implement what amounts to a joinby over a range of values. The observations in memory define the low and high bounds of the interval to use and the values of a key variable in the using dataset determines which observations are to be paired.
Without rangejoin, you would have to follow these steps to achieve the same results:
rangejoin can be used with a stored copy of the data in memory to form the same pairwise combinations that rangestat makes to calculates its statistics. This could be useful if you need to impose an additional condition that can't be specified with rangestat.
You can also do things that you could not do with rangestat, that is you can form pairwise combinations over completely different datasets. Suppose that you have data on medical events:
and you have in memory data on prescription fills:
You can match each prescription with medical events that occurred during the fill period:
and the results:
You can reorder the observations and show, for each medical event, which prescriptions were being taken at the time of the event:
rangejoin forms pairwise combinations between observations in memory and observations in a using dataset when the value of a key variable in the using dataset is within the range specified by observations in the data in memory.
To install, type in Stata's command window:
Code:
ssc install rangejoin
Code:
help rangejoin
rangejoin leverages the power of the recently announced rangestat (from SSC, with Roberto Ferrer and Nick Cox) to implement what amounts to a joinby over a range of values. The observations in memory define the low and high bounds of the interval to use and the values of a key variable in the using dataset determines which observations are to be paired.
Without rangejoin, you would have to follow these steps to achieve the same results:
- rename variables in memory or in the using dataset to avoid name conflicts, and
- use cross or joinby to form all pairwise combinations, and
- use keep if inrange(keyvar, low, high) to reduce the data to pairwise combinations that are within the desired range.
rangejoin can be used with a stored copy of the data in memory to form the same pairwise combinations that rangestat makes to calculates its statistics. This could be useful if you need to impose an additional condition that can't be specified with rangestat.
You can also do things that you could not do with rangestat, that is you can form pairwise combinations over completely different datasets. Suppose that you have data on medical events:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float(patient event_id event_type) str10 event_date float eventday 1 1 345 "01/02/2015" 20090 1 2 543 "01/02/2015" 20090 1 3 676 "01/20/2015" 20108 2 1 567 "01/10/2015" 20098 2 2 999 "01/20/2015" 20108 end format %td eventday save "events.dta", replace
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float(patient Rx_id) str10 fill_date float(supply drug_id fillday) 1 1 "01/01/2015" 30 88 20089 1 2 "01/01/2015" 90 99 20089 1 3 "01/15/2015" 15 77 20103 1 4 "01/17/2015" 21 33 20105 2 1 "01/01/2015" 15 22 20089 2 2 "01/07/2015" 30 44 20095 end format %td fillday
Code:
* the day of the last pill gen lastday = fillday + supply - 1 format %td lastday rangejoin eventday fillday lastday using "events.dta", by(patient) list patient Rx_id drug_id fillday lastday event_date event_type , sepby(patient Rx_id)
Code:
. list patient Rx_id drug_id fillday lastday event_date event_type , sepby(patient Rx_id) +---------------------------------------------------------------------------+ | patient Rx_id drug_id fillday lastday event_date event~pe | |---------------------------------------------------------------------------| 1. | 1 1 88 01jan2015 30jan2015 01/02/2015 345 | 2. | 1 1 88 01jan2015 30jan2015 01/02/2015 543 | 3. | 1 1 88 01jan2015 30jan2015 01/20/2015 676 | |---------------------------------------------------------------------------| 4. | 1 2 99 01jan2015 31mar2015 01/02/2015 345 | 5. | 1 2 99 01jan2015 31mar2015 01/02/2015 543 | 6. | 1 2 99 01jan2015 31mar2015 01/20/2015 676 | |---------------------------------------------------------------------------| 7. | 1 3 77 15jan2015 29jan2015 01/20/2015 676 | |---------------------------------------------------------------------------| 8. | 1 4 33 17jan2015 06feb2015 01/20/2015 676 | |---------------------------------------------------------------------------| 9. | 2 1 22 01jan2015 15jan2015 01/10/2015 567 | |---------------------------------------------------------------------------| 10. | 2 2 44 07jan2015 05feb2015 01/10/2015 567 | 11. | 2 2 44 07jan2015 05feb2015 01/20/2015 999 | +---------------------------------------------------------------------------+
Code:
. sort patient event_id Rx_id . list patient event_id eventday event_type Rx_id drug_id fillday lastday, sepby(patient event_id) +-------------------------------------------------------------------------------------+ | patient event_id eventday event~pe Rx_id drug_id fillday lastday | |-------------------------------------------------------------------------------------| 1. | 1 1 02jan2015 345 1 88 01jan2015 30jan2015 | 2. | 1 1 02jan2015 345 2 99 01jan2015 31mar2015 | |-------------------------------------------------------------------------------------| 3. | 1 2 02jan2015 543 1 88 01jan2015 30jan2015 | 4. | 1 2 02jan2015 543 2 99 01jan2015 31mar2015 | |-------------------------------------------------------------------------------------| 5. | 1 3 20jan2015 676 1 88 01jan2015 30jan2015 | 6. | 1 3 20jan2015 676 2 99 01jan2015 31mar2015 | 7. | 1 3 20jan2015 676 3 77 15jan2015 29jan2015 | 8. | 1 3 20jan2015 676 4 33 17jan2015 06feb2015 | |-------------------------------------------------------------------------------------| 9. | 2 1 10jan2015 567 1 22 01jan2015 15jan2015 | 10. | 2 1 10jan2015 567 2 44 07jan2015 05feb2015 | |-------------------------------------------------------------------------------------| 11. | 2 2 20jan2015 999 2 44 07jan2015 05feb2015 | +-------------------------------------------------------------------------------------+
Comment