Create Group ID Based on Multiple Variables Identifying the IDs

Colton Tousey

Join Date: Aug 2016
Posts: 11

05 Apr 2022, 08:37

I am trying to create a group ID from the results of psmatch with 3 nearest-neighbor matches.

Simplified version of my data:

Code:

clear
input float(id freq n1 n2 n3 treated)
1 1 2 4 5 1
2 2 . . . 0
2 2 . . . 0
3 1 6 2 7 1
4 1 . . . 0
5 1 . . . 0
6 1 . . . 0
7 1 . . . 0
end

The freq variable exists because we are currently matching with replacement, so I have used expand to create duplicates of the control observations that are used more than once. n1, n2, and n3 hold the ids of the control observations that were matched to the treated id.

I am stuck on how to group these observations together. Ideally, I am looking for a result like this:

Code:

     +----------------------------------------------+
     | id   freq   n1   n2   n3   treated   groupid |
     |----------------------------------------------|
  1. |  1      1    2    4    5         1         1 |
  2. |  2      2    .    .    .         0         1 |
  3. |  2      2    .    .    .         0         2 |
  4. |  3      1    6    2    7         1         2 |
  5. |  4      1    .    .    .         0         1 |
     |----------------------------------------------|
  6. |  5      1    .    .    .         0         1 |
  7. |  6      1    .    .    .         0         2 |
  8. |  7      1    .    .    .         0         2 |
     +----------------------------------------------+

If it makes the solution any easier, we are also considering running this match without replacement, so we wouldn't have duplicated ids used.


Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#2

05 Apr 2022, 09:31

As it turns out, -expand-ing the observations the way you did gets in the way of the approach that seems most natural to me. So the code undoes this along the way, with the extra observations ultimately created by -merge-.

Code:

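// build a crosswalk from each id (treated or matched control) to its group
preserve
keep if treated
keep id n1 n2 n3
gen long group = _n
// treat the treated id as "match 0" so it is reshaped along with n1-n3
rename id n0
reshape long n, i(group)
drop _j
rename n id
tempfile groups
save `groups'
restore

// undo the -expand-; -merge- recreates the extra observations
duplicates drop
merge 1:m id using `groups', nogenerate
sort group id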
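
If you do end up matching without replacement, I would expect the same idea to carry over with two changes: there is nothing to undo, so -duplicates drop- is not needed, and the merge becomes 1:1. A minimal, untested sketch, assuming the same variable names as in your example (id, n1, n2, n3, treated) and that no id appears in more than one matched set. The -drop if missing(n)- and -isid- lines are my additions to guard those assumptions; they are not part of the code above.

```stata
preserve
keep if treated
keep id n1 n2 n3
gen long group = _n
rename id n0
reshape long n, i(group)
drop _j
// assumption: a treated observation may have fewer than 3 matches,
// so remove any empty match slots before they become ids
drop if missing(n)
rename n id
// stops with an error if any id belongs to more than one group,
// i.e., if the match was not really without replacement
isid id
tempfile groups
save `groups'
restore

// id is unique on both sides, so a 1:1 merge is enough
merge 1:1 id using `groups', nogenerate
sort group id
```

Control observations that were never used as a match are kept by -merge- and simply end up with group missing.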
Colton Tousey

Join Date: Aug 2016

Posts: 11
#3

05 Apr 2022, 14:14

Thank you, Clyde! Such a relatively simple solution to something I had been racking my brain over for hours. This made our lives much easier!