Keeping unique combinations of two variables, irrespective of order

Jacob Gosselin

Join Date: Dec 2021

Posts: 5
#1

Keeping unique combinations of two variables, irrespective of order

10 Jan 2022, 08:50

Hi there!

I'm working with the Census county adjacency file (https://www.census.gov/geographies/r...adjacency.html) and want to keep only unique pairs; i.e. if county 1 and 2 are neighbors of each other, there are two observations that reflect that in the dataset ( (1, 2) and (2, 1) ) and I'd only like to keep one of those two observations. I've been trying to figure out how to do this with unique/distinct/tag(), to no avail.

For a simpler example, here's a short script that draws up a parallel dataset. If you look at the observations, you'll see half of them are "mirrored"; i.e. (B,A) = (1, 10) in row 1 and (10, 1) in row 10. I'd like to drop all "mirrored" observations.

Code:

clear
set obs 10
gen A = .
gen B = .
foreach i of numlist 1/10 {
    replace A = `i' if _n == `i'
    replace B = 11 - `i' if _n == `i'
}
sort B A

Thanks!
Jake
Nick Cox

Join Date: Mar 2014

Posts: 35699
#2

10 Jan 2022, 09:06

https://www.stata-journal.com/articl...article=dm0043
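
For readers who cannot follow the link, here is a minimal sketch of one common way to handle pairs where order does not matter. It is not quoted from the linked article. It assumes A and B are numeric, as in the toy dataset in #1, and the variable names first and second are made up for illustration. The idea is to store each pair in sorted order, so that (1, 10) and (10, 1) become the same pair, and then drop the duplicates.

```stata
* Assumption: A and B are numeric, as in the toy dataset in #1.
* first and second are illustrative names, not from the original post.

* Smaller member of each pair
gen first = min(A, B)

* Larger member of each pair
gen second = max(A, B)

* Keep one observation per unordered pair;
* force is required because a varlist is specified
duplicates drop first second, force
```

On the toy dataset this should leave 5 observations, one per unordered pair. If the identifiers are strings (for example, county codes read in as text), min() and max() will not work; something like cond(A < B, A, B) and cond(A < B, B, A) would be the analogous construction.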
1 like
Jacob Gosselin

Join Date: Dec 2021

Posts: 5
#3

10 Jan 2022, 09:15

Thank you for posting the journal article; that's perfect. (Apologies for not finding it earlier in my preliminary googling!)

Last edited by Jacob Gosselin; 10 Jan 2022, 09:18.
Nick Cox

Join Date: Mar 2014

Posts: 35699
#4

10 Jan 2022, 09:21

Keywords can be too general or too specific. I was looking for stuff on a certain topic today and got a quarter of a million hits.
1 like
