Say that I have a dataset where some observations have x==0 and the others have x==1. (If it is easier, we can consider these as two distinct datasets.) There are four binary variables, b1, b2, b3, and b4, and two continuous variables, c1 and c2. What command can I use to identify one similar x==0 observation for each x==1 observation, matching without replacement so that no x==0 observation is used more than once? I am flexible about how "similar" is defined, but it will be based on how close the six binary and continuous variables are. If necessary, I can consider only one of the two continuous variables (c1) and commit to prioritizing similarity in c1 over similarity in the binary variables.
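To make concrete what I am after, here is a rough sketch in Python rather than as a single command. It does greedy nearest-neighbor matching without replacement: each x==1 observation, taken in order, gets the closest x==0 observation that has not been used yet. The distance (number of mismatched binary variables plus the Euclidean distance on standardized c1 and c2), the `id` field, and the function names are my own assumptions for illustration, not an established method; greedy matching also depends on the order of the x==1 observations and is not guaranteed to minimize the total distance.

```python
from math import sqrt
from statistics import pstdev

BINARY = ["b1", "b2", "b3", "b4"]
CONTINUOUS = ["c1", "c2"]


def distance(a, b, scale):
    """Assumed similarity measure: binary mismatches + standardized Euclidean distance."""
    mismatches = sum(a[v] != b[v] for v in BINARY)
    cont = sum(((a[v] - b[v]) / scale[v]) ** 2 for v in CONTINUOUS)
    return mismatches + sqrt(cont)


def match_without_replacement(rows):
    """Return {id of x==1 row: id of its matched x==0 row}, using each x==0 row at most once."""
    treated = [r for r in rows if r["x"] == 1]
    available = [r for r in rows if r["x"] == 0]
    # Standardize continuous variables by their population SD (fall back to 1 if constant).
    scale = {v: pstdev([r[v] for r in rows]) or 1.0 for v in CONTINUOUS}

    matches = {}
    for t in treated:
        if not available:
            break  # more x==1 than x==0 rows: the remainder stay unmatched
        best = min(available, key=lambda c: distance(t, c, scale))
        matches[t["id"]] = best["id"]
        available.remove(best)  # "without replacement"
    return matches


# Hypothetical data, only to show the shape of the input and output.
rows = [
    {"id": 1, "x": 1, "b1": 1, "b2": 0, "b3": 1, "b4": 0, "c1": 10.0, "c2": 5.0},
    {"id": 2, "x": 1, "b1": 0, "b2": 0, "b3": 1, "b4": 1, "c1": 20.0, "c2": 7.0},
    {"id": 3, "x": 0, "b1": 1, "b2": 0, "b3": 1, "b4": 0, "c1": 11.0, "c2": 5.5},
    {"id": 4, "x": 0, "b1": 0, "b2": 0, "b3": 1, "b4": 1, "c1": 19.0, "c2": 7.5},
    {"id": 5, "x": 0, "b1": 1, "b2": 1, "b3": 0, "b4": 0, "c1": 30.0, "c2": 1.0},
]
print(match_without_replacement(rows))  # {1: 3, 2: 4}
```

To prioritize c1 over the binary variables, as in the fallback I mention, the binary mismatch count could be given a small weight relative to the c1 term, or c2 could simply be dropped from `CONTINUOUS`.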