Hello everyone! For my thesis I need to match observations on three things: an index variable that measures home conditions, personal variables such as age, gender and education, and year. The home index is numerical (from 0 to 103), and the personal characteristics are either dummies or categorical variables. For my analysis I need to pair each observation with the most similar one based on these variables. It is a sort of nearest-neighbor match, but without having a control or treatment group.
The dataset looks something like this.
indice_hogar anio mes directorio orden mujer nivel__educativo_cat trabaja
0 2018 08 4700731 1 1 4 1
0 2018 08 4700731 2 0 5 1
0 2018 11 4777752 1 0 5 1
37 2018 04 4605803 1 0 3 1
42 2011 07 2735691 1 1 4 1
42 2018 02 4545459 1 0 3 1
43 2018 12 4803694 1 0 5 1
44 2018 10 4747974 1 0 5 1
46 2018 05 4610096 1 0 3 1
47 2018 04 4598828 1 1 1 0
47 2018 08 4687722 1 0 1 0
48 2018 04 4592941 1 0 5 0
48 2018 06 4636177 1 0 3 1
50 2018 06 4645892 1 0 1 1
50 2018 06 4645892 2 1 4 1
To clarify, my instrument (IV) is the ability of the most similar person according to the index and the personal characteristics. This means that for each person (say, person A) I need to find the most similar observation and then take that match's ability and use it in a regression. If anyone knows how to do this, it would help a lot.
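To make the request concrete, here is a minimal sketch in Python of one way such a within-sample match could work. It is an illustration under assumptions, not a settled method: it uses a Gower-style distance (scaled absolute gap for indice_hogar, 0/1 mismatch for the categorical variables), it excludes only the observation itself, and the habilidad column with its values is a made-up placeholder, since the ability variable is not in the listing above.

```python
# Sketch only: nearest-neighbour matching inside one sample (no treatment/control split).
# `habilidad` and its values are hypothetical placeholders, not part of the posted data.

INDEX_RANGE = 103.0  # indice_hogar runs from 0 to 103 according to the post
CATEGORICAL = ("anio", "mujer", "nivel__educativo_cat", "trabaja")


def gower_distance(a, b):
    """Average dissimilarity: scaled gap for the index, 0/1 mismatch per categorical."""
    parts = [abs(a["indice_hogar"] - b["indice_hogar"]) / INDEX_RANGE]
    parts += [0.0 if a[k] == b[k] else 1.0 for k in CATEGORICAL]
    return sum(parts) / len(parts)


def nearest_match(rows):
    """For each row, return the position of its closest *other* row (ties: first found)."""
    matches = []
    for i, a in enumerate(rows):
        best_j, best_d = None, float("inf")
        for j, b in enumerate(rows):
            if i == j:
                continue  # an observation may not be matched to itself
            d = gower_distance(a, b)
            if d < best_d:
                best_j, best_d = j, d
        matches.append(best_j)
    return matches


# Four rows taken from the listing above, plus the invented ability values.
rows = [
    {"indice_hogar": 42, "anio": 2018, "mujer": 0, "nivel__educativo_cat": 3, "trabaja": 1, "habilidad": 0.7},
    {"indice_hogar": 46, "anio": 2018, "mujer": 0, "nivel__educativo_cat": 3, "trabaja": 1, "habilidad": 0.4},
    {"indice_hogar": 47, "anio": 2018, "mujer": 1, "nivel__educativo_cat": 1, "trabaja": 0, "habilidad": 0.9},
    {"indice_hogar": 48, "anio": 2018, "mujer": 0, "nivel__educativo_cat": 3, "trabaja": 1, "habilidad": 0.5},
]

matches = nearest_match(rows)
for row, j in zip(rows, matches):
    # Copy the matched person's ability so it can enter the regression as the instrument.
    row["habilidad_match"] = rows[j]["habilidad"]
```

The loop compares every pair, so it is quadratic in the number of observations. On a full survey file it would probably be necessary to compare only within cells (for example, the same year), and one may also want to rule out matches from the same household (directorio).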
