How to match three correlated variables into one index

Gatelik Tony

Join Date: Jun 2018

Posts: 8
#1


20 Jun 2018, 11:40

Dear all,

I am new to Stata and would like help combining two correlated variables into a single index/dummy variable, which should also be adjusted for either weights or clusters (rural/urban) to account for differences across households. My data are repeated cross-sections (stratified random sample). However, the two variables are measured differently: time taken (t1 & t2) in minutes and mode of travel (m1 & m2), i.e. by car/bus or on foot. Both measure the distance from point A to point B. I created dummies for all variables, but from there I am stuck on how to derive the main index/dummy adjusted for weights or clusters.

My procedure for arriving at these dummies (walk bus max10min max30min max1hr above2hr):

foreach v of varlist m1 m2{
labrec `v'(1=1 )(2/10=0)
ta `v', g(`v'a)
drop `v'
}

foreach x of varlist t1 t2{
labrec `x'(1=1 )(2=2 ) (3=3 ) (4/10=4 )
ta `x', g(`x'd)
drop `x'
}
egen bus =anymatch(m1a1 m2a1) if m1a1==1 & m2a1==1, v(1) //for those who reported to have used at least a car/bicycle
egen walk =anymatch( m1a2 m2a2) if m1a2==1 & m2a2==1, v(1) //for those who reported walking

egen max10min =anymatch(t1d1 t2d1) if t1d1==1 & t2d1==1, v(1) //took max 10min
egen max30min =anymatch(t1d2 t2d2) if t1d2==1 & t2d2==1, v(1) //took max 30min
egen max1hr =anymatch(t1d3 t2d3) if t1d3==1 & t2d3==1, v(1) //took max 1 hr
egen above2hr =anymatch(t1d4 t2d4) if t1d4==1 & t2d4==1, v(1) //took about or above 2hrs

drop m1a1-t2d4

* Example generated by -dataex-. To install: ssc install dataex
clear

input double(uniqkey hhid) byte(bus walk max10min max30min max1hr above2hr)
34 11 0 1 1 0 0 0
35 87 0 1 0 0 0 0
96 18 0 0 0 0 0 0
98 5 0 0 1 0 0 0
84 11 1 0 0 0 0 1
85 6 1 0 0 0 0 1
115 29 0 0 0 0 0 1
116 20 1 0 0 0 0 1
117 24 1 0 0 0 1 0
798 45 0 1 0 1 0 0
343 43 1 0 0 0 0 1
344 47 1 0 0 0 0 1
end

Regards,

Gatelik
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

21 Jun 2018, 11:17

You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex. You'll also do better if you cut your post down to the minimum needed to demonstrate your problem. Also, your example doesn't run - there is no m1 or m2 variable in the data you provided.

Instead of labrec, you might just use recode. I don't understand your egen. It looks redundant. Why not just do a generate with the needed if conditions? Indeed, you could save the first loop and just do the right conditions based on the original variables.
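
A minimal sketch of that suggestion, assuming the original m1, m2, t1 and t2 are still in memory and coded as the recodes in #1 imply (mode: 1 = walking, 2-10 = some vehicle; time: 1, 2, 3 and 4-10 for the four bands). The toy values below are made up purely for illustration and are not from the poster's data.

```stata
* Toy data (hypothetical values, only to make the sketch runnable)
clear
input byte(m1 m2 t1 t2)
1 1 1 1
2 5 2 2
1 3 3 3
4 2 6 4
1 1 4 9
end

* Each dummy is 1 when BOTH original variables satisfy the condition, else 0.
* No recode, tabulate or egen step is needed.
generate byte walk     = m1 == 1 & m2 == 1
generate byte bus      = inrange(m1, 2, 10) & inrange(m2, 2, 10)
generate byte max10min = t1 == 1 & t2 == 1
generate byte max30min = t1 == 2 & t2 == 2
generate byte max1hr   = t1 == 3 & t2 == 3
generate byte above2hr = inrange(t1, 4, 10) & inrange(t2, 4, 10)

list, noobs
```

With this approach a missing value in either variable gives 0 rather than missing, which seems to match what the posted data show. If missing should stay missing, adding something like `if !missing(m1, m2)` to the relevant line would be one way to do it.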

Gatelik Tony

Join Date: Jun 2018
Posts: 8

23 Jun 2018, 05:28

Thank you for your response, Professor Phil Bromiley. Sorry for the lengthy wording.

I want to use the indicator as an instrument (measuring access to the facility) and am wondering how to create it given the way the data are coded. var1, which I am calling M1, is a dummy (1 = closest facility, 0 = used at least a car/bicycle), and T1 is the band of time taken to reach the facility, i.e. (1 "under 10min") (2 "11min-1hr") (3 "2hrs or more").

My questions are:
- Should I combine these two (i.e. var1 and T1) into a single indicator, and how should I do that given how they are coded?

- I tried interacting the two and using the interaction as an instrument alongside the individual variables, but in the first stage the interaction and var1 are insignificant, and the interaction term has a different sign, i.e. positive, while the two individual variables both have negative signs (I have no idea how to make sense of this).

ivreg2 depvar (treatment = var1 T1 var1#c.T1) [pw=weights], cluster(village) first


Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(uniqkey hhid T1) float var1
22431078  22 1 0
22431089  70 1 1
22431113 107 1 1
22431143  30 1 0
22431147  77 1 0
22431299  58 1 0
22431328  98 1 0
22431331  14 2 0
22431393   1 1 1
22431448  17 1 1
22431485   1 1 1
22431559  54 1 1
22431563  31 1 0
22431603  42 1 1
22431608  77 1 1
22431774  54 1 1
22431781  36 1 0
22431805  38 1 1
22431843  39 1 1
22431928  53 1 1
22432027  41 1 0
22432110  86 1 1
22432176   6 1 1
22432235  42 1 1
22432270  69 2 0
22432280  10 1 0
22432391 113 1 1
22432408  64 1 0
22432452  46 1 1
22432471  24 2 1
22432482  35 3 0
22432501  17 1 0
22432528  75 1 0
22432595  25 1 0
22432602  12 1 1
22432635  19 1 0
22432645  61 1 1
22432646  98 1 1
22432648  32 1 1
22432678   1 1 0
22432686   6 1 0
22432699  20 1 0
22432936  23 1 0
22433083  19 1 1
22433149  61 1 0
22433170  32 1 1
22433191  77 2 1
22433212  49 2 1
22433249  28 1 1
22433312  63 1 1
22433422  37 1 1
22433429   3 1 1
22433565  32 1 1
22433570  39 1 1
22433627  32 1 0
22433635  37 1 1
22433713  71 2 1
22433848  62 1 1
22433892  17 2 0
22433933  57 1 1
22433943  38 1 0
22433957   6 1 1
22434043  48 1 1
22434064  54 1 1
22434327  90 1 0
22434344  61 3 0
22434464  38 1 1
22434465  50 1 0
22434476  83 1 0
22434518  94 1 0
22434685  31 1 1
22434711 110 1 1
22434949  78 1 1
22434996  71 1 0
22435013  11 1 0
22435112  75 1 1
22435238  34 . 1
22435414   2 2 1
22435513  50 1 1
22435558  30 1 1
22435636  10 1 1
22435662   1 1 0
22435771  12 1 0
22435820  35 1 0
22435821   4 1 0
22435824  71 1 0
22435892   4 2 0
22436067  46 2 0
22436157  24 1 1
22436395   8 1 1
22436441  32 1 0
end


I have no idea how to use a condition with "if".

Thank you in advance

Regards,

Gatelik
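
For reference, a condition with `if` could be sketched roughly as follows. It uses a few rows copied from the dataex listing above. The variable names `near_fast` and `access` are hypothetical, and treating "closest facility and under 10 minutes" as the category of interest is only an assumption for illustration, not a recommendation about which instrument to use.

```stata
* A few rows copied from the -dataex- listing above
clear
input double(uniqkey hhid T1) float var1
22431078  22 1 0
22431089  70 1 1
22431331  14 2 0
22432471  24 2 1
22432482  35 3 0
22435238  34 . 1
end

* -if- restricts a command to the observations where the condition is true.
* Start at 0 wherever both variables are observed (missing stays missing) ...
generate byte near_fast = 0 if !missing(var1, T1)
* ... then switch to 1 only where both conditions hold.
replace near_fast = 1 if var1 == 1 & T1 == 1

* Alternative: one categorical variable with a level per (var1, T1) combination
egen access = group(var1 T1), label

list uniqkey var1 T1 near_fast access, noobs
```

Whether a single combined indicator is preferable to entering var1 and T1 separately is a modelling question that the code itself cannot settle.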
