Dear community, I am currently dealing with a somewhat problematic dataset (at least for a beginner like me).
The dataset contains many dichotomous variables. Here are two examples.

Example 1: The question in the questionnaire was "Which rooms do you use?", and this is what I see in the dataset:

V1: kitchen (1: quoted, 0: not quoted)
V2: WC (1: quoted, 0: not quoted)
V3: living room (1: quoted, 0: not quoted)
V4: corridor (1: quoted, 0: not quoted)

Now I want to combine these four variables (V1-V4) into one categorical variable (let's call it V5, "used rooms") with the values 1 = kitchen, 2 = WC, 3 = living room and 4 = corridor, so that every "1" in one of the original variables becomes a category of the new variable V5.
So my first question is: which command can I use to merge these variables in the way described above?

Example 2: The questionnaire asked about income in categories: "What is your daily wage?"

V6: more than 150 (1: quoted, 0: not quoted)
V7: 101-150 (1: quoted, 0: not quoted)
V8: 51-100 (1: quoted, 0: not quoted)
V9: 1-50 (1: quoted, 0: not quoted)
V10: no payment (1: quoted, 0: not quoted)

I need to combine these five variables (V6-V10) into one variable (V11, "Daily wage") in the same way as in the example above. The problem is that some respondents selected more than one category for this question.

So my second question is: is there a command to tell Stata to keep the uppermost category a person selected (the first one in the list above) and discard all the others, on the theoretical assumption that this one is closest to the true value? For example, if someone chose both "more than 150" and "101-150", we would assume that "more than 150" is closer to the true value.

Does anyone have a solution to this dilemma, or can anyone help in some other way? I would be glad of any suggestions. Thank you!
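
To make the two recoding rules concrete, here is a minimal Stata sketch of one possible approach. It is not the only way to do this. It assumes that V1-V4 and V6-V10 are numeric 0/1 indicators as described in the post. The helper names (n_rooms, n_wage, rooms_lbl, wage_lbl) and the numeric codes 1-5 for V11 are made up for illustration. Note that "Which rooms do you use?" allows several answers, so a single variable V5 is only well defined for respondents who quoted exactly one room; the sketch leaves V5 missing for everyone else.

```stata
* Sketch only: assumes V1-V4 and V6-V10 are numeric 0/1 indicators.
* n_rooms, n_wage, rooms_lbl and wage_lbl are hypothetical helper names.

* Example 1: one room code per respondent.
* Count how many rooms each respondent quoted (missing values count as 0).
egen byte n_rooms = rowtotal(V1 V2 V3 V4)

* With exactly one indicator equal to 1, the weighted sum equals its code.
* V5 stays missing if no room or more than one room was quoted.
generate byte V5 = 1*V1 + 2*V2 + 3*V3 + 4*V4 if n_rooms == 1
label define rooms_lbl 1 "kitchen" 2 "WC" 3 "living room" 4 "corridor"
label values V5 rooms_lbl
label variable V5 "used rooms"

* Example 2: keep the highest wage band that was quoted.
* Later replace lines overwrite earlier ones, so the highest band wins.
generate byte V11 = .
replace V11 = 1 if V10 == 1
replace V11 = 2 if V9 == 1
replace V11 = 3 if V8 == 1
replace V11 = 4 if V7 == 1
replace V11 = 5 if V6 == 1
label define wage_lbl 1 "no payment" 2 "1-50" 3 "51-100" 4 "101-150" 5 "more than 150"
label values V11 wage_lbl
label variable V11 "Daily wage"

* Optional check: how many respondents quoted more than one wage band?
egen byte n_wage = rowtotal(V6 V7 V8 V9 V10)
tabulate n_wage
```

Running tabulate V5, missing and tabulate V11, missing afterwards shows how many respondents ended up without a category. The n_wage table shows how often the "keep the highest band" rule actually had to resolve a multiple answer.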