Dear Statalist,
I am looking for a way to merge two value labels into one.
I have two variables, var1 and var2, with corresponding value labels (lab1 and lab2).
- var1 ranges from 1 to 100, and lab1 has a distinct label for each of the 100 values ("a" for 1, "b" for 2, ..., "zzz" for 100).
- var2 ranges from 101 to 200, and lab2 has a distinct label for each of the 100 values ("aa" for 101, "bb" for 102, ..., "zzzz" for 200).
- For certain values of var1, var2 is non-missing. Otherwise, var2 is missing.
(If you are familiar with surveys, think of var1 as a regular response variable that includes "other", and of var2 as an additional response variable that is asked only when "other" is selected in var1.)
Here's an example of 5 observations in my data. var2 is non-missing if var1 is 29 or 30, and missing otherwise.
(There are several other values of var1 for which var2 is non-missing, but var2 is missing for most values of var1.)
var1 | var2 |
26 | (missing) |
27 | (missing) |
28 | (missing) |
29 | 130 |
30 | 122 |
My goal is to combine var1 and var2 into one variable (var3).
The combined variable takes the value of var2 if var2 is non-missing, and the value of var1 otherwise. Thus it will range from 1 to 200. This is pretty easy.
Code:
gen var3 = var2
replace var3 = var1 if missing(var2)
But I am struggling to combine lab1 and lab2 into one value label (lab3), covering 1 to 200, for this new combined variable.
I first thought it would be easy because lab1 and lab2 do not have any overlapping values, but I could not figure out how to do it.
I first created lab3 as a copy of lab1.
Code:
lab copy lab1 lab3
The problem is that I cannot simply add the value labels from lab2. I tried the following command, but it returned a syntax error, r(198).
Code:
lab define lab3 lab2, add
It seems "lab define, add" requires me to write out each value and its label manually, but it would be tedious to enter 100 different values from lab2.
Is there a way to combine two value labels, with no overlapping values, into one value label?
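For what it's worth, a loop-based workaround seems possible. The following is only a sketch, not something tested on my real data. The toy data and the label texts ("one26", "two130", and so on) are placeholders made up for illustration, and the range 101/200 assumes lab2 only has entries in that range. It starts lab3 as a copy of lab1, reads each entry of lab2 with the extended macro function ": label", and appends it with "label define, add".

```stata
* Toy data mirroring the five example observations.
clear
input var1 var2
26 .
27 .
28 .
29 130
30 122
end

* Placeholder label text; the real lab1 and lab2 have 100 entries each.
label define lab1 26 "one26" 27 "one27" 28 "one28" 29 "one29" 30 "one30"
label define lab2 122 "two122" 130 "two130"
label values var1 lab1
label values var2 lab2

* Combined variable: var2 where it exists, var1 otherwise.
generate var3 = var2
replace var3 = var1 if missing(var2)

* Start lab3 as a copy of lab1, then append lab2 one value at a time.
label copy lab1 lab3, replace
forvalues v = 101/200 {
    * With the strict option the macro is empty when lab2 has no entry.
    local txt : label lab2 `v', strict
    if `"`txt'"' != "" {
        label define lab3 `v' `"`txt'"', add
    }
}
label values var3 lab3
label list lab3
```

Because the two labels share no values, the add option should never hit a conflict here. If they did overlap, "label define, add" would refuse to overwrite an existing entry, so the loop would stop with an error rather than silently change lab3.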
// Edited
I checked Nick's method for generating a composite categorical variable with labels in the Stata Journal (https://www.stata-journal.com/articl...article=dm0034), but it does not work well in my case: it generates many missing values in the composite variable because of the missing values in var2. I also think there is an easier way to do this.
//
Thank you.