A student pointed out to me that when I showed two different ways to generate a new dummy variable, the first produced messages that made sense while the second produced a message that didn't seem to. Here are the frequencies for the variable we converted to a dummy.
    veteran |      Freq.
------------+-----------
          1 |          4
          3 |          1
          5 |          9
          6 |        101
          . |         20
------------+-----------
      Total |        135     100.00
Here are the two methods I demonstrated to create a dummy variable for veteran (1=yes/0=no). Both methods "worked" in that they yielded the dummy variable as intended.
* Method 1
ge vetcat1 = .
replace vetcat1 = 1 if veteran==1 | veteran==3 | veteran==5
replace vetcat1 = 0 if veteran==6
* Method 2
ge vetcat2 = veteran
recode vetcat2 1/5=1 6=0
The first method gave this output.
. ge vetcat1 = .
(135 missing values generated)
. replace vetcat1 = 1 if veteran==1 | veteran==3 | veteran==5
(14 real changes made)
. replace vetcat1 = 0 if veteran==6
(101 real changes made)
The second method gave this output.
. ge vetcat2 = veteran
(20 missing values generated)
. recode vetcat2 1/5=1 6=0
(vetcat2: 111 changes made)
So the question is: why does Method 1 display 14 and 101 changes made (which makes sense), while Method 2 displays 111 changes made (which is not the sum 14 + 101 = 115)?
I couldn't figure it out. Have you run across this?