Listing the var name in a specified field rather than the value - Syntax help?

Helen Kidane

Join Date: Feb 2022

Posts: 5
#1


10 Feb 2022, 16:32

Hello fellow Stata Users! I am a new user and am in need of some help.

I have several race variables (Black, White, Asian...) in Boolean format, containing a 1 for yes and a . for no. Some records have 1's for multiple race categories (for example, someone identifying as both Black and Asian). I need help writing syntax that would take the 1's and list those race categories in the record's "RaceDisplay" variable.

So in this example, "Black Asian" would be listed in the "RaceDisplay" variable for that record.

Any help would be appreciated. Thank you!
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

10 Feb 2022, 16:53

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(Black White Asian)
0 1 0
1 0 0
0 1 0
0 0 0
0 1 0
0 0 0
0 0 0
0 1 1
1 1 0
0 0 0
0 0 0
1 1 0
0 0 1
0 0 0
1 1 0
0 1 0
1 1 0
1 0 0
0 0 0
0 0 0
end

gen RaceDisplay = ""
foreach v of varlist Black White Asian {
    replace RaceDisplay = RaceDisplay + "`v' " if `v' == 1
}
replace RaceDisplay = trim(RaceDisplay)

list, noobs clean

In the future, please show example data when asking for help, and please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
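
For anyone new to -dataex-, a minimal sketch of a typical call, assuming the three indicator variables from this thread are in memory (the variable list and observation range are illustrative only):

Code:

* Only needed if -dataex- is not already part of your installation (see above)
* ssc install dataex

* Produce a postable listing of the first 20 observations of the three indicators
dataex Black White Asian in 1/20

The resulting listing can be copied from the Results window and pasted straight into a post.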
Helen Kidane

Join Date: Feb 2022

Posts: 5
#3

11 Feb 2022, 13:04

Clyde - much obliged! I was not aware of -dataex-. I attempted to upload a screenshot without much success, so resorted to the description above. Will be sure to use the -dataex- resource in the future, thank you for sharing about that resource.

A follow-up question: your example syntax worked, but how do I insert a delimiter to space the values apart? Currently the RaceDisplay field displays BlackAsian rather than Black Asian. Thank you, once again.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#4

11 Feb 2022, 13:22

Clyde’s code deliberately includes an extra space for this purpose: note the space before the closing quote in "`v' ".
Helen Kidane

Join Date: Feb 2022

Posts: 5
#5

11 Feb 2022, 13:27

Ah! I see that now! Thanks Nick!

Nick Cox

Join Date: Mar 2014

Posts: 35698
#6

11 Feb 2022, 19:40

Here's another solution, although I would much prefer @Clyde Schechter's approach for say 7 or 70 categories!

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(Black White Asian)
0 1 0
1 0 0
0 1 0
0 0 0
0 1 0
0 0 0
0 0 0
0 1 1
1 1 0
0 0 0
0 0 0
1 1 0
0 0 1
0 0 0
1 1 0
0 1 0
1 1 0
1 0 0
0 0 0
0 0 0
end

gen RaceDisplay = trim(itrim(Black * "Black" + " " + White * "White" + " " + Asian * "Asian")) 

tab RaceDisplay 

RaceDisplay |      Freq.     Percent        Cum.
------------+-----------------------------------
      Asian |          1        8.33        8.33
      Black |          2       16.67       25.00
Black White |          4       33.33       58.33
      White |          4       33.33       91.67
White Asian |          1        8.33      100.00
------------+-----------------------------------
      Total |         12      100.00


Nick Cox

Join Date: Mar 2014

Posts: 35698
#7

12 Feb 2022, 04:39

[A correction itself corrected...]
Sonnen Blume

Join Date: Aug 2018

Posts: 342
#8

12 Feb 2022, 08:50

Originally posted by Nick Cox

Here's another solution, although I would much prefer @Clyde Schechter's approach for say 7 or 70 categories!

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(Black White Asian)
0 1 0
1 0 0
0 1 0
0 0 0
0 1 0
0 0 0
0 0 0
0 1 1
1 1 0
0 0 0
0 0 0
1 1 0
0 0 1
0 0 0
1 1 0
0 1 0
1 1 0
1 0 0
0 0 0
0 0 0
end

gen RaceDisplay = trim(itrim(Black * "Black" + " " + White * "White" + " " + Asian * "Asian"))

tab RaceDisplay

RaceDisplay |      Freq.     Percent        Cum.
------------+-----------------------------------
      Asian |          1        8.33        8.33
      Black |          2       16.67       25.00
Black White |          4       33.33       58.33
      White |          4       33.33       91.67
White Asian |          1        8.33      100.00
------------+-----------------------------------
      Total |         12      100.00

Thank you for this neat solution. Is it similar to what the 'grouplabs' and 'concat' functions can do? I remember making a similar post once and someone suggested those methods, but this one looks more convenient. I would also like to know whether this method can be applied to Likert variables, e.g. how often Blacks, Whites, and Asians visit England (always, often, sometimes, never), with a table showing the frequency of each response by race category. I am not sure this example is clear, but some surveys contain such items: how often the respondent feels blue/happy/lonely (never/often/always).
Nick Cox

Join Date: Mar 2014

Posts: 35698
#9

12 Feb 2022, 09:03

I don't know anything about grouplabs. The concat() function of the egen command could here only concatenate 0s and 1s.
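
As a sketch of that limitation, assuming the example data from #6 are in memory (the new variable name pattern is hypothetical, chosen only for illustration):

Code:

* concat() joins the numeric values themselves, not the variable names
egen pattern = concat(Black White Asian)
list Black White Asian pattern in 1/5, noobs clean

Each value of pattern would be a string of digits such as "010", which is why concat() on its own does not produce the race names.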

If there were very many indicator variables with values 0 and 1 you could do it this way.

Code:

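* Variable names are lowercase here; Stata is case-sensitive, so adjust them to match your data
gen racedisplay = ""
foreach v of varlist black white asian {
    * `v' * "`v' " yields the name plus a space when the indicator is 1, and "" when it is 0
    replace racedisplay = racedisplay + `v' * "`v' "
}
replace racedisplay = itrim(trim(racedisplay))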

which is closer to Clyde Schechter's code in #2. The code depends on 0*"text" returning an empty string and 1*"text" echoing "text".
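
A quick way to check that behavior, as a sketch only (it assumes a Stata version recent enough to support string duplication with *; the brackets are there just to make an empty string visible):

Code:

* Multiplying a string by a number duplicates it that many times
display "[" + 0 * "text" + "]"
display "[" + 1 * "text" + "]"
display "[" + 2 * "text" + "]"

The first line should show empty brackets, the second a single copy of "text", and the third two copies.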

Bjarte Aagnes

Join Date: Apr 2014
Posts: 784

#10

12 Feb 2022, 10:47

The approach in #6 can be written as a loop:

Code:

foreach v of varlist Black White Asian {

  local code `code' `concatenate' ( `v' * "`v'" )
  local concatenate + char(32) +
}

gen RaceDisplay2 = trim(itrim(`code'))

For many variables, #6 will be faster than #2 and #9:

Code:

. qui forvalues i = 1/10 {
..........
 
. di _N " " c(k)

1000000 215
. 
. timer list
   #2:   1855.92 /       10 =     185.5917
   #6:    593.57 /       10 =      59.3565
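
To see what the loop in this post hands to -generate-, here is a sketch that builds the macro on a toy dataset and displays it. The three-observation data and the initial clearing of the two locals are assumptions added for illustration, not part of the original code.

Code:

clear
input float(Black White Asian)
1 0 1
0 1 0
0 0 0
end

* Start from empty macros so that reruns do not accumulate terms
local code
local concatenate
foreach v of varlist Black White Asian {
    local code `code' `concatenate' ( `v' * "`v'" )
    local concatenate + char(32) +
}

* Compound double quotes are needed because the macro itself contains quotes
display `"`code'"'

gen RaceDisplay2 = trim(itrim(`code'))
list, noobs clean

The displayed macro should be a single expression of the same form as the one typed by hand in #6, with one term per variable joined by char(32), so the string is built in one -generate- rather than one -replace- per variable.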
