  • -elabel- updated on SSC

    Thanks to Kit Baum, an updated version of the elabel commands is now available from the SSC. In Stata, type

    Code:
    ssc install elabel , replace
    to install the latest version.

    I first announced elabel in an earlier post, where I also showed some examples. In this post, I will show examples of three new features of elabel.

    One of the new features of elabel borrows from a sometimes overlooked syntax element of new variable lists that is used with, e.g., generate:

    Code:
    generate newvar:lblname =exp
    where generate lets you create a new variable and, simultaneously, attach a value label to it. elabel does not create new variables; it defines (or modifies) value labels and now lets you simultaneously attach these value labels to variables using the same (slightly extended) syntax. Here is an example:

    Code:
    . sysuse nlsw88 , clear
    (NLSW, 1988 extract)
    
    . describe south-c_city
    
                  storage   display    value
    variable name   type    format     label      variable label
    ------------------------------------------------------------------------------
    south           byte    %8.0g                 lives in south
    smsa            byte    %9.0g      smsalbl    lives in SMSA
    c_city          byte    %8.0g                 lives in central city
    
    . elabel define (south-c_city):yesno 0 "no" 1 "yes"
    
    . describe south-c_city
    
                  storage   display    value
    variable name   type    format     label      variable label
    ------------------------------------------------------------------------------
    south           byte    %8.0g      yesno      lives in south
    smsa            byte    %9.0g      yesno      lives in SMSA
    c_city          byte    %8.0g      yesno      lives in central city
    
    . label list yesno
    yesno:
               0 no
               1 yes
    In the example above, I have defined one new value label, yesno, and attached it to three variables. To do this, I have put the variable names, followed by a colon, before the value label name; I have used parentheses to refer to more than one variable.
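
    For comparison, here is a minimal sketch of the same result using only official Stata commands: label define creates the value label, and label values attaches it to several variables in a second step. The elabel syntax above does both in one line. The sketch assumes a fresh copy of nlsw88.dta, so that yesno is not yet defined.

    Code:
    * sketch with official commands only; assumes yesno does not exist yet
    sysuse nlsw88 , clear
    label define yesno 0 "no" 1 "yes"
    * attach yesno to all three variables (replaces smsalbl on smsa)
    label values south smsa c_city yesno
    describe south-c_city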

    Suppose I later decide that I want each variable to have its own value label attached. A second feature of elabel borrows from egen's syntax and allows (pseudo-)functions to define (or modify) value labels. I will use a (pseudo-)function to make multiple copies of the value label yesno, named after the variables that have yesno attached. To get a list of all variables that have the value label yesno attached, I will use ds:

    Code:
    . ds , has(vallabel yesno)
    south   smsa    c_city
    
    . local varlist `r(varlist)'
    
    . elabel define `varlist' = copy(yesno)
    
    . label list `varlist'
    south:
               0 no
               1 yes
    smsa:
               0 no
               1 yes
    c_city:
               0 no
               1 yes
    In a second step, I will now attach these new value labels to the same-named variables, using the new extended syntax of elabel values:

    Code:
    . elabel values (`varlist') (`varlist')
    
    . describe south-c_city
    
                  storage   display    value
    variable name   type    format     label      variable label
    ------------------------------------------------------------------------------
    south           byte    %8.0g      south      lives in south
    smsa            byte    %9.0g      smsa       lives in SMSA
    c_city          byte    %8.0g      c_city     lives in central city
    I should mention that this example works only for the current label language. Anyway, I hope some of you will find these additions useful.
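
    For readers who want to compare, the two steps above (copying yesno and attaching the copies) can be sketched with official commands only. This is an illustration, not how elabel works internally. It assumes a fresh copy of nlsw88.dta in which no value labels named south, smsa, or c_city exist, because label copy without the replace option stops with an error if the target label is already defined.

    Code:
    * sketch with official commands only
    sysuse nlsw88 , clear
    label define yesno 0 "no" 1 "yes"
    label values south smsa c_city yesno
    * find the variables that have yesno attached
    ds , has(vallabel yesno)
    foreach var in `r(varlist)' {
        * copy yesno to a label named after the variable, then attach it
        label copy yesno `var'
        label values `var' `var'
    }
    describe south-c_city

    Like the elabel version, this sketch only touches the current label language.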

    Best
    Daniel

  • #2
    In the current issue of The Stata Journal (19-4), I formally introduce the elabel command (Klein 2019). The latest version of the software, however, is currently only available from the SSC.

    Best
    Daniel


    • #3
      I submitted an updated version of elabel to the Stata Journal earlier this year. There have been more updates since. I will demonstrate the two most recent updates in this somewhat lengthy post.

      The first update adds a new option to elabel recode that automatically recodes variables that have the recoded value label attached. Here is how that works.

      First, we load an example dataset:

      Code:
      . webuse fullauto
      (Automobile Models)
      
      . describe rep78
      
                    storage   display    value
      variable name   type    format     label      variable label
      --------------------------------------------------------------------------------------------------------------------------------------------------
      rep78           int     %9.0g      repair     Repair Record 1978
      Unlike in auto.dta, which is shipped with Stata, the variable rep78 in this dataset has a value label attached:

      Code:
      . label list repair
      repair:
                 1 Poor
                 2 Fair
                 3 Average
                 4 Good
                 5 Excellent
      Here are the frequencies of the cars' repair records in 1978:

      Code:
      . tabulate rep78
      
           Repair |
      Record 1978 |      Freq.     Percent        Cum.
      ------------+-----------------------------------
             Poor |          2        2.90        2.90
             Fair |          8       11.59       14.49
          Average |         30       43.48       57.97
             Good |         18       26.09       84.06
        Excellent |         11       15.94      100.00
      ------------+-----------------------------------
            Total |         69      100.00
      Now suppose we want to reverse the coding so that high values represent a poor repair record; obviously, we want to modify the value label accordingly. We can modify the value label with elabel recode, and we can now add the new option recodevarlist to simultaneously change the values in rep78.

      Code:
      . elabel recode repair (1/5 = 5/1) , recodevarlist
      (rep78: 39 changes made)
      (rep77: 39 changes made)
      The output indicates that all but the 30 values representing "Average" repair records in rep78 have been changed as requested; the rule maps the middle value, 3, to itself. The output also shows that values in the variable rep77 have been changed. Why is that?

      Code:
      . describe rep77
      
                    storage   display    value
      variable name   type    format     label      variable label
      --------------------------------------------------------------------------------------------------------------------------------------------------
      rep77           int     %9.0g      repair     Repair Record 1977
      It turns out that there is another variable in the dataset, rep77, that shares the same value label with variable rep78. Changing that value label without changing all variables that have it attached would certainly cause problems. elabel recode makes sure that all variables that share a modified value label are changed accordingly. But it does not stop there. Consider yet another version of the auto dataset:

      Code:
      . webuse autom , clear
      (1978 Automobile Data)
      
      . describe foreign
      
                    storage   display    value
      variable name   type    format     label      variable label
      --------------------------------------------------------------------------------------------------------------------------------------------------
      foreign         byte    %22.0g     origin     Car type
      
      . label list origin
      origin:
                 0 Domestic
                 1 Foreign
      Suppose we want to reverse the coding in the value label origin and in the variable foreign.

      Code:
      . elabel recode origin (0/1 = 1/0) , recodevarlist
      variable foreign has value label origen attached in label language es
      r(498);
      Here, elabel recode refuses to make the changes because the dataset has a second label language, es, and the variable foreign has a different value label, origen, attached in that label language. Sounds complicated? It is not. But the issue is easily overlooked.

      Code:
      . label language es
      
      . describe foreign
      
                    storage   display    value
      variable name   type    format     label      variable label
      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      foreign         byte    %22.0g     origen     Origen
      Had we changed the values in value label origin and in the variable foreign, we would certainly be in trouble now that we have changed the label language to es.

      The illustrated problem with multiple label languages can be easily solved by including the value label origen in the list of value labels that we pass to elabel recode and making sure that it is also modified.
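
      A sketch of that first solution follows. It assumes that elabel recode applies the same rules to every value label named before the parentheses; output is omitted because it is not part of the original examples.

      Code:
      * sketch: recode both value labels and the variable foreign together
      webuse autom , clear
      elabel recode origin origen (0/1 = 1/0) , recodevarlist
      * check the result in both label languages
      label list origin origen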

      However, there is another approach that builds on the second update to elabel that I want to show. We can use the regular recode command to change the values in the variable foreign, and the new prefix command elabel adjust to automatically modify the value labels accordingly.

      Code:
      . elabel adjust : recode foreign (0 = 1) (1 = 0)
      (foreign: 74 changes made)
      As expected, all 74 values in foreign have been changed; I will spare you further output. Let us verify that the labels have been modified as well.

      Code:
      . label list origin origen
      origin:
                 0 Foreign
                 1 Domestic
      origen:
                 0 Producido fuera de USA
                 1 Producido en USA
      We can see that the value labels have been modified correctly in both label languages.

      I will end these examples by pointing out that elabel adjust may be used with replace, mvencode, and mvdecode, too. This might come in handy if we wanted to change numeric values to missing values and keep the value labels, as in:

      Code:
      . elabel adjust : mvdecode foreign , mv(0 = .z)
           foreign: 22 missing values generated
      
      . label list origin origen
      origin:
                 1 Domestic
                .z Foreign
      origen:
                 1 Producido en USA
                .z Producido fuera de USA
      
      . tabulate foreign , missing
      
                      Origen |      Freq.     Percent        Cum.
      -----------------------+-----------------------------------
            Producido en USA |         52       70.27       70.27
      Producido fuera de USA |         22       29.73      100.00
      -----------------------+-----------------------------------
                       Total |         74      100.00

      For those of you who are interested, the latest version of elabel is now (and will be) available from GitHub. In Stata, type

      Code:
      net install elabel , from(https://raw.githubusercontent.com/kleindaniel81/elabel/master)
      to install the files. I have also sent the latest version of elabel to Kit Baum for upload to the SSC.
