  • How to check if value labels are consistently defined across multiple data sets using a loop?

    Hi,

    I'm trying to append some 35 data sets, one for each state. I used a loop to append them and saved the result as a new file. These data sets have identical variable names. However, a problem I faced earlier with another data set is that, despite the variable names and the questions being the same, the original value labels were not identical across the data sets. Appending without checking the individual labels therefore left me with nonsensical value labels in the appended data.

    Given the large number of data sets and variables, I can't manually check that the labels are in order, and I was wondering whether a loop could do this. I would like to run this check before I append the data sets.

    Thanks

  • #2
    Can you give an example? I'm assuming that the values are all the same but the value labels are different. If so, it might be easier to drop the value label, define a new one, and then assign it to the variable.
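
A minimal sketch of that drop, redefine, and reassign approach, using the shipped nlsw88 data as a stand-in for one state file. The label name race_std and the label texts are hypothetical; substitute the coding you actually want.

```stata
* Stand-in for one state file (assumption: your files are loaded the same way)
sysuse nlsw88, clear

* Detach and drop the file's own value label for race
label values race .
label drop racelbl

* Define one agreed label (hypothetical name and texts) and attach it
label define race_std 1 "White" 2 "Black" 3 "Other"
label values race race_std

* Confirm which label is now attached
describe race
```

Running the same define and attach lines in every state file before appending should leave all files with one shared definition, provided the underlying numeric codes really do mean the same thing in each file.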


    • #3
      How about this? I use -elabel-, which can be installed by typing
      Code:
      ssc install elabel
      Here I artificially modify one label to demonstrate how it works. Also, you will need to put the operative bit of the code in a loop over the 35 state files.

      Code:
      sysuse nlsw88, clear
      label dir
      local val_labels `r(names)'
      
      label define racelbl 1 "Nonsense", modify
      
      foreach lab of local val_labels {
          label copy `lab' _`lab'
          label drop `lab'
      }
      
      label save using labels.do, replace
      
      * your -for- loop over state files will come here and replace the next two lines
      local file nlsw88
      sysuse `file', clear
      
      do labels.do
       
      foreach lab of local val_labels {
          noisily dis as text "Checking value label `lab' in file `file'... " _continue
          capture elabel compare `lab' _`lab', assertidentical
          if _rc != 0 {
              display as error "does not match!!"
          }
          else {
              display as text "okay."
          }
      }
      
      erase labels.do
      This produces:
      Code:
      Checking value label indlbl in file nlsw88... okay.
      Checking value label unionlbl in file nlsw88... okay.
      Checking value label ccitylbl in file nlsw88... okay.
      Checking value label southlbl in file nlsw88... okay.
      Checking value label occlbl in file nlsw88... okay.
      Checking value label nev_mar in file nlsw88... okay.
      Checking value label marlbl in file nlsw88... okay.
      Checking value label racelbl in file nlsw88... does not match!!
      Checking value label smsalbl in file nlsw88... okay.
      Checking value label gradlbl in file nlsw88... okay.
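
One possible way to wrap the check in a loop over files is sketched below. It assumes -elabel- is installed, takes the first file as the reference, and uses two temporary copies of nlsw88 as stand-ins for the state files; the macros state1, state2, and n_mismatch are hypothetical names introduced for the illustration.

```stata
* Build two stand-in "state" files (replace with your own 35 files)
sysuse nlsw88, clear
tempfile state1 state2
save "`state1'"
label define racelbl 1 "Nonsense", modify   // make the second file differ
save "`state2'"

* Store the reference labels from the first file under _-prefixed names
use "`state1'", clear
label dir
local val_labels `r(names)'
foreach lab of local val_labels {
    label copy `lab' _`lab'
    label drop `lab'
}
label save using labels.do, replace

* Compare every file's labels with the stored reference copies
local n_mismatch 0
local i 0
foreach file in "`state1'" "`state2'" {
    local ++i
    use "`file'", clear
    run labels.do
    foreach lab of local val_labels {
        capture elabel compare `lab' _`lab', assertidentical
        if _rc != 0 {
            display as error "file `i': value label `lab' does not match"
            local ++n_mismatch
        }
    }
}
display as text "mismatches found: `n_mismatch'"
erase labels.do
```

Note that only labels present in the reference file are checked here; a label that exists in a later file but not in the reference would go unnoticed.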
      Last edited by Hemanshu Kumar; 08 Sep 2022, 10:12.


      • #4
        I would decode to string values and then apply encode to the appended result. In between, some cleaning up may be needed, depending on unwanted presentation differences, such as leading and trailing spaces, inconsistent internal spaces, inconsistent punctuation otherwise and inconsistent use of upper and lower case. Indeed, any such differences might be contributing to inconsistent value labels.
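
A rough sketch of the decode, clean, append, and encode sequence follows. It is an illustration under assumptions: two temporary copies of nlsw88 stand in for the state files, the second is given deliberately messy label texts, and the cleaning step only trims spaces and lowercases. The macro names state1, state2, clean1, and clean2 are hypothetical.

```stata
* Two stand-in files whose labels differ only in presentation
sysuse nlsw88, clear
keep idcode race
tempfile state1 state2
save "`state1'"
label define racelbl 1 "  White " 2 "BLACK" 3 "other", modify
save "`state2'"

* Decode to strings in each file and tidy the text
local i 0
foreach file in "`state1'" "`state2'" {
    local ++i
    use "`file'", clear
    decode race, generate(race_str)
    replace race_str = strlower(stritrim(strtrim(race_str)))
    drop race
    tempfile clean`i'
    save "`clean`i''"
}

* Append the string versions, then encode once on the combined data
use "`clean1'", clear
append using "`clean2'"
encode race_str, generate(race)
drop race_str
tabulate race
```

Because -encode- numbers the categories in alphabetical order of the strings, the resulting numeric codes need not match the original ones; only the labels are guaranteed to be consistent.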
