
  • How to assign variable labels according to values of a variable on a different dta file

    Hi there, I was hoping you could help me to assign variable labels using the values of a variable within a different dta file.

    Here are the example dta files:

    Code:
    clear
    
    input sampleid seq1 seq2 seq3
    1 . . .
    2 . . .
    3 . . .
    end
    
    save seqdata.dta, replace
    
    clear
    
    input str4 aptname str5 targetfullname
    seq1 name1
    seq2 name2
    seq3 name3
    end
    
    save seqlabels.dta, replace
    I am trying to label the variables seq1, seq2 and seq3 in seqdata.dta using the variable targetfullname in seqlabels.dta. Below is the code in which I try to order the variables in seqdata.dta so that I can then label them. However, I get the following error: factor-variable and time-series operators not allowed

    Code:
    use seqlabels.dta, clear
    sort aptname
    levelsof aptname, local(seqs) clean
    levelsof targetfullname, local(names) clean
    
    use seqdata.dta, clear
    order sampleid `seqs'
    I then planned to label the variables using the local `names'. I would be grateful for some help with this!

    Thank you so much in advance!

    Very best,
    Liz

  • #2
    Removed the original suggestion, as it didn't take the sorting behavior of levelsof into account. See #3.

    EDIT: After some thought, here is another approach:

    Code:
    clear
    
    input str4 aptname str15 targetfullname
    seq1 name1
    seq2 "hello! name2"
    seq3 "name 3"
    end
    
    save seqlabels.dta, replace
    
    generate to_do = "label variable " + aptname + `" ""' + targetfullname + `"""'
    
    levelsof to_do, local(td)
    
    *-------------------------------------------------------------------------------
    clear
    
    input sampleid seq1 seq2 seq3
    1 . . .
    2 . . .
    3 . . .
    end
    
    save seqdata.dta, replace
    
    foreach x in `td'{
        `x'
    }
    Result:

    Code:
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    ----------------------------------------------------------------------------
    sampleid        float   %9.0g                 
    seq1            float   %9.0g                 name1
    seq2            float   %9.0g                 hello! name2
    seq3            float   %9.0g                 name 3
    Last edited by Ken Chui; 06 Mar 2024, 10:32.


    • #3
      Originally posted by Liz Broom
      Hi there, I was hoping you could help me to assign variable labels using the values of a variable within a different dta file.


      levelsof aptname, local(seqs) clean
      levelsof targetfullname, local(names) clean
      This won't work because -levelsof- will sort the variable names and label names alphabetically within each local, thus not guaranteeing the correct match between variable name and label. Here is a way to achieve what you want.

      Code:
      clear
      
      input sampleid seq1 seq2 seq3
      1 . . .
      2 . . .
      3 . . .
      end
      
      save seqdata.dta, replace
      
      clear
      
      input str4 aptname str5 targetfullname
      seq1 name1
      seq2 name2
      seq3 name3
      end
      
      save seqlabels.dta, replace
      
      use seqdata, clear
      merge 1:1 _n using seqlabels, nogen
      local i 1
      while !missing("`=aptname[`i']'"){
          * compound quotes keep labels containing spaces intact
          lab var `=aptname[`i']' `"`=targetfullname[`i']'"'
          local ++i
      }
      drop aptname targetfullname
      Result:

      Code:
      . desc
      
      Contains data from seqdata.dta
       Observations:             3                  
          Variables:             4                  6 Mar 2024 17:22
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      Variable      Storage   Display    Value
          name         type    format    label      Variable label
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      sampleid        float   %9.0g                 
      seq1            float   %9.0g                 name1
      seq2            float   %9.0g                 name2
      seq3            float   %9.0g                 name3
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      Sorted by: 
           Note: Dataset has changed since last saved.
      
      .
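
      To see the sorting problem directly, here is a minimal sketch. The label values zeta, alpha and mid are made up for illustration and are not from the original data; they are chosen so that alphabetical order differs from row order.

      Code:
      clear
      
      input str4 aptname str5 targetfullname
      seq1 zeta
      seq2 alpha
      seq3 mid
      end
      
      * each -levelsof- call returns its own sorted list of distinct values
      levelsof aptname, local(seqs) clean
      levelsof targetfullname, local(names) clean
      
      * compare the two lists position by position
      display "`seqs'"
      display "`names'"
      * the second list should start with alpha, although alpha belongs to seq2
      Pairing the two locals by position would therefore attach the wrong label to seq1, which is why the code above reads each name and its label from the same observation.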


      • #4
        Thank you both so much for your replies - both worked using the example datasets.

        Andrew, I used yours with my dataset and it worked perfectly. Ken, I didn't use your solution in the end because I didn't fully understand levelsof so thought it best to not use that command without some prior reading!

        Really appreciate the help, thanks both!


        • #5
          Hi, I have a new situation where the list of variable name/label combinations (i.e. observations in aptname and targetfullname) contains variable names (aptname) not present in my seqdata.dta var list.

          How can I edit the code to skip over aptname if not present in seqdata.dta varlist?

          I also have prefixes in my variable names, i.e. X_X_var, where each X can be 1 to 3 characters long (example variable names: 123_12_abcd, 1_1_efg, 34_23_h). Is there a way of matching aptname to only the section of the variable name after the last underscore?

          I thought one way to do it would be to save a new dta file containing a single variable, aptname, that holds every variable name in seqdata.dta as a separate observation. I would then remove the prefixes, merge with seqlabels.dta keeping only the matched observations, and finally merge with seqdata.dta and run the rest of the code as above. However, I am unsure how to generate a variable containing the entire varlist of seqdata.dta.

          Many thanks,
          Liz


          • #6
            As always, such problems are best illustrated with a reproducible example, but it's your call!


            • #7

              This approach will skip over variables in seqlabels.dta that do not occur in seqdata.dta:
              Code:
              clear*
              
              input sampleid seq1 seq2 seq4
              1 . . .
              2 . . .
              3 . . .
              end
              
              frame create labeler
              frame change labeler
              input str4 aptname str5 targetfullname
              seq1 name1
              seq2 name2
              seq3 name3
              seq4 name4
              end
              
              forvalues i = 1/`=_N' {
                  local vname = aptname[`i']
                  local vlbl = targetfullname[`i']
                  display `"`vname': `vlbl'"'
                  frame default {
                      capture confirm var `vname', exact
                      if c(rc) == 0 {
                          label var `vname' `"`vlbl'"'
                      }
                  }
              }
              
              frame change default
              des
              Note: This code could be rewritten without the use of frames, just -merge-ing the two data sets together as in #3. The real change here is the use of -capture confirm var- and then conditionally using -label var-. But I dislike data organizations where all the variables in a single observation do not refer to the same unit of analysis, so I think the use of separate frames here is conceptually cleaner.
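
              For the prefix question in #5, a similar sketch may work. It assumes that every variable to be labeled is named prefix_aptname and that no aptname itself contains an underscore. The variable names a1_12_seq1, b2_3_seq2 and c34_23_seq4 are invented for illustration. Here -unab- expands the wildcard *_aptname and sets a nonzero return code when nothing matches, so it takes the place of -confirm var-:
              Code:
              clear*
              
              input sampleid a1_12_seq1 b2_3_seq2 c34_23_seq4
              1 . . .
              2 . . .
              3 . . .
              end
              
              frame create labeler
              frame change labeler
              input str4 aptname str5 targetfullname
              seq1 name1
              seq2 name2
              seq3 name3
              seq4 name4
              end
              
              forvalues i = 1/`=_N' {
                  local vname = aptname[`i']
                  local vlbl = targetfullname[`i']
                  frame default {
                      * list all variables whose name ends in _aptname;
                      * _rc is nonzero when there is no such variable
                      capture unab matches : *_`vname'
                      if _rc == 0 {
                          foreach v of local matches {
                              label var `v' `"`vlbl'"'
                          }
                      }
                  }
              }
              
              frame change default
              des
              If several variables share the same suffix, this sketch gives all of them the same label, which may or may not be what is wanted.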
