  • Giving a label to a large list of variables based on another dataset

    I have 250 variables that I need to label, e.g. subj101, subj102, subj103, subj104.

    I have another simple spreadsheet with the variable names and the labels I would like:
    varname desired_label
    subj101 Any foreign language
    subj102 Any foreign language Extension unit
    subj103 Aboriginal Studies
    subj104 Agriculture
    subj105 Ancient History
    subj106 Biology
    subj107 Business Studies
    subj108 Ceramics
    subj109 Chemistry
    subj110 Community and Family Studies
    subj111 Computing Applications
    subj112 Dance
    subj113 Design and Technology
    subj114 Drama
    subj115 Earth and Environmental Science
    subj116 Economics
    How can I easily bulk relabel all of these?

    Cheers,
    Mike

  • #2
    While there are probably more elegant ways to do this, one easy solution is to append the unlabeled file to the labeled file. The order matters: the file with the labels must be the one in memory, and the file without labels is appended as the "using" file, so that the variable labels already in memory are the ones retained. One then keeps only the observations that came from the unlabeled file.
    Code:
    // Make example of original file to be labeled.
    sysuse auto, clear
    keep weight turn price // just a subset
    // strip out labels to model user's situation
    label var weight ""
    label var turn ""
    label var price ""
    tempfile orig
    save `orig'
    desc weight turn price // look ma no labels
    // Make example file that has labels for illustration.  It has variables outside the subset
    // of interest but this will not be a problem.
    sysuse auto, clear
    //
    // Actual solution starts here.  Note that labeled file is resident.
    keep in 1  // Only 1 observation needed to carry the labels
    append using `orig', gen(source)
    keep if source == 1
    keep weight turn price // Just the variables of interest
    desc


    • #3
      Thanks very much!


      • #4
        Mike Lacy outlined a nice solution. Note that you would not even have to keep a single observation in the labeled dataset. Thus,

        Code:
        // Actual solution starts here. Note that labeled file is resident.
        keep in 1 // Only 1 observation needed to carry the labels
        append using `orig', gen(source)
        keep if source == 1
        keep weight turn price // Just the variables of interest
        could be reduced to

        Code:
        // Actual solution starts here. Note that labeled file is resident.
        drop in 1/L // <- new: dropping all observations still keeps the variables and their labels
        append using `orig' // <- nothing is left from the labeled file, so there is no need to track the source
        keep weight turn price // Just the variables of interest
        Roger Newson uses this trick in his vallabsave package (SSC) for value labels.

        From the initial query, I got the impression that there was no labeled dataset but a spreadsheet that maps variable names to variable labels. Obviously, the outlined approach will not work in this situation.

        Best
        Daniel
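
        For the spreadsheet case raised in #4, one possible approach is to read the name/label pairs into local macros and then loop over them with label variable. The following is a minimal sketch under stated assumptions: the three-row crosswalk and the tiny dataset are made-up stand-ins for the real spreadsheet and the real 250-variable file, and the macro names (n, name_1, lab_1, ...) are arbitrary.

        ```stata
        // Sketch only: the crosswalk and data below are made-up stand-ins.
        // In practice, replace the -input- block with something like
        // -import excel- or -import delimited- on the actual spreadsheet.
        clear
        input str10 varname str40 desired_label
        "subj101" "Any foreign language"
        "subj102" "Any foreign language Extension unit"
        "subj103" "Aboriginal Studies"
        end

        // Copy each name/label pair into local macros so they survive loading the data.
        local n = _N
        forvalues i = 1/`n' {
            local name_`i' = trim(varname[`i'])
            local lab_`i'  = trim(desired_label[`i'])
        }

        // Stand-in for the real dataset to be labeled (hypothetical contents).
        clear
        set obs 3
        generate subj101 = 1
        generate subj102 = 2
        generate subj103 = 3

        // Apply the labels, skipping any name that is not a variable in the data.
        forvalues i = 1/`n' {
            capture confirm variable `name_`i'', exact
            if _rc == 0 {
                label variable `name_`i'' `"`lab_`i''"'
            }
            else {
                display as text "skipped `name_`i'': not in dataset"
            }
        }
        describe
        ```

        Because local macros are involved, this needs to be run in one go from a do-file rather than line by line. Variable labels in Stata are limited to 80 characters, so longer entries in the spreadsheet would need shortening first.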
