
  • How to import value labels in Stata from a second CSV file?

    I have two CSV files: one with all observations and one with value labels.

    The tables are as follows.

    adjunkt1:
    aar mand hoved
    1988 0 2
    1979 1 2
    1967 0 1

    labels1:
    v1 v2 v3
    mand 1 Mand
    mand 0 Kvinde
    hoved 1 ARTS
    hoved 2 ST
    hoved 99 Andet



    As I understand it, I need to create the labels automatically with a forvalues loop and then use a foreach loop to apply those labels to adjunkt1.dta, but I have not been able to get this to work.
    Or maybe there is another way? I hope you can help me.




                  storage   display    value
    variable name   type    format     label      variable label
    ------------------------------------------------------------
    v1              str7    %9s

    ------------------------------------------------------------
    v1                                               (unlabeled)
    ------------------------------------------------------------

                      type:  string (str7)

             unique values:  95          missing "":  0/476



  • #2
    Code:
    // READ IN THE FILE WITH THE LABELING INFORMATION
    // IN APPLICATION THIS WILL BE -use labels1-
    input str32 v1 int v2 str100 v3
     mand 1 Mand
     mand 0 Kvinde
     hoved 1 ARTS
     hoved 2 ST
     hoved 99 Andet
     end
     
    // CREATE MNEMONIC VARIABLE NAMES
    rename v1 label_name
    rename v2 value
    rename v3 value_label
     
    // BUILD UP THE VALUE LABELS
    by label_name (value), sort: gen label_command_1 = "label define " + label_name ///
     + " " + string(value, "%1.0f") + `" ""' + value_label + `"""' if _n == 1
    by label_name (value): replace label_command_1 = label_command_1[_n-1] ///
     + " " + string(value, "%1.0f") + `" ""' + value_label + `"""' if _n > 1
    by label_name (value): keep if _n == _N
    
    // AND COMMANDS TO APPLY THE VALUES TO THE VARIABLES
    gen label_command_2 = "label values " + label_name + " " + label_name
    
    // RESHAPE LONG TO LOOK LIKE A FILE OF COMMANDS ON SEPARATE LINES OF A DO FILE
    keep label_name label_command*
    reshape long label_command_, i(label_name) j(_j)
    
    // WRITE IT ALL OUT TO A DO FILE
    // IF DESIRED, CAN USE A TEMPFILE RATHER THAN
    // PERMANENT FILE labeler.do
    file open handle using labeler.do, write text replace
    forvalues i = 1/`=_N' {
     file write handle (label_command_[`i']) _n
    }
    file close handle
    
    // READ IN DATA TO BE LABELED
    // IN PRODUCTION THIS WILL BE use adjunkt1
    clear
    input aar mand hoved
     1988 0 2
     1979 1 2
     1967 0 1
     end
    
    // EXECUTE THE LABELER FILE
    do labeler
    
    // SHOW THE RESULTS
    des
    list, noobs clean
    exit
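
    The comments in the code above note that a tempfile can stand in for the permanent file labeler.do. A minimal sketch of that variant follows. It assumes the generated commands sit in a string variable; the name cmd is hypothetical and stands in for label_command_ after the reshape, and only two commands are included to keep the example short.

    ```stata
    // Sketch only: a small stand-in for the reshaped command dataset built above.
    // The variable name cmd is hypothetical.
    clear
    set obs 2
    gen str80 cmd = ""
    replace cmd = `"label define mand 0 "Kvinde" 1 "Mand""' in 1
    replace cmd = "label values mand mand" in 2

    // Write the commands to a temporary file instead of labeler.do.
    // Stata erases tempfiles when the do-file or session ends.
    tempfile labeler
    file open handle using `"`labeler'"', write text replace
    forvalues i = 1/`=_N' {
        file write handle (cmd[`i']) _n
    }
    file close handle

    // Load the data to be labeled, then run the generated commands.
    clear
    input aar mand hoved
    1988 0 2
    1979 1 2
    1967 0 1
    end
    do `"`labeler'"'
    describe
    ```

    Because the tempfile name lives in a local macro, the whole sequence has to run in a single do-file (or a single interactive session) for the final -do- to find it.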



    • #3
      I have needed to do something similar a few times with datasets accompanied by dictionary files, that is, files containing value label assignments for the dataset variables, arranged as in your labels1 csv.

      The approach I have taken was somewhat similar to Clyde's, but simplifies the generation of value labels by using -label def, add- in a forvalues loop as you seemed to have been thinking originally, and then by using -label save- to generate the "labeler" do-file rather than a series of -file- commands.

      For example, assuming adjunkt1.csv and labels1.csv exist with the contents included in the original post, a do-file to generate the labels in labels1.csv, apply them to adjunkt1.csv, and save the results in a Stata format file, adjunkt1_v1.dta, would look like:

      Code:
      /* tmp_label.do
      
        apply value labels from TMP's labels1.csv to adjunkt1.csv and save as Stata file
      
      */
      // read file with value labels and rename with Clyde-suggested variable names
      import delim using labels1.csv, delim(" ") varn(1) clear
      rename v1 label_name
      rename v2 value
      rename v3 value_label
      
      // define the labels
      qui count
      local n `r(N)'
      forvalues i = 1/`n' {
          la def `=label_name[`i']' `=value[`i']' "`=value_label[`i']'", add
      }
      
      // save value labels in a temporary do-file with the -label- command
      tempfile vlbl
      label save using `vlbl', replace
      
      // read dataset file
      import delim using adjunkt1.csv, delim(" ") varn(1) clear
      
      // define labels by running saved do-file
      do `vlbl'
      
      // assign value labels to variables
      la val mand mand
      la val hoved hoved
      
      // save labeled dataset in a Stata format file
      save adjunkt1_v1, replace
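
      With many labeled variables (the codebook output in the original post suggests 95 distinct names in v1), typing one -la val- line per variable gets tedious. One possible shortcut is to loop over the names returned by -label dir-. This is a sketch, not part of the do-file above, and it assumes every value label has exactly the same name as the variable it belongs to. The data and labels are entered inline only to make it self-contained.

      ```stata
      // Sketch only: data and labels entered inline so the example runs on its own.
      clear
      input aar mand hoved
      1988 0 2
      1979 1 2
      1967 0 1
      end
      label define mand 0 "Kvinde" 1 "Mand"
      label define hoved 1 "ARTS" 2 "ST" 99 "Andet"

      // Collect the names of all value labels currently defined.
      label dir
      local lblnames `r(names)'

      // Attach each label to the numeric variable of the same name, if one exists.
      // Assumption: label names and variable names match exactly.
      foreach lbl of local lblnames {
          capture confirm numeric variable `lbl', exact
          if !_rc {
              label values `lbl' `lbl'
          }
      }
      describe
      ```

      Labels with no matching numeric variable are skipped rather than stopping the do-file with an error.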



      • #4
        Thank you for your help.
        Clyde Schechter, it worked perfectly, and your notes were very useful.
