  • Stata code for value labels from table with information

    Hi!

    I have to assign value labels to a large data set. All the necessary information is saved in an Excel table that I loaded into Stata. For some variables there are several items (e.g., v32200_1 v32200_2 ...) that all have the same scale and therefore get the same label. I therefore removed them from the data, but they still need to be assigned the value label (which is why I need to code var*). I need Stata code that can use the data and will give me as output a text file with the following code for each variable:
    label define varlbl 1 "yes" 2 "no"
    label values var* varlbl


    var values envallab
    v32200 1 yes
    v32200 2 no
    v32200 -7 I do not want to answer
    v32200 -8 I cannot answer that
    v32202 1 very good
    v32202 2 rather good
    v32202 3 rather bad
    v32202 4 very bad
    v32202 -7 I do not want to answer
    v32202 -8 I cannot answer that
    v32203 1 very often
    v32203 2 often
    v32203 3 sometimes
    v32203 4 not at all
    v32203 -7 I do not want to answer
    v32203 -8 I cannot answer that
    v32204 1 completely true
    v32204 2 a bit true
    v32204 3 rather not true
    v32204 4 not true at all
    v32204 -7 I do not want to answer
    v32204 -8 I cannot answer that
    ...

    Very grateful for any help to get this job done!

    Best, Katrina

  • #2
    We can define the labels

    Code:
    //load in your example data
    clear
    input str6 var byte values str23 envallab
    "v32200" 1  "yes"
    "v32200" 2  "no"
    "v32200" -7 "I do not want to answer"
    "v32200" -8 "I cannot answer that"
    "v32202" 1  "very good"
    "v32202" 2  "rather good"
    "v32202" 3  "rather bad"
    "v32202" 4  "very bad"
    "v32202" -7 "I do not want to answer"
    "v32202" -8 "I cannot answer that"
    "v32203" 1  "very often"
    "v32203" 2  "often"
    "v32203" 3  "sometimes"
    "v32203" 4  "not at all"
    "v32203" -7 "I do not want to answer"
    "v32203" -8 "I cannot answer that"
    "v32204" 1  "completely true"
    "v32204" 2  "a bit true"
    "v32204" 3  "rather not true"
    "v32204" 4  "not true at all"
    "v32204" -7 "I do not want to answer"
    "v32204" -8 "I cannot answer that"
    end
    
    // check if each variable-value combination appears only once
    isid var values
    
    // make sure that all labels for the same variable appear together
    sort var values
    
    // drop all labels
    label drop _all
    
    // create the labels
    forvalues i = 1/`=_N' { // loop over observations
        local labname = var[`i']
        local labname = "`labname'_lb"
        local val = values[`i']
        local lab = envallab[`i']
        label define `labname' `val' `"`lab'"', modify
    }
    
    // see the labels
    label list
    
    // store the labels in a .do file
    label save using "c:\temp\labels.do", replace
    
    // see the .do file
    doedit "c:\temp\labels.do"
    And then attach them to your data:

    Code:
    clear all
    
    // open the data
    use mydata.dta
    
    // define the labels
    do "c:\temp\labels.do"
    
    // find all non-string variables (you cannot label strings)
    ds , not(type string)
    
    // attach the labels
    foreach var of varlist `r(varlist)' {
        label values `var' `var'_lb
    }
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Dear Maarten, thank you, it looks like what I need. I will try it out next week and let you know if I run into any issues. The only thing is that I will also need it to label items with the same variable stem. But I guess this change in the very last step should do the job:
      foreach var of varlist `r(varlist)' {
          label values `var'* `var'_lb
      }
      Best wishes, Katrina


      • #4
        Dear Maarten,
        Thanks again for the code. Actually, my suggestion to introduce the wildcard does not solve the problem. The code does not run, because the wildcard brings string variables back in. Also, in principle the variable stem that was used to define the value label could be in the data set as a string variable, while other variables with the same stem are numeric. An example: v32202 is an open question about diverse gender (string), and v32202_1 is a numeric variable with different genders coded in categories.
        Do you have any idea how to solve this? Somehow I have to restrict the loop to numeric variables only. Something like: if type(`var'*)==numeric... But obviously, that command does not exist.
        Best, Katrina


        • #5
          I am just realizing that there is an earlier problem in your code. I need to assign the value label v32202_lb to the variables v32202_1a, v32202_1b, v32202_2a, and so on. As I stated originally, the value label is defined for a variable stem, but there are many variables with suffixes to which the same value label applies. With your code, I would need extra labels for all those variables. Since you get the local `var' from the data set, you are already including the suffix in the local. I need code of the kind: label values var* var_lb. Here var and var_lb should contain only the variable stem.


          • #6
            Without example data from your target data set (as opposed to the data set with the label information) I cannot test this. But I believe that the following modification of Maarten Buis' code in #2 will do what you ask.
            Code:
            clear all
            
            // open the data
            use mydata.dta
            
            // define the labels
            do "c:\temp\labels.do"
            
            label dir
            local labels `r(names)'
            local stems: subinstr local labels "_lb" "", all
            
            foreach s of local stems {
                ds `s'*, has(type numeric)
                label values `r(varlist)' `s'_lb
            }
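
            A minimal sketch for trying the loop out without the real target data. The toy variables below (a string v32202 plus numeric v32202_1a and v32202_1b) and the single label are made up for illustration. The capture guard is an added assumption, meant to skip stems that match no variable or only string variables.

            ```stata
            // toy data: hypothetical variables, not the poster's real data set
            clear
            set obs 3
            generate str10 v32202 = "free text"   // string variable on the bare stem
            generate byte v32202_1a = _n          // numeric items sharing the stem
            generate byte v32202_1b = 4 - _n

            // one label named after the stem, as labels.do would define it
            label define v32202_lb 1 "very good" 2 "rather good" 3 "rather bad"

            // recover the stems from the label names
            label dir
            local labels `r(names)'
            local stems : subinstr local labels "_lb" "", all

            foreach s of local stems {
                // ds exits with an error if nothing matches `s'*, so capture it
                capture ds `s'*, has(type numeric)
                if _rc == 0 & "`r(varlist)'" != "" {
                    label values `r(varlist)' `s'_lb
                }
            }

            // the string variable should be untouched, the numeric items labeled
            describe
            ```

            Note that the wildcard `s'* matches any variable name that starts with the stem, so stems that are prefixes of other stems would need a stricter pattern.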


            • #7
              Hi Clyde, yes, that looks very similar to what I have come up with now. Thank you!

              The data where the code should be applied is a bit inconsistent with the information that I am using about the value labels (that I showed above). And there are also mistakes. So, I need the code in a do-file to then be able to weed it out and make changes by hand. I solved it like this now:
              duplicates drop var, force
              capture file close myfile
              file open myfile using "\...", write replace
              levelsof var, local(levels)
              foreach lvl in `levels' {
                  file write myfile "ds `lvl'*, not(type string)" _n
                  file write myfile "label values \`r(varlist)' `lvl'_lb" _n
              }
              file close myfile
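
              One detail to watch when writing a do-file this way: the generated line has to contain `r(varlist)' in macro quotes, and the opening quote must be escaped as \` so that it is not expanded while the file is being written. A small sketch under assumptions: the two toy stems and the temporary file (standing in for the real output path) are made up for illustration.

              ```stata
              // toy label table with two hypothetical stems
              clear
              set obs 4
              generate str6 var = cond(_n <= 2, "v32200", "v32202")
              generate byte values = cond(mod(_n, 2), 1, 2)

              // a temporary file stands in for the real do-file path
              tempfile attach
              capture file close myfile
              file open myfile using "`attach'", write replace text

              levelsof var, local(levels)
              foreach lvl of local levels {
                  file write myfile "ds `lvl'*, not(type string)" _n
                  // \` keeps `r(varlist)' unexpanded until the generated file is run
                  file write myfile "label values \`r(varlist)' `lvl'_lb" _n
              }
              file close myfile

              // inspect the generated commands before editing them by hand
              type "`attach'"
              ```

              The local `lvl' is expanded at write time, while `r(varlist)' is left for the generated do-file to fill in after each ds call.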
              Last edited by Katrina Blindow; 16 Dec 2024, 09:52.
