  • Selectively edit variable label in loop(s)

    Dear Statalist,
    I have a large survey panel dataset that includes various waves of the survey. Sadly, not all questions are asked in all waves of the survey. Hence, to keep track of which variable/question appears in which wave of the survey, I want to edit the variable label with the number of the waves the question appears in.

    So let's say, we have the question for the gender of a participant in waves 1, 2, and 5; and the question of how much the participant likes apples in waves 1, 3, and 6 of the survey.
    My dataset hence contains the variables w1_sex, w2_sex, and w5_sex as well as w1_apple, w3_apple, and w6_apple.
    All variables in the dataset are named in the following way: w`number_of_wave'_`abbreviation_of_question'
    What I intend to do is to rewrite the labels of all w*_sex variables from "Gender of participant" to "1 2 5 Gender of participant" and of all w*_apple labels from "Likes Apples (scale 1 not at all to 5 very much)" to "1 3 6 Likes Apples (scale 1 not at all to 5 very much)".

    I've been toying around a lot, but have not seen any promising results yet.
    Here is just a sample of how I tried to approach the problem.
    Code:
    foreach num of numlist 1(1)12 {
        gen prefix = ""
        foreach var of varlist w`num'_*{
                        capture confirm variable y`num'_*
                            if !_rc {
                                replace prefix = prefix + "`num'"
                            }    
                            else {
                                local varlabel : var label `var'
                                local newlabel "`prefix' `varlabel'"
                                label var `var' "`newlabel'"
                            }
        }
       drop prefix
    }
    Thank you very much for your help in advance. I've been a long-time reader of the forum but this is the first time I honestly could not find any help with my problem.
    Best,
    Fabian
    Last edited by Fabian Mierisch; 21 Jan 2022, 13:04.

  • #2
    This will do what you ask:
    Code:
    //  CREATE DEMONSTRATION DATA SET
    
    clear*
    
    set obs 10
    set seed 1234
    gen w1_sex = runiformint(0, 1)
    gen w2_sex = runiformint(0, 1)
    gen w5_sex = runiformint(0, 1)
    
    gen w1_apple = runiformint(1, 5)
    gen w3_apple = runiformint(1, 5)
    gen w6_apple = runiformint(1, 5)
    
    foreach v of varlist *_sex {
        label var `v' "Participant Sex"
    }
    
    foreach v of varlist *_apple {
        label var `v' "Likes Apples"
    }
    
    //  ILLUSTRATE APPROACH
    
    //  BEGIN BY CREATING A LIST OF ITEM TOPICS
    ds w*_*
    local vbles `r(varlist)'
    local topics
    foreach v of local vbles {
        local uscp = strpos("`v'", "_")
        local w = substr("`v'", `uscp'+1, .)
        local topics `topics' `w'
    }
    local topics: list uniq topics
    display `"`topics'"'
    
    foreach t of local topics {
    //  IDENTIFY WHICH WAVES THE TOPICS OCCUR IN
        local waves
        ds *_`t'
        local vbles `r(varlist)'
        local waves: subinstr local vbles "_`t'" "", all
        local waves: subinstr local waves "w" "", all
    //  AND MODIFY THE LABELS ACCORDINGLY
        foreach v of varlist `vbles' {
            label var `v' `"`waves' `:var label `v''"'
        }
    }
    Note: This code depends rigidly on the variable name pattern being w#_topic. If there are some variables that violate that pattern, this code will not deal with them.
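
One way to find out in advance whether any variable names break the w#_topic pattern is to list the offenders before running the relabeling loop. This is only a minimal sketch, assuming wave numbers are plain digits. The local macro `offenders` and the demo variables `id` and `wave_id` are made up for illustration; note that `wave_id` would be picked up by `ds w*_*` even though it is not a wave item.

```stata
// Hypothetical demo data: two well-formed names and two that break the pattern
clear
set obs 5
gen id = _n
gen w1_sex = 0
gen w12_apple = 3
gen wave_id = 1

// Collect every variable whose name is not w<digits>_<topic>
local offenders
foreach v of varlist _all {
    if !regexm("`v'", "^w[0-9]+_.+$") {
        local offenders `offenders' `v'
    }
}
display "Variables not matching w#_topic: `offenders'"
```

Anything listed there can be renamed or dropped from the variable list before the labels are modified.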

    That said, you have this survey data in wide layout, which will almost surely prove unworkable when it comes time to analyze the data, and probably even while you still work on cleaning it. I strongly recommend that you -reshape- it to long layout--which will make almost everything you try to do easier. If you do that, you will probably find that you don't even need to do anything similar to this task since there will only be a single instantiation of each topic (i.e. sex, liking apples, etc.) and in long layout, it is rarely if ever necessary to know ahead of time which waves have data for which items anyway.
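
As a rough sketch of that reshape on the demonstration variables, assuming the data contain (or can be given) a respondent identifier: the variable `id` and the new names `sex` and `apple` are assumptions for illustration, not part of the original data. The `@` in each stub tells -reshape- where the wave number sits inside the variable name.

```stata
// Rebuild the demonstration data, plus a hypothetical respondent identifier
clear
set obs 10
set seed 1234
gen id = _n
gen w1_sex = runiformint(0, 1)
gen w2_sex = runiformint(0, 1)
gen w5_sex = runiformint(0, 1)
gen w1_apple = runiformint(1, 5)
gen w3_apple = runiformint(1, 5)
gen w6_apple = runiformint(1, 5)

// Go from one row per respondent to one row per respondent and wave
reshape long w@_sex w@_apple, i(id) j(wave)
rename (w_sex w_apple) (sex apple)

// Count non-missing answers per wave; items not asked in a wave should be all missing
tabstat sex apple, by(wave) statistics(n)
```

In the long layout the wave number is an ordinary variable, so which items were asked in which wave can be read off the data rather than off the labels.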
    Last edited by Clyde Schechter; 21 Jan 2022, 14:28.



    • #3
      Dear Prof. Schechter,
      thank you for the code; it works like a charm.
      I learned quite a bit from it. Also, thank you for the advice to reshape the data. I am on it right now.
      Thanks again,
      Fabian
