
  • regular expression within subinstr()

    Hello,

    I'm trying to extract dates (in mm/dd/yyyy format) that are my variables' labels. I want to use them to rename my variables (which are unhelpfully called v39-v41 at the moment). However, I also want to get rid of the dd part of the date string, replace / with _, and add a string prefix to each.

    Here is some sample data.

    Code:
    clear
    input str3(v39 v40 v41)
    "" "" ""
    "" "" ""
    "" "" ""
    "" "" ""
    "" "" ""
    end
    label var v39 "05/16/2017" 
    label var v40 "11/08/2016" 
    label var v41 "08/15/2016"

    I want v39 to be renamed elect_2017_05, v40 to be elect_2016_11, and v41 to be elect_2016_08.

    So far, I can pull out the forward slashes and replace them with underscores with this code:

    Code:
    forvalues i = 39/41 {
        local newname subinstr("`:variable label v`i''", "/", "_",.)
        di `newname'
        rename v`i' elect_`=`newname''
    }
    Trying to take it a step further and remove everything between (and including) the forward slashes, I tried this:

    Code:
    local newname subinstr("`:variable label v39'", "/[0-9][0-9]/", "_",.)
    di `newname'
    but to no avail.

    If anyone has any ideas, they'd be gratefully received.

  • #2
    Is this what you want? (Best just to show us an example variable name as wanted.)

    Code:
    clear
    input str3(v39 v40 v41)
    "" "" ""
    "" "" ""
    "" "" ""
    "" "" ""
    "" "" ""
    end
    label var v39 "05/16/2017" 
    label var v40 "11/08/2016" 
    label var v41 "08/15/2016"
    
    
    forvalues i = 39/41 {
        local newname = subinstr("`:variable label v`i''", "/", "",.)
        rename v`i' elect_`newname'
    }
    
    d
    
    Contains data
      obs:             5                          
     vars:             3                          
     size:            45                          
    --------------------------------------------------------------------------------------
                  storage   display    value
    variable name   type    format     label      variable label
    --------------------------------------------------------------------------------------
    elect_05162017  str3    %9s                   05/16/2017
    elect_11082016  str3    %9s                   11/08/2016
    elect_08152016  str3    %9s                   08/15/2016
    --------------------------------------------------------------------------------------
    Sorted by: 
         Note: Dataset has changed since last saved.



    • #3
      Not quite (and I think I did show an example of the names I want).

      I want to rename v39 to elect_2017_05, v40 to elect_2016_11, and v41 to elect_2016_08.

      So just looking at v39 as an example: I want to take the variable label 05/16/2017, remove the forward slashes and everything in between (including the day), and then swap the year and the month. So it becomes 2017_05 (I'll also add the prefix elect_ to all the variables).

      My thinking was that a regular expression match, something along the lines of


      Code:
      local newname subinstr("`:variable label v39'", "/[0-9][0-9]/", "_",.)
      would work (at least to get rid of the day) but it doesn't.
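
      A minimal sketch of why that attempt leaves the label unchanged: subinstr() searches for its second argument as a literal string, so "/[0-9][0-9]/" is looked up character for character and never found in the label. The label value is typed in directly here purely for illustration.

      ```stata
      * subinstr() does literal matching: the pattern text does not occur in the label,
      * so the label comes back unchanged
      display subinstr("05/16/2017", "/[0-9][0-9]/", "_", .)
      * shows 05/16/2017

      * a literal substring that does occur is replaced
      display subinstr("05/16/2017", "/16/", "_", .)
      * shows 05_2017
      ```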



      • #4
        Just use regexr:

        Code:
        clear
        input str3(v39 v40 v41)
        "" "" ""
        "" "" ""
        "" "" ""
        "" "" ""
        "" "" ""
        end
        label var v39 "05/16/2017"
        label var v40 "11/08/2016"
        label var v41 "08/15/2016"
        
        forvalues i = 39/41{
            local newname regexr("`:variable label v`i''", "/[0-9][0-9]/", "_")
            di `newname'
            rename v`i' elect_`=`newname''
        }
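
        Note that the regexr() call above removes the day but keeps the month first, giving names such as elect_05_2017. A minimal sketch of one way to also swap month and year, assuming every label has exactly the form mm/dd/yyyy: capture the two parts with regexm() and reassemble them with regexs(). The local macro names lbl and newname are illustrative only.

        ```stata
        clear
        input str3(v39 v40 v41)
        "" "" ""
        end
        label var v39 "05/16/2017"
        label var v40 "11/08/2016"
        label var v41 "08/15/2016"

        forvalues i = 39/41 {
            local lbl : variable label v`i'
            * group 1 captures the month, group 2 the year; the day is matched but not captured
            if regexm("`lbl'", "^([0-9][0-9])/[0-9][0-9]/([0-9][0-9][0-9][0-9])$") {
                local newname = regexs(2) + "_" + regexs(1)
                rename v`i' elect_`newname'
            }
        }

        describe
        ```

        Variables whose labels do not match the pattern are left with their original names, because the rename sits inside the if block.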



        • #5
          Oh I see. You did say that.


          Code:
          clear
          input str3(v39 v40 v41)
          "" "" ""
          "" "" ""
          "" "" ""
          "" "" ""
          "" "" ""
          end
          label var v39 "05/16/2017"
          label var v40 "11/08/2016"
          label var v41 "08/15/2016"
          
          forvalues i = 39/41 {
              local newname = subinstr("`:variable label v`i''", "/", " ",.)
              tokenize `newname'
              rename v`i' elect_`3'_`1'
          }
          
          d
          
          Contains data
            obs:             5                          
           vars:             3                          
           size:            45                          
          --------------------------------------------------------------------------------------
                        storage   display    value
          variable name   type    format     label      variable label
          --------------------------------------------------------------------------------------
          elect_2017_05   str3    %9s                   05/16/2017
          elect_2016_11   str3    %9s                   11/08/2016
          elect_2016_08   str3    %9s                   08/15/2016
          --------------------------------------------------------------------------------------
          Sorted by:
               Note: Dataset has changed since last saved.
          Last edited by Nick Cox; 21 Feb 2018, 14:59.



          • #6
            That's a neat trick with tokenize.



            • #7
              Originally posted by Dave Airey
              That's a neat trick with tokenize.
              Agreed! Thanks a lot Nick



              • #8
                This works too:

                Code:
                forvalues i = 39/41 {
                    tokenize `:variable label v`i'', parse(/)
                    rename v`i' elect_`5'_`1'
                }
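
                Another possible route, sketched here on the assumption that every label is a valid date in mm/dd/yyyy order, is to convert the label to a Stata daily date with date() and build the name from year() and month(). The macro names lbl, d and newname are illustrative only.

                ```stata
                clear
                input str3(v39 v40 v41)
                "" "" ""
                end
                label var v39 "05/16/2017"
                label var v40 "11/08/2016"
                label var v41 "08/15/2016"

                forvalues i = 39/41 {
                    local lbl : variable label v`i'
                    * convert the label text to a daily date (month, day, year order)
                    local d = date("`lbl'", "MDY")
                    * zero-pad the month so that May becomes 05 rather than 5
                    local newname = string(year(`d')) + "_" + string(month(`d'), "%02.0f")
                    rename v`i' elect_`newname'
                }

                describe
                ```

                If a label is not a valid date, date() returns missing and the rename fails with an error, which at least flags the problem label.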
