Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Removing characters from end of variable name

    Hi alll,

    Would really appreciate some help with this, its so far completed stumped me!

    I am importing csv files with variable names over 32 characters. I want to create new variables that add to the variable names and therefore get an error message because the new names will be >32.

    I therefore want to open each csv file, delete say the last 5 characters of variable names with 32 characters and replace with a number. The end result being that a variable named:

    "reallyoverlylongvariablenameits" = "reallyoverlylongvariablen"
    "notsolong" = "notsolong"
    "anotherridiculouslylongvariable1" "anotherridiculouslylongvariab"

    This is what I have so far, but I really don't know how to get it working!

    Code:
    clear
    local myfilelist : dir . files"*.csv"
    foreach file of local myfilelist {
    insheet using `file'
    foreach var of varlist waistbeltattachknsbyfolder waistbelttopstitchkns waistbeltattachsn {
     local newname = substr(`var', ?,?) if *?
       rename `var' `x'
       }
    save "`outfile'", replace
    }
    One of the big problems is that I get an error that I cannot seem to resolve. Without the nested loop and taking just a subset of the data:
    Code:
    clear
    input str5 firm long id str10(waistbeltattachknsbyfolder waistbelttopstitchkns waistbeltattachsn waistbelttopstitchsn)
    "SMV"     . "100" "100" "80" "80"
    "7001"   78 ""    ""    ""   "50"
    "7001"  208 ""    ""    ""   ""  
    "7001"  227 ""    ""    ""   ""  
    "7001"  391 ""    ""    ""   "55"
    "7001"  406 ""    ""    ""   ""  
    "7001"  672 ""    ""    "50" ""  
    "7001"  817 ""    ""    ""   ""  
    "7001" 2067 ""    ""    "60" "70"
    end
    
    foreach var of varlist waistbeltattachknsbyfolder waistbelttopstitchkns waistbeltattachsn {
     local x = substr(`var', -10,10)
       rename `var' `x'
     }
    I get error: syntax error
    Syntax is
    rename oldname newname [, renumber[(#)] addnumber[(#)] sort ...]
    rename (oldnames) (newnames) [, renumber[(#)] addnumber[(#)] sort ...]
    rename oldnames , {upper|lower|proper}

  • #2
    Code:
    local x = substr(`var', -10,10)
    is legal and returns a substring of the data whenever the argument is the name of the string variable. You want the syntax to work on the name of the variable, which has to be different. (In your case the data wouldn't be legal as variable names. Also, Stata would extract from the value in the first observation only.)

    I think you want

    Code:
    local x = substr("`var'", -10,10)
    I note that renvars has a bundle of options for chopping bits off variable names.


    Code:
    . search renvars, historical
    
    Search of official help files, FAQs, Examples, SJs, and STBs
    
    SJ-5-4  dm88_1  . . . . . . . . . . . . . . . . .  Software update for renvars
            (help renvars if installed) . . . . . . . . .  N. J. Cox and J. Weesie
            Q4/05   SJ 5(4):607
            trimend() option added and help file updated
    
    STB-60  dm88  . . . . . . . .  Renaming variables, multiply and systematically
            (help renvars if installed) . . . . . . . . .  N. J. Cox and J. Weesie
            3/01    pp.4--6; STB Reprints Vol 10, pp.41--44
            renames variables by changing prefixes, postfixes, substrings,
            or as specified by a user supplied rule
    After a decade or so as contender, it was superseded in most respects by a revised rename, but some details are occasionally useful.
    Last edited by Nick Cox; 18 Oct 2016, 05:59.

    Comment


    • #3
      In addition, you can skip the local part within the loop and use the rename directly:
      Code:
       rename `var' `=substr("`var'", -10,10)'
      Or even better, skip all the loop part and use the rename group command:
      Code:
      rename waistbelt* *

      Comment


      • #4
        Hi Nick,

        Thanks so much for that, works perfectly for the single file. Thanks! My problem now is (..I think) with the importing csv files. It seems to bring in the first csv file, make changes but then has a problem with the output, replace part, I think. I get the error message invalid file specification, which seems odd because it is just a ',replace'

        Any ideas?

        Thanks!

        Comment


        • #5
          You're reporting a problem, speculating about what may be happening, but not showing us the code!

          Comment


          • #6
            Apologies Nick, still new to this. I am getting the problem for any simple task I am trying to do while importing csv files.

            Code:
            cd "/Users/Chris/Dropbox/Bangaldesh Chris/Skills matrices in excel/CB/Raw csv"
            clear
            local myfilelist : dir . files"*.csv"
            foreach file of local myfilelist {
            insheet using `file'
            gen abd = 1
            save "`outfile'", replace
            }
            From that I get the error message
            "local myfilelist : dir . files"*.csv"

            . foreach file of local myfilelist {
            2. insheet using `file'
            3. gen abd = 1
            4. save "`outfile'", replace
            5. }
            (69 vars, 53 obs)
            invalid file specification
            r(198);
            "

            Again, thanks,
            Chris

            Comment


            • #7
              Your report is unclear where the problem is. If the problem is in

              Code:
               
               local myfilelist : dir . files"*.csv"
              then try an extra space in that line.

              Code:
               
               local myfilelist : dir . files "*.csv"


              There is more trouble ahead.

              According to the code you show, you never define outfile, so Stata will interpret any reference to it as an empty string. That in turn is no use here.

              Comment


              • #8
                Hi Nick. Managed to resolve the problem with your help, thanks so much

                Comment


                • #9
                  Hey guys
                  I do have troubles with renaming multiple variables in one code. En example of my variables is:
                  step_0_hr_v2
                  step_0_sat_v2
                  step_1_hr_v2
                  step_1_sat_v2

                  How can i in one command remove the number 2 from the variables name?

                  Thank you
                  Massar

                  Comment


                  • #10
                    Code:
                    rename (*_v2) (*_v)
                    would work in many circumstances. If it doesn't work, it will be because other variable names you don't show us are a compllication.

                    Comment


                    • #11
                      Thank you very much Nick
                      that was exactly the clue.

                      Best Regards

                      Comment


                      • #12
                        Hey everybody! I have somewhat similar problem as Massar but a bit different. Maybe somebody here could help me out.

                        I have variables' names of type "var1_1, var1_2, var2_1, var2_2, ..."
                        Is there any way of creating global which will contain just "var1, var2, ...."?

                        Thanks,
                        Georgii

                        Comment


                        • #13
                          Code:
                          * Example generated by -dataex-. To install: ssc install dataex
                          clear
                          input float(make_99 var_2 studio_29e var_a make2_abc)
                          3 1 1 2 7
                          3 7 3 5 4
                          2 3 5 3 1
                          end
                          
                          local prefixes ""
                          foreach var of varlist*{
                              local prefix= cond(regexm("`var'", "([a-zA-Z0-9]+)[\_][a-zA-Z0-9]+"), regexs(1), "")
                              local prefixes `prefixes' `prefix'
                          }
                          local prefixes: list uniq prefixes
                          di "`prefixes'"
                          For some reason, if you run the code for the first time, you may get the error

                          a regular expression match function must be executed prior to retrieving a subexpression
                          regexm required
                          If this happens, run the code once again to get all prefixes.

                          Res.:

                          Code:
                          . di "`prefixes'"
                          make2 make var studio

                          Comment


                          • #14
                            #12

                            I think you want the stubs, that is, the parts of each variable name before the underscore

                            This is not a question about renaming as you could not rename var1_1 var1_2 to var1 as two variables can't have the same name.

                            This code creates a sandbox dataset and strips out the stubs. Georgii starts at local stubs

                            Code:
                            clear
                            set obs 1
                            foreach v in var1_1  var1_2 var2_1  var2_2 {
                                gen `v' = 42
                            }
                            
                            list 
                            
                            local stubs 
                            foreach v of var *_*  {
                                local this = substr("`v'", 1, strpos("`v'", "_") - 1)
                                local stubs `stubs' `this'
                            }
                            
                            local stubs : list uniq stubs 
                            di "`stubs'"


                            Using locals rather than globals is deliberate here. More at
                            https://journals.sagepub.com/doi/abs...urnalCode=stja

                            Comment


                            • #15
                              Andrew, Nick, thanks a lot for both of you, that's exactly what I need!
                              Best regards,
                              Georgii

                              Comment

                              Working...
                              X