  • Check if a group of variables exists before running a series of commands

    Hi, I have two more questions for you all. I have a number of datasets (about 250) and I want to create a single .do file storing a series of commands that can be run on all the datasets. The first question is how I can tell Stata: "For all datasets in this folder, run the following commands"? Probably I need to use a loop. Then, since I don't have exactly the same variables in all the datasets (about 90% of them are the same, but each dataset has some variables missing or some extra variables), my second question is how to tell Stata: "Check if this group of variables exists. If it exists, run the following commands"?

    For example, let's suppose that I'm interested in three groups of variables: demographics (all these variables start with d*), income (all these variables start with y*) and assets (all these variables start with a*), but the group "assets" is not present in all the datasets. How can I tell Stata: "Check if the variable group "assets" exists, and if it exists then run a certain command" (for example, replace all values with 0 if age is <18)?

    Thank you!!

  • #2
    Consider this experiment:

    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . d a*
    variable a* not found
    r(111);
    
    . gen answer = 42
    
    . d a*
    
                  storage   display    value
    variable name   type    format     label      variable label
    --------------------------------------------------------------------------------------------------------
    answer          float   %9.0g                
    
    . di _rc
    0
    In the auto data as well, there are no variables whose names begin with a unless we add one ourselves. That being so, a code sketch for you is


    Code:
    capture describe a*
    
    if _rc == 0 {
         <stuff you want to do>
    }
    So running the code is conditional on a zero return code, because that is evidence that variables matching a* exist.
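
    If the aim is then to act on every variable in the group, one possible variant (a sketch only, for use in a do-file) is to let unab both test for the group and collect its members. Here the generated variable answer and the condition foreign == 1 are stand-ins for your assets variables and for age < 18.

    Code:
    sysuse auto, clear
    generate answer = 42              // stand-in so that a* matches something
    
    capture unab avars : a*           // nonzero _rc if no variable matches a*
    if _rc == 0 {
        foreach v of local avars {
            replace `v' = 0 if foreign == 1    // stand-in for: age < 18
        }
    }
    Because of capture, a dataset without any a* variables skips the block rather than stopping the do-file.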



    • #3
      Ok, thank you!
      Do you also know how I can say to Stata "For each dataset in this folder, run the following commands"?



      • #4
        Yes, I know that.

        Code:
        local files : dir . files "*.dta"
        
        foreach of of local files { 
              use `f', clear 
        
              ....
        }



        • #5
          Sorry, I don't understand the first raw...
          Furthermore, I have an additional problem: my datasets are ".txt" files. Can I use the same code you wrote, replacing ".dta" with ".txt"?

          Thank you!



          • #6
            I don't understand in turn "first raw". If you mean the command starting local, then

            Code:
            help macro 
            help extended fcn
            lead to documentation of the syntax.

            use depends on the dataset being in Stata's own file format. An extension of .txt isn't fatal if the dataset has the right file format, but usually .txt would imply a need for a different input command, and we need information on the format actually used to say what it should be.



            • #7
              Let me comment on post #4.

              It seems to me there's a small typographical error of the sort I usually find in my posts:
              Code:
              foreach of of local files {
              was intended to read
              Code:
              foreach f of local files {
              And to understand
              Code:
              local files : dir . files "*.dta"
              you start with the output of help local
              Code:
              Syntax
              
                      global   mname   [=exp | :macro_fcn | "[string]" | `"[string]"']
              
                      local    lclname [=exp | :macro_fcn | "[string]" | `"[string]"']
              ...
              and click on ":macro_fcn" (the leading colon is the clue that a macro function is what you are seeing), which takes you further down in the output to the beginning of the section on macro functions. Scroll down through that and you'll pass a brief mention of the dir macro function

              Code:
                  Macro functions for filenames and file paths
              
                      adosubdir ["]filename["]
              
                      dir ["]dirname["] {files|dirs|other} ["]pattern["] [, nofail respectcase]
              
                      sysdir [ STATA | BASE | SITE | PLUS | PERSONAL | dirname ]
              and continuing to scroll you'll see the start of the remarks section
              Code:
              Remarks
              
                  Remarks are presented under the following headings:
              
                      Macro function for extracting program properties
                      Macro function for extracting program results class
                      Macro functions for extracting data attributes
                      Macro function for naming variables
                      Macro functions for filenames and file paths
                      Macro function for accessing operating-system parameters
                      Macro functions for names of stored results
                      Macro function for formatting results
                      Macro function for manipulating lists
                      Macro functions related to matrices
                      Macro function related to time-series operators
                      Macro function for copying a macro
                      Macro functions for parsing
              Clicking on "Macro functions for filenames and file paths" takes you to the deeper explanation of the dir macro function.
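
              To see what the dir macro function returns before building a loop on it, a small sketch along these lines may help. The macro names txtfiles, n and subdirs are my own choices for illustration, and the pattern "*.txt" is only an example.

              Code:
              * list the .txt files in the current working directory
              local txtfiles : dir . files "*.txt"
              local n : word count `txtfiles'
              display "`n' file(s) found in `c(pwd)'"
              display `"`txtfiles'"'
              
              * the same function can list subdirectories instead of files
              local subdirs : dir . dirs "*"
              display `"`subdirs'"'
              The compound double quotes in the display commands are there because each name in the returned list is itself enclosed in double quotes.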



              • #8
                William Lisowski is indeed quite correct on my typo (sorry about that) and helpfully expands on the rest of my post.



                • #9
                  Ok. Thus, let's assume I have 5 "*.txt" datasets in "C:\User\Desktop\Folder". Could the code be as follows?

                  dir "C:\User\Desktop\Folder"
                  local files : dir . files "*.txt"
                  foreach f of local files {
                  import delimited ...
                  }

                  Thank you again!



                  • #10
                    Not quite. Your ": dir" macro function is looking in Stata's current working directory, which may not be the directory whose path is shown in the dir command. Below are two examples; note that I replaced the dir command with macro list files to precisely display the list of selected files in the log - which made it clear to me that the directory path is not included in the local macro entries.
                    Code:
                    local files : dir "C:\User\Desktop\Folder" files "*.txt"
                    macro list _files
                    foreach f of local files {
                        import delimited "C:\User\Desktop\Folder/`f'"...
                    }
                    or alternatively
                    Code:
                    cd "C:\User\Desktop\Folder"
                    local files : dir . files "*.txt"
                    macro list _files
                    foreach f of local files {
                        import delimited "`f'"...
                    }
                    Note particularly that in the first example, it is imperative that you use the forward slash, rather than backslash, after "Folder". The backslash character
                    Code:
                    \
                    has meaning to Stata: it will prevent the interpretation of any following character that has a special meaning to Stata, in particular
                    Code:
                    `
                    will not be interpreted as indicating a reference to a macro. But the forward slash can be used in file paths everywhere in Windows except on command lines (because there, it had been reserved for indicating options in MS-DOS before there were subdirectories).
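
                    One way to sidestep the backslash problem altogether, offered as a sketch rather than the only approach, is to hold the directory in a local macro written with forward slashes and build every path from it. The macro name folder is my own choice, and the display line stands in for whatever import command you need.

                    Code:
                    local folder "C:/User/Desktop/Folder"
                    local files : dir "`folder'" files "*.txt"
                    foreach f of local files {
                        * every path is assembled with / so `f' is always expanded
                        display "`folder'/`f'"
                    }
                    With this layout, moving the project means changing only the first line.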



                    • #11
                      Ok, it works! And what do I have to do to save the file:

                      1. using the same file name, but different extension (from .txt to .dta)
                      2. using the same file name plus a different final part (e.g. original name AAA.txt new name AAA_v2.dta and AAA_v2.txt)?

                      Thank you again!



                      • #12

                        Code:
                        local newfile : subinstr local f ".txt" ".dta", all 
                        save "`newfile_v2"



                        • #13
                          Ok... Actually, I need to do these different steps:

                          1. Import the .txt
                          2. Save the file as .dta with the SAME name
                          3. Modify the file
                          (this doesn't matter now)
                          4. Save the modified file as .dta with the original name+_v2 (obviously, _v2 is an example to identify the "version 2" of the file)
                          5. Save the modified file as .txt, with the same name as the modified .dta

                          Therefore:

                          local files : dir "C:\Users\Desktop\Folder" files "*.txt"
                          macro list _files
                          foreach f of local files {
                          import delimited "C:\Users\Desktop\Folder/`f'"
                          local f : subinstr local f ".txt" ".dta", all
                          save "`f"

                          (modifications)

                          save "`f_v2"
                          local f : subinstr local f ".dta" ".txt", all
                          save "`f_v2"
                          }

                          ...probably it is not correct...



                          • #14
                            Certainly not correct.

                            First of all, to substitute the value of the local macro f into a command, you use
                            Code:
                            "`f"
                            but you need
                            Code:
                            "`f'"
                            Beyond that lie other errors. The following may well do what you want.
                            Code:
                            local files : dir "C:\Users\Desktop\Folder" files "*.txt"
                            macro list _files
                            foreach f of local files {
                                import delimited "C:\Users\Desktop\Folder/`f'"
                                local f : subinstr local f ".txt" ".dta", all
                                save "`f'"
                            
                                * (modifications)
                            
                                local f : subinstr local f ".dta" "_v2.dta", all
                                save "`f'"
                                local f : subinstr local f ".dta" ".txt", all 
                                export delimited "`f'"
                            }
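
                            A possible variant of the same loop, again only a sketch, keeps the file stem in its own macro and writes everything back into the source folder. The macro names folder and stem are my own, and the clear and replace options are an assumption that you are happy to overwrite the data in memory and any output files from an earlier run.

                            Code:
                            local folder "C:/Users/Desktop/Folder"
                            local files : dir "`folder'" files "*.txt"
                            foreach f of local files {
                                * AAA.txt -> AAA
                                local stem : subinstr local f ".txt" "", all
                                import delimited "`folder'/`f'", clear
                                save "`folder'/`stem'.dta", replace
                            
                                * (modifications)
                            
                                save "`folder'/`stem'_v2.dta", replace
                                export delimited "`folder'/`stem'_v2.txt", replace
                            }
                            Be aware that on a second run the pattern "*.txt" would also pick up the _v2.txt files written by the first run, so it may be safer to export to a different folder or use a narrower pattern.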



                            • #15
                              Sorry; there was an unmatched quote in #12
