  • preserve, restore and loops

    Dear Statalisters,

    I have a big dataset that I’m trying to split into multiple, much smaller datasets. For this I use a number of foreach loops in combination with the collapse and reshape commands.

    My code looks like this:
    Code:
    clear
    use country22
    
    *split dataset by setting (1=community; 2=health care)
    preserve
     foreach i of num 1/2 {
             keep if setting2 == `i'
             save Austria_setting2`i',
             restore, preserve
              }
    
    *split dataset by pathogen (in health care setting)
    clear  
    use Austria_setting22
    
    preserve
     foreach i of num 1/6 {
             keep if pathogen2 == `i'
             save AustriaHC_pathogen2`i', 
             restore, preserve
             } 
    
    ***  Create datasets for infection nodes (including resistant and susceptible infections)
    
    **E. coli
    clear 
    use AustriaHC_pathogen21 
    
    collapse (sum) incidence_split  proportion rate_pop ///
    incidence_pop cases_pop cases_split , ///
    by(age year population_pop population_split) cw 
    sa "EColie_Austria_both_v0.dta", replace 
    
    *collapse sex
    clear 
    use EColie_Austria_both_v0.dta
    collapse (sum) cases_split population_split (first) population_pop, ///
    by(age year) cw 
    sa "EColie_Austria_both_v1.dta", replace
    
    *compute transition probabilities
    clear
    use EColie_Austria_both_v1.dta
    gen transprob= (1-exp(-(cases_split/population_split)*1))
    drop cases_split population_split population_pop
    sa "EColie_Austria_both_v2.dta", replace
    
    *export dataset
    clear 
    use EColie_Austria_both_v2.dta
    reshape wide transprob, i(age) j(year)
    export delimited using "EColie_Austria_both.csv", nolabel replace
    I repeat this routine several times, which results in a relatively long do-file. The code above works fine when I run the foreach loops separately, that is, by selecting and running the first loop:
    Code:
    clear
    use country22
    
    *split dataset by setting (1=community; 2=health care)
    preserve
     foreach i of num 1/2 {
             keep if setting2 == `i'
             save Austria_setting2`i',
             restore, preserve
              }
    and then by selecting and running the second loop:
    Code:
    clear  
    use Austria_setting22
    
    preserve
     foreach i of num 1/6 {
             keep if pathogen2 == `i'
             save AustriaHC_pathogen2`i', 
             restore, preserve
             }
    But when I try to run the entire do-file, I get the following error message: “already preserved”

    I have a similar issue with the two foreach loops below:
    Code:
    clear 
    use AustriaHC_pathogen21 
    
    *compute total number of cases by age and pathogen  
      foreach x of var age  {
            bysort year age: egen cases_pathogen = sum(cases_split) 
            }
    
    *Split by resistance
    preserve
      foreach i of num 1/9 {
            keep if resistant2 == `i' 
            save AustriaHC_pathogen21_resistant2`i',
            restore, preserve
            }
    Here, when I try to run the second loop without the preserve command, the "cases_pathogen" variable calculated in the first loop appears only in the first of the 9 datasets created by the second loop. But again, when I run the loops separately (as described above), the code works fine.

    I'd like to be able to run the entire do-file without having to select and run different sections of the code.
    Thanks in advance for your help.

  • #2
    Put the preserve and restore commands inside each loop, e.g.:

    Code:
    clear
    use country22
    
    *split dataset by setting (1=community; 2=health care)
    *delete: preserve
     foreach i of num 1/2 {
        preserve
        keep if setting2 == `i'
        save Austria_setting2`i',
        restore
    }
    
    *split dataset by pathogen (in health care setting)
    clear  
    use Austria_setting22
    
    
     foreach i of num 1/6 {
        preserve
        keep if pathogen2 == `i'
        save AustriaHC_pathogen2`i',
        restore
    }
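    In the original, the last iteration of the first loop ends with restore, preserve, so the data are still preserved when the next preserve is reached just before the second loop; that is what triggers "already preserved". When the loops are run separately, each selection finishes before the next one starts, and Stata automatically restores preserved data when a do-file ends, so the conflict does not arise.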
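
    The same pattern should carry over to the resistance split in #1. This is a minimal sketch, assuming the variable and file names from that post. The replace option on save and the use of egen's total() in place of the older sum() are my additions, and the one-variable foreach wrapper around egen is dropped because it is not needed:

    Code:
    clear
    use AustriaHC_pathogen21
    
    * total cases by year and age, computed once before splitting
    bysort year age: egen cases_pathogen = total(cases_split)
    
    * split by resistance; each pass starts from the full dataset
    foreach i of numlist 1/9 {
        preserve
        keep if resistant2 == `i'
        save AustriaHC_pathogen21_resistant2`i', replace
        restore
    }
    Because cases_pathogen is created before the loop and every iteration restores the full data, the variable should appear in all nine files.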

    • #3
      Worked perfectly.
      Many thanks Jorrit.

      • #4
        I receive the error message "file c:\kr\a`fe'.dta not found" when I loop over different dataset files.
        The code looks correct to me. Is it really?

        global filepath "c:\kr\a"
        global filepath2 "e:\ern"
        global filename " "u23o" "u24o" "u25o" "u26o" "u27o" "
        global suffix ".dta"
        clear
        foreach fe in $filename {
        use "${filepath}`fe'${suffix} " ,clear
        keep q x gl
        sort q x
        egen t1 = tag(gl)
        keep if t1 == 1
        drop tag1
        save "${filepath2}`fe'${suffix} " , replace
        }


        • #5
          Stata is not expanding the `fe' local macro in your -save- statement. The mixing of global and local macros like this is always a potential source of confusion. And since global macros are, in any case, dangerous and should never be used when a local macro or some other device will do, the best solution to your problem is to get rid of all the globals. I've also changed the backslashes to forward slashes in the pathnames: they work with Stata even in a Windows environment and avoid the problems that come from the juxtaposition of \ and ` interfering with local macro interpretation. Finally, I eliminated global suffix because it really serves no purpose to hold just that.

          Code:
          local filepath "c:/kr/a"
          local filepath2 "e:/ern"
          local filename " "u23o" "u24o" "u25o" "u26o" "u27o" "
          
          clear
          foreach fe of local filename {
              use "`filepath'`fe'.dta", clear
              keep q x gl
              sort q x
              * keep one observation per distinct value of gl
              egen t1 = tag(gl)
              keep if t1 == 1
              * the original dropped tag1, which is never created; the tag variable is t1
              drop t1
              save "`filepath2'`fe'.dta", replace
          }
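
          The backslash problem can be seen in isolation with display. This is only a small sketch; the folder c:\kr\ and the name u23o are illustrative:

          Code:
          local fe "u23o"
          
          * backslash directly before the left quote: Stata reads \` as an escape,
          * so the macro fe is not expanded in this line
          display "c:\kr\`fe'.dta"
          
          * forward slash: the macro fe is expanded as usual
          display "c:/kr/`fe'.dta"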

          • #6
            Thanks so much!!

            • #7
              You can further simplify; there's no need to specify ".dta" either, as this is the default file type when using or saving. There's also no need for quotes around individual elements of the local.

              Code:
              local filepath "c:/kr/a"
              local filepath2 "e:/ern"
              local filename "u23o u24o u25o u26o u27o"
              clear
              foreach fe of local filename {
                  use "`filepath'`fe'", clear
                  keep q x gl
                  sort q x
                  * keep one observation per distinct value of gl
                  egen t1 = tag(gl)
                  keep if t1 == 1
                  * the tag variable is t1 (the earlier code dropped a nonexistent tag1)
                  drop t1
                  save "`filepath2'`fe'", replace
              }
              Last edited by Jorrit Gosens; 07 Mar 2018, 01:16.
