  • How to make my code run faster

    Dear Members,
    I would like advice on how to make the code below run faster.
    The code bootstraps from imputed data, following Felix Bittmann's paper (https://www.preprints.org/manuscript/202401.0813/v1). My problem is that it takes too long to run (5 days).
    Any ideas on how to make it faster would be a great help.
    Regards,
    Fred

    ****Trying CI for imputed dataset

    ****calculating PAF and its CI
    *use data/ONLYpafdta,clear
    * Define a program to calculate PAF
    cd "/Users/fodiwuor/Library/CloudStorage/OneDrive-KemriWellcomeTrust/fodiwuor/studies/AASRF_projects"
    capture log close
    log using "/Users/fodiwuor/Library/CloudStorage/OneDrive-KemriWellcomeTrust/fodiwuor/studies/AASRF_projects/CARRIAGE AND SYTEMATICREVIEW/dofile/simulatePAF.txt",text replace
    use data/temp/ONLYpafdta,clear
    capture program drop calc_paf
    program define calc_paf, rclass
    use data/temp/ONLYpafdta,clear
    *version 17.0
    *syntax varlist [,if] [,in]
    *args ipdc hiv year agecat ipdoutc pop
    ***under5
    bsample, cluster(idcode) idcluster(newid)
    *local j=1
    **under5
    local prop_under5=0
    local prop_under5RR=0
    forval fred=1/20{
    nbreg ipd_count hiv_status year if agecat1==1 & _mi_m==`fred', exp(midyrpopdx) difficult
    *local prevRR_`fred'_1=_b[hiv_status]
    local prop_under5RR=`prop_under5RR'+_b[hiv_status]

    proportion hiv_status if agecat1==1 & !missing(ipd_outtcome) & _mi_m==`fred'
    *local prev_`fred'_1=r(table)[1,2]
    local prop_under5=`prop_under5'+r(table)[1,2]
    }
    local prop_under5=round(`prop_under5'/20,0.01)
    local prop_under5RR=round(exp(`prop_under5RR'/20),0.01)


    **5to14
    local prop_5to14=0
    local prop_5to14RR=0
    forval fred=1/20{
    nbreg ipd_count hiv_status year if agecat1==2 & _mi_m==`fred', exp(midyrpopdx) difficult
    *local prevRR_`fred'_2=_b[hiv_status]
    local prop_5to14RR=`prop_5to14RR'+_b[hiv_status]

    proportion hiv_status if agecat1==2 & !missing(ipd_outtcome) & _mi_m==`fred'
    *local prev_`fred'_2=r(table)[1,2]
    local prop_5to14=`prop_5to14'+r(table)[1,2]
    }
    local prop_5to14=round(`prop_5to14'/20,0.01)
    local prop_5to14RR=round(exp(`prop_5to14RR'/20),0.01)

    **15+
    local prop_5Above=0
    local prop_5AboveRR=0
    forval fred=1/20{
    nbreg ipd_count hiv_status year if agecat1==3 & _mi_m==`fred', exp(midyrpopdx) difficult
    *local prevRR_`fred'_3=_b[hiv_status]
    local prop_5AboveRR=`prop_5AboveRR'+_b[hiv_status]

    proportion hiv_status if agecat1==3 & !missing(ipd_outtcome) & _mi_m==`fred'
    *local prev_`fred'_3=r(table)[1,2]
    local prop_5Above=`prop_5Above'+r(table)[1,2]
    }
    local prop_5Above=round(`prop_5Above'/20,0.01)
    local prop_5AboveRR=round(exp(`prop_5AboveRR'/20),0.01)

    **proportion
    *local prop_under5=round((`prev_1_1'+`prev_2_1'+`prev_3_1'+`prev_4_1'+`prev_5_1'+`prev_6_1'+`prev_7_1'+`prev_8_1'+`prev_9_1'+`prev_10_1'+`prev_11_1'+`prev_12_1' ///
    *+`prev_13_1'+`prev_14_1'+`prev_15_1'+`prev_16_1'+`prev_17_1'+`prev_18_1'+`prev_19_1'+`prev_20_1')/20,0.01)
    *di "`prop_under5'"
    *ren prop_under5 under5prop

    *local prop_5to14=round((`prev_1_2'+`prev_2_2'+`prev_3_2'+`prev_4_2'+`prev_5_2'+`prev_6_2'+`prev_7_2'+`prev_8_2'+`prev_9_2'+`prev_10_2'+`prev_11_2'+`prev_12_2' ///
    *+`prev_13_2'+`prev_14_2'+`prev_15_2'+`prev_16_2'+`prev_17_2'+`prev_18_2'+`prev_19_2'+`prev_20_2')/20,0.01)
    *di "`prop_5to14'"

    *local prop_5Above=round((`prev_1_3'+`prev_2_3'+`prev_3_3'+`prev_4_3'+`prev_5_3'+`prev_6_3'+`prev_7_3'+`prev_8_3'+`prev_9_3'+`prev_10_3'+`prev_11_3'+`prev_12_3' ///
    *+`prev_13_3'+`prev_14_3'+`prev_15_3'+`prev_16_3'+`prev_17_3'+`prev_18_3'+`prev_19_3'+`prev_20_3')/20,0.01)
    *di "`prop_5Above'"
    ***relative risk
    *local prop_under5RR=round(exp((`prevRR_1_1'+`prevRR_2_1'+`prevRR_3_1'+`prevRR_4_1'+`prevRR_5_1'+`prevRR_6_1'+`prevRR_7_1'+`prevRR_8_1'+`prevRR_9_1'+`prevRR_10_1'+`prevRR_11_1'+`prevRR_12_1' ///
    *+`prevRR_13_1'+`prevRR_14_1'+`prevRR_15_1'+`prevRR_16_1'+`prevRR_17_1'+`prevRR_18_1'+`prevRR_19_1'+`prevRR_20_1')/20),0.01)
    *di "`prop_under5RR'"
    *ren prop_under5RR RR_under5

    *local prop_5to14RR=round(exp((`prevRR_1_2'+`prevRR_2_2'+`prevRR_3_2'+`prevRR_4_2'+`prevRR_5_2'+`prevRR_6_2'+`prevRR_7_2'+`prevRR_8_2'+`prevRR_9_2'+`prevRR_10_2'+`prevRR_11_2'+`prevRR_12_2' ///
    *+`prevRR_13_2'+`prevRR_14_2'+`prevRR_15_2'+`prevRR_16_2'+`prevRR_17_2'+`prevRR_18_2'+`prevRR_19_2'+`prevRR_20_2')/20),0.01)
    *di "`prop_5to14RR'"

    *local prop_5AboveRR=round(exp((`prevRR_1_3'+`prevRR_2_3'+`prevRR_3_3'+`prevRR_4_3'+`prevRR_5_3'+`prevRR_6_3'+`prevRR_7_3'+`prevRR_8_3'+`prevRR_9_3'+`prevRR_10_3'+`prevRR_11_3'+`prevRR_12_3' ///
    *+`prevRR_13_3'+`prevRR_14_3'+`prevRR_15_3'+`prevRR_16_3'+`prevRR_17_3'+`prevRR_18_3'+`prevRR_19_3'+`prevRR_20_3')/20),0.01)
    *di "`prop_5AboveRR'"


    // Calculate PAF
    local paf =round((`prop_under5'*(`prop_under5RR'-1))/(`prop_under5RR'),0.01)
    return scalar pafunde5 = `paf'

    ***5to14
    local pafx =round((`prop_5to14'*(`prop_5to14RR'-1))/(`prop_5to14RR'),0.01)
    return scalar pafunde5to14 =`pafx'

    ***15 and above
    local pafxx =round((`prop_5Above'*(`prop_5AboveRR'-1))/(`prop_5AboveRR'),0.01)
    return scalar pafunde15plus=`pafxx'

    ***proportion
    di "`prop_under5'"
    di "`prop_5to14'"
    di "`prop_5Above'"

    ***RR
    di "`prop_under5RR'"
    di "`prop_5to14RR'"
    di "`prop_5AboveRR'"
    end
    **boot strap
    *bootstrap r(pafunde5), reps(1000) nodrop: calc_paf
    ***I am just interested in the CI
    *simulate result=r(pafunde5) ,reps(1000) seed (123) dots:calc_paf
    *centile result , centile(2.5 97.5)


    local seeds 123 456 789 101112
    *parallel setclusters 4,statapath(/Applications/Stata/StataBE.app/Contents/MacOS/StataBE)
    parallel initialize 4,statapath(/Applications/Stata/StataBE.app/Contents/MacOS/StataBE)
    parallel sim,expr(pafunde5=r(pafunde5) pafunde5to14=r(pafunde5to14) pafunde15plus=r(pafunde15plus)) reps(1000) seed(`seeds') noisily trace saving("Data/res_imputeboot", replace): calc_paf
    parallel viewlog 4
    centile pafunde5 , centile(2.5 97.5)
    centile pafunde5to14, centile(2.5 97.5)
    centile pafunde15plus, centile(2.5 97.5)
    *centile result, centile(2.5 97.5)   (result exists only after the commented-out simulate call above)
    sum *, det

  • #2
    Some suggestions that could apply to any -bootstrap- job, in increasing order of difficulty. I think dividing the bootstrap into 20 separate Stata jobs will be the key. See https://www.nber.org/stata/efficient/ for other suggestions for dealing with big or long-running Stata jobs.
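
    A minimal sketch of the split into separate jobs, under some assumptions: the file names run_chunk.do, calc_paf.do, combine.do and res_chunk_*.dta are hypothetical, calc_paf.do is assumed to hold only the program definition from post #1, and 20 jobs of 50 replications each are used to reach the original 1000. Each job gets its own seed so that the jobs do not repeat the same resamples.

    ```stata
    * run_chunk.do (hypothetical name): one of 20 independent bootstrap jobs.
    * Launch once per job number from a terminal, e.g.
    *     stata -b do run_chunk.do 7
    * (the executable name and path depend on the installation).
    args job                          // job number passed on the command line
    run calc_paf.do                   // hypothetical file defining calc_paf
    set seed `=1000 + `job''          // a different seed for every job
    simulate pafunde5=r(pafunde5) pafunde5to14=r(pafunde5to14) ///
        pafunde15plus=r(pafunde15plus), reps(50) nodots: calc_paf
    save "res_chunk_`job'.dta", replace
    ```

    Once all jobs have finished, the result files can be pooled and the percentile intervals read off as in post #1:

    ```stata
    * combine.do (hypothetical name): pool the 20 result files.
    clear
    forvalues j = 1/20 {
        append using "res_chunk_`j'.dta"
    }
    centile pafunde5 pafunde5to14 pafunde15plus, centile(2.5 97.5)
    ```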
    Last edited by Daniel Feenberg; 24 Jun 2024, 09:01.

    • #3
      Beyond Daniel's useful suggestions, here are some other ideas: I'd try Stata's -profiler- command to see which parts of the code take the most time. Similar or even better information can be obtained by placing several -timer- commands within your code. Both of these can be useful in addition to or instead of -set rmsg on-.
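
      As a rough sketch of both approaches, assuming calc_paf from post #1 is already defined in memory (the timer number 1 is arbitrary). One call to calc_paf is one bootstrap replication, so its run time multiplied by 1000 approximates the total before any parallelisation.

      ```stata
      * Time a single replication with -timer-.
      timer clear
      timer on 1
      calc_paf
      timer off 1
      timer list 1

      * Profile the same call to see where the time goes.
      profiler clear
      profiler on
      calc_paf
      profiler off
      profiler report
      ```

      Extra -timer on-/-timer off- pairs with different numbers could be placed around the -nbreg- and -proportion- calls inside the program to separate their costs.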

      A suggestion regarding posting: You're asking for help with a large chunk of code, over 100 lines. In doing that, taking special care to make it easy to read your code would increase the chances that someone would want to try to help you. Using code delimiters is one thing that would help, but I'd also suggest using conventional indentation practices, avoiding lines that split on the screen, and removing lines that you have commented out with "*".
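
      To illustrate, here is one way the program from post #1 might look with conventional indentation, the commented-out lines removed, and the three age-group blocks folded into a single loop. It is a sketch, not a tested replacement: the loop indices, the helper locals (names, sum_b, sum_p, p, rr, name) are my own, -quietly- is added on the assumption that the estimation output is not needed in every replication, and the -di- lines are dropped. Variable names are copied as posted.

      ```stata
      capture program drop calc_paf
      program define calc_paf, rclass
          use data/temp/ONLYpafdta, clear
          bsample, cluster(idcode) idcluster(newid)

          * Names of the returned scalars, in age-group order (1, 2, 3).
          local names pafunde5 pafunde5to14 pafunde15plus

          forvalues a = 1/3 {
              local sum_b = 0     // running sum of log rate ratios
              local sum_p = 0     // running sum of proportions
              forvalues m = 1/20 {
                  quietly nbreg ipd_count hiv_status year ///
                      if agecat1 == `a' & _mi_m == `m', exp(midyrpopdx) difficult
                  local sum_b = `sum_b' + _b[hiv_status]

                  quietly proportion hiv_status ///
                      if agecat1 == `a' & !missing(ipd_outtcome) & _mi_m == `m'
                  local sum_p = `sum_p' + r(table)[1,2]
              }
              * Average over the 20 imputations, then apply the PAF formula as posted.
              local p  = round(`sum_p'/20, 0.01)
              local rr = round(exp(`sum_b'/20), 0.01)
              local name : word `a' of `names'
              return scalar `name' = round(`p'*(`rr' - 1)/`rr', 0.01)
          }
      end
      ```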
