  • Stata MP low CPU Usage

    I have access to Stata/MP on an 8-core system with 16 GB of RAM. I am running the code below to clean up name strings (this is just a snippet). Task Manager shows it using only one core (total CPU usage at 12%), and it is taking forever to run: each iteration takes more than an hour. The dataset is 6 GB. I checked creturn list, and it reports 8 processors.

    Are there certain commands where Stata will only use one core? Is there anything I can do to make Stata use more cores? Drastically changing the code or splitting up the datasets are last resorts in my case.

    Code:
    gen no1=regexr(name," [A-Z] "," ")
    foreach v of var no1{
        local more 1
        while `more' {
            clonevar old2=no1
            replace no1=regexr(no1," [A-Z]$"," ")
            replace no1=regexr(no1," [A-Z]\. "," ")
            replace no1=regexr(no1," [A-Z]\.$"," ")
            replace no1=regexr(no1," [A-Z][A-Z]\.$"," ")
            replace no1=trim(no1)
            replace no1=regexr(no1," [A-Z]\.[A-Z]\.$"," ")
            replace no1=subinstr(no1,"~","",.)
            replace no1=regexr(no1," Jr$"," ")
            replace no1=regexr(no1," Esq\.$"," ")
            replace no1=trim(no1)
            replace no1=regexr(no1," [A-Z]\.[A-Z]\.$"," ")
            count if old2 !=no1
            local more = r(N)
            drop old2
        }
    }

  • #2
    Normally -replace- is fully parallelised (see the Stata/MP performance report). Have you tried running the code with the profiler on to see which step is slowing the program down?

    PS: There is a -parallel- package that might help out here.
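
    To make the profiling suggestion concrete, here is a minimal sketch using Stata's built-in -timer- and -profiler- commands. It assumes the variable no1 from the original post already exists, and it times only two of the steps for illustration. Note that -profiler- reports time by program, so for a do-file made up of built-in commands the -timer- output is likely to be the more informative of the two.

    ```stata
    * Check how many processors this Stata/MP session is set to use.
    display c(processors)

    * Time individual steps with numbered timers (assumes no1 exists).
    timer clear
    timer on 1
    replace no1 = regexr(no1, " [A-Z]$", " ")
    timer off 1

    timer on 2
    replace no1 = trim(no1)
    timer off 2

    * Show elapsed seconds for each numbered timer.
    timer list

    * Optionally, profile time spent inside programs over the same steps.
    profiler clear
    profiler on
    replace no1 = regexr(no1, " [A-Z]$", " ")
    profiler off
    profiler report
    ```

    Running this on a small subsample first (for example, after -keep in 1/100000- on a copy of the data) should show which steps dominate the run time without waiting an hour per iteration.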

    • #3
      I used parallel and all cores are being utilized now.
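
      The post does not say how -parallel- was called, so the following is only a sketch of one way it might be done. It assumes the user-written -parallel- package is installed from SSC, that the cleaning loop is wrapped in a program (the name cleannames is hypothetical), and that the setup subcommand is -parallel setclusters-; the subcommand name can differ between versions, so check -help parallel-. Only a few of the original replacements are repeated here to keep it short.

      ```stata
      * One-time install of the user-written package (uncomment if needed).
      * ssc install parallel

      * Hypothetical wrapper around the cleaning loop from the first post.
      capture program drop cleannames
      program define cleannames
          gen no1 = regexr(name, " [A-Z] ", " ")
          local more 1
          while `more' {
              clonevar old2 = no1
              quietly replace no1 = regexr(no1, " [A-Z]$", " ")
              quietly replace no1 = regexr(no1, " [A-Z]\.$", " ")
              quietly replace no1 = regexr(no1, " Jr$", " ")
              quietly replace no1 = trim(no1)
              * Keep looping until a pass changes no observations.
              quietly count if old2 != no1
              local more = r(N)
              drop old2
          }
      end

      * Split the data into 8 chunks, run the program on each chunk in
      * its own Stata process, then append the results back together.
      parallel setclusters 8
      parallel, programs(cleannames): cleannames
      ```

      Splitting is safe here because every replacement works on one observation at a time, so each chunk can reach its own stopping point independently of the others.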
