Hello everyone,
I posted a closely related question several months ago. I have a Mata program that performs basic but heavy linear algebra tasks. It checks whether the dataset has at least 10,000 observations and, if so, splits it into subgroups of several thousand observations to keep matrix sizes small and limit memory allocation failures. Once the dataset has been split, the program does the required computation. The program is part of a larger ado-file. I am developing with Stata/MP 13, and my package works perfectly well on that version even with large datasets (100,000 obs). However, when I tested the program on Stata/SE 12, the external memory allocation failed. I did not use any pointers in the code below. Would using pointers allow me to overcome these memory allocation issues? If so, would it slow down my program significantly?
Thanks in advance,
Yannick Guyonvarch
Code:
/*CIC analytic variance*/
version 12
mata:
mata set matastrict on
void intermediary_CIC_variance_step(string scalar one_over_fd, ///
    string scalar y, string scalar qd, ///
    string scalar isId10, ///
    string scalar n, string scalar nd10, ///
    string scalar touse, string scalar build3_d, ///
    string scalar build4_d, string scalar split_grid)
{
    real matrix inv_fd,Y,Yd10,Qd,split_matd,split_matd10,res,subsetY,subsetQd,v1,v2,v3,v4,v5,v6,v7,temp_res
    real scalar N,Nd10,counterd10,counterd,floord10,floord, ///
        i,j,split_gr
    real scalar coucou
    coucou=1
    inv_fd=Y=Yd10=Qd=split_matd=split_matd10=res=temp_res=.
    st_view(inv_fd,.,one_over_fd,isId10)
    st_view(Y,.,y,touse)
    st_view(Yd10,.,y,isId10)
    st_view(Qd,.,qd,touse)
    st_view(res,.,st_addvar(("double","double"),(build3_d,build4_d)),touse)
    Nd10=st_numscalar(nd10)
    N=st_numscalar(n)
    split_gr=st_numscalar(split_grid)
    res[.,.]=J(N,2,0)
    if(Nd10>=10000 & N>=10000) {
        floord10=floor(Nd10/split_gr)
        if(floord10*split_gr>Nd10) {
            counterd10=floord10
        }
        else if(floord10*split_gr==Nd10) {
            counterd10=floord10
        }
        else {
            counterd10=floord10+1
        }
        split_matd10=J(1,counterd10,0)
        floord=floor(N/split_gr)
        if(floord*split_gr>N) {
            counterd=floord
        }
        else if(floord*split_gr==N) {
            counterd=floord
        }
        else {
            counterd=floord+1
        }
        split_matd=J(1,counterd,0)
        for(i=1;i<=counterd10;i++) {
            if(i!=counterd10){
                split_matd10[1,i]=split_gr*i
            }
            else {
                split_matd10[1,i]=Nd10
            }
        }
        for(i=1;i<=counterd;i++) {
            if(i!=counterd){
                split_matd[1,i]=split_gr*i
            }
            else {
                split_matd[1,i]=N
            }
        }
    }
    else if (Nd10<10000 & N>=10000) {
        floord=floor(N/split_gr)
        if(floord*split_gr>N) {
            counterd=floord
        }
        else if(floord*split_gr==N) {
            counterd=floord
        }
        else {
            counterd=floord+1
        }
        split_matd=J(1,counterd,0)
        counterd10=1
        split_matd10=J(1,1,Nd10)
        for(i=1;i<=counterd;i++) {
            if(i!=counterd){
                split_matd[1,i]=split_gr*i
            }
            else {
                split_matd[1,i]=N
            }
        }
    }
    else {
        counterd=1
        counterd10=1
        split_matd10=J(1,1,Nd10)
        split_matd=J(1,1,N)
    }
    for(i=1;i<=counterd;i++) {
        subsetY=subsetQd=v1=.
        if(i==1) {
            subsetY=Y[(1::split_matd[1,i]),1]
            subsetQd=Qd[(1::split_matd[1,i]),1]
            v1=J(split_matd[1,i],1,1)
        }
        else {
            subsetY=Y[(split_matd[1,i-1]+1::split_matd[1,i]),1]
            subsetQd=Qd[(split_matd[1,i-1]+1::split_matd[1,i]),1]
            v1=J(split_matd[1,i]-split_matd[1,i-1],1,1)
        }
        for(j=1;j<=counterd10;j++) {
            v2=v3=v4=v5=v6=v7=.
            if(j==1) {
                v2=subsetY#J(1,split_matd10[1,j],1)
                v3=v1#Yd10[(1::split_matd10[1,j]),1]'
                v4=v1#inv_fd[(1::split_matd10[1,j]),1]'
                v5=subsetQd#J(1,split_matd10[1,j],1)
            }
            else {
                v2=subsetY#J(1,split_matd10[1,j]-split_matd10[1,j-1],1)
                v3=v1#Yd10[(split_matd10[1,j-1]+1::split_matd10[1,j]),1]'
                v4=v1#inv_fd[(split_matd10[1,j-1]+1::split_matd10[1,j]),1]'
                v5=subsetQd#J(1,split_matd10[1,j]-split_matd10[1,j-1],1)
            }
            v6=((v3-v2):>=0)
            v7=((v3-v5):>=0)
            if(i==1) {
                res[(1::split_matd[1,i]),1]=res[(1::split_matd[1,i]),1]+rowsum(v7:*v4)
                res[(1::split_matd[1,i]),2]=res[(1::split_matd[1,i]),2]+rowsum(v6:*v4)
            }
            else {
                res[(split_matd[1,i-1]+1::split_matd[1,i]),1]=res[(split_matd[1,i-1]+1::split_matd[1,i]),1]+rowsum(v7:*v4)
                res[(split_matd[1,i-1]+1::split_matd[1,i]),2]=res[(split_matd[1,i-1]+1::split_matd[1,i]),2]+rowsum(v6:*v4)
            }
        }
    }
    res[.,.]=res:/Nd10
    mean(res)
}
end
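For reference, here is a minimal sketch of the same block-wise accumulation written without the Kronecker products, in case it helps frame the memory question. It relies on Mata's colon operators accepting a row vector on one side and a column vector on the other, so each block pair needs a single 0/1 matrix instead of v2 to v7. The function name `chunked_count` and its arguments are hypothetical and not part of my package; I have not checked whether this avoids the allocation failure on Stata 12 SE.

```mata
mata:
// Hypothetical helper (illustration only, not part of the package).
// For each y[i], returns sum_j w[j]*(ref[j] >= y[i]), computed block by block
// so that at most one bs x bs matrix is held in memory at a time.
real colvector chunked_count(real colvector y, real colvector ref, real colvector w, real scalar bs)
{
    real scalar    n, m, b, c, lo, hi, clo, chi
    real colvector out

    n   = rows(y)
    m   = rows(ref)
    out = J(n, 1, 0)
    for (b=1; b<=ceil(n/bs); b++) {
        lo = (b-1)*bs + 1                 // first row of this block of y
        hi = min((b*bs, n))               // last row of this block of y
        for (c=1; c<=ceil(m/bs); c++) {
            clo = (c-1)*bs + 1            // first row of this block of ref
            chi = min((c*bs, m))          // last row of this block of ref
            // (row vector :>= column vector) gives the block's 0/1 indicator
            // matrix; multiplying by the weights replaces rowsum(v6:*v4)
            out[|lo \ hi|] = out[|lo \ hi|] + (ref[|clo \ chi|]' :>= y[|lo \ hi|]) * w[|clo \ chi|]
        }
    }
    return(out)
}
end
```

Under these assumptions, `chunked_count(Y, Yd10, inv_fd, split_gr)` would correspond to the second column of `res` before the division by `Nd10`, and passing `Qd` in place of `Y` would correspond to the first column. Note that `ceil(n/bs)` gives the same block count as the `floor`-based if/else chain in my code.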