  • Mata - large matrix - error r(3900)

    Hello everyone,

    I am trying to run a robustness check on a weighted index for 11 countries. The framework is composed of 5 dimensions and 14 indicators. I created nested loops over specific ranges of weights (for dimensions and indicators) and generated 102,400 models. Each model has 11 observations, so the total is about 1.13 million observations. (I can attach the .dta file if needed.)

    What I intend to do is:
    1. Rank countries by Model
      Code:
       egen Rank= rank(MPI), by(Model)
    2. Set the desired matrix size:
      Code:
      	// create a 102400x102400 matrix
      	set more off
      	mata: st_matrix("Ed", J(102400,102400,.))
    3. Compute the Euclidean distance between each pair of models and store it (the square root of the summed squared differences) in the corresponding entry of the 102400x102400 matrix. To this end, I generate temporary variables for each pair of models in a nested loop (loop code omitted for simplicity; if more details are needed, please refer to the attached .do file).
      Code:
      if (`a' != `b') {
      	gen temp = (Model`a' - Model`b')^2   // squared differences
      	summarize temp
      	mat Ed[`a',`b'] = r(sum)^0.5         // Euclidean distance
      	drop temp
      }
    4. Take the row sums of the matrix to get, for each model, a measure of its total distance from the remaining 102,399 models. This reduces the 102400x102400 matrix to a column of 102,400 entries.
      Code:
      	mata : st_matrix("S_Ed", rowsum(st_matrix("Ed")))
      	mat list S_Ed // this is the sum of rows of Ed
    5. Finally, select the model with the smallest row sum.
      Code:
      	mata : st_matrix("Min_Ed", colmin(st_matrix("S_Ed"))) // get the minimum of Eds
      	mat list Min_Ed
    This worked perfectly for a smaller matrix. But when I try to create a 102400x102400 matrix at the beginning of the .do file:
    Code:
    mata: st_matrix("Ed", J(102400,102400,.))
    I am getting an r(3900) error:
    Code:
                 J():  3900  unable to allocate real <tmp>[102400,102400]
             <istmt>:     -  function returned error
    Please note that I am using Windows 10 on a multi-core processor machine (64-bit operating system, x64 based processor) with 8.00 GB of installed RAM (7.88 GB useable, when I checked).
    Also, I am using Stata/MP 15.0 for Windows (64-bit x86-64)


    Finally, when I run memory after the error, I get the following:
    Memory usage
                                    used      allocated
    data                     175,718,400    268,435,456
    strLs                              0              0
    data & strLs             175,718,400    268,435,456
    data & strLs             175,718,400    268,435,456
    var. names, %fmts,             2,309         35,021
    overhead                   3,178,544      3,179,288
    Stata matrices                     0              0
    ado-files                     11,451         11,451
    stored results                     0              0
    Mata matrices                  1,904          1,904
    Mata functions                34,272         34,272
    set maxvar usage           5,281,738      5,281,738
    other                         37,549         37,549
    grand total              184,232,883    277,016,679
    Any advice on how to overcome this r(3900) error and get going with my code?

    Thx!
    Last edited by Sama Sleiman; 07 Sep 2020, 01:15.

  • #2
    I do not know why you are creating your matrix as a Stata matrix rather than a Mata matrix, but the output of help limits tells us that Stata matrices are limited to 65,534x65,534, while the output of help [M-1] limits tells us that Mata matrices are limited in size by the memory of your computer.

    With that said, a 102400x102400 matrix will have in excess of 10 x 10^9 cells, each requiring 8 bytes of memory, for a total memory requirement in excess of 80GB.
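
    As a quick check of that arithmetic, here is a small sketch that assumes 8-byte doubles and ignores any overhead Stata or Mata may add on top:

    ```stata
    * Bytes needed for a 102400 x 102400 matrix of 8-byte doubles
    display %18.0fc 102400^2*8
    * The same figure in GiB (1 GiB = 1024^3 bytes)
    display %9.3f 102400^2*8/1024^3
    ```

    That works out to 83,886,080,000 bytes, or about 78 GiB, roughly ten times the 8 GB of RAM installed on the machine described above.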


    • #3
      Thx William for the feedback,
      So even if I used
      Code:
      mata: Ed =J(102400,102400,.)
      the problem will remain unresolved, as it is due to my non-supercomputer!
      I think I should split the file into smaller ones.
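
      Splitting may not require separate files. Since only the row sums of the distance matrix are needed, they can be accumulated one row at a time without ever storing all 102400x102400 entries. Below is a minimal Mata sketch of that idea, not a tested solution for the full data. It assumes the values to compare are already in a Mata matrix X with one row per country and one column per model (for example, loaded with st_data()). The function name rowsum_dist and the toy data are made up for illustration.

      ```stata
      mata:
      // Row sums of the pairwise Euclidean distance matrix, one row at a time.
      // X is n x K: one row per country, one column per model (assumed layout).
      real colvector rowsum_dist(real matrix X)
      {
          real scalar    a, K
          real colvector S

          K = cols(X)
          S = J(K, 1, 0)
          for (a = 1; a <= K; a++) {
              // distances from model a to every model; the a-th entry is 0
              S[a] = sum(sqrt(colsum((X :- X[., a]):^2)))
          }
          return(S)
      }

      // Toy example: 2 countries, 3 models (illustrative values only)
      X = (1, 2, 3 \ 3, 2, 1)
      S = rowsum_dist(X)
      S
      // index of the model with the smallest total distance
      minindex(S, 1, i = ., w = .)
      i
      end
      ```

      In the toy example the second model has the smallest total distance. At any moment this holds only X and a few vectors of length K in memory, so for 11 countries and 102,400 models the working set is a few megabytes. The cost is run time: the loop still visits every pair of models.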
