  • Moran's I on more than 800 observations: Mata

    Hi all,

    I was advised to perform the following tests on a spatial regression:

    spwmatrix gecon latitude longitude, wname(weightsnames) wtype(inv) alpha(2) dband(0 50) eignvar(eignvar) row replace
    spatgsa var, w(weightsnames) moran
    reg var vars
    spatdiag, weights(weightsnames)


    Yet my dataset has over 250,000 observations, and the commands run into Stata's limit of 800 rows or columns for matrices.


    Specifically, when I try to create the matrix with all the observations, it says:
    J(): 3900 unable to allocate real <tmp>[257224,257224]
    spwmatrix_CalcSPweightM(): - function returned error
    <istmt>: - function returned error

    When I do it with fewer observations, for example 10,000, it does create the matrix, but then spatgsa returns another error:

    unable to allocate matrix;
    You have attempted to create a matrix with too many rows or columns or attempted to fit a model with too many variables.

    You are using Stata/BE which supports matrices with up to 800 rows or columns. See limits for how many more rows and columns Stata/SE and
    Stata/MP can support.

    If you are using factor variables and included an interaction that has lots of missing cells, try set emptycells drop to reduce the
    required matrix size; see help set emptycells.

    If you are using factor variables, you might have accidentally treated a continuous variable as a categorical, resulting in lots of
    categories. Use the c. operator on such variables.



    I thought I could try to create it manually in Mata and tried to set up the matrix:

    mata:
    X = st_data(., "X")
    Y = st_data(., "Y")
    n = rows(X)
    W = J(n, n, 0)
    end

    But it gave the error:
    J(): 3900 unable to allocate real <tmp>[257224,257224]
    <istmt>: - function returned error

    It doesn't give the same error when I reduce the number of observations, for example to 50,000.



    Does anyone have ideas on how to perform these tests on my dataset?

    Any help will be deeply appreciated!

  • #2
    You need more memory. See

    Code:
    help [M-1] limits
    Size approximations:
    Memory requirements
    ------------------------------------------------------------------
    real matrices oh + r*c*8
    complex matrices oh + r*c*16
    pointer matrices oh + r*c*8
    string matrices oh + r*c*8 + total_length_of_strings
    ------------------------------------------------------------------
    where r and c represent the number of rows and columns and where
    oh is overhead and is approximately 64 bytes


    Description

    Mata imposes limits, but those limits are of little importance compared with the memory requirements. Mata stores matrices in memory and requests the memory for them from the operating system.
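As a rough illustration of the size formula quoted above, the sketch below plugs in the three sample sizes mentioned in this thread (10,000, 50,000, and 257,224 rows). It assumes an overhead of exactly 64 bytes and reports decimal gigabytes; the names n and gb are only for illustration.

```stata
mata:
// bytes for a real n x n matrix: oh + r*c*8, taking oh as 64 bytes
n  = (10000 \ 50000 \ 257224)
// convert to decimal gigabytes (1 GB = 1e9 bytes)
gb = (64 :+ 8 :* n:^2) :/ 1e9
// one row per sample size: n, then approximate GB required
(n, gb)
end
```

That works out to roughly 0.8 GB, 20 GB, and 529 GB respectively, so the full 257,224 x 257,224 matrix is far beyond what a typical machine can allocate.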

    Also, as far as estimation is concerned, you'd probably struggle with edition BE.


    • #3
      If your sample size is a quarter of a million, significance tests are fairly pointless even if you have random samples, or whatever else is being assumed.

      The problem is that you're asking for a matrix with about 66 billion cells (257,224 x 257,224).
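One possible way around holding all n x n cells at once is to build the weights a block of rows at a time and accumulate only the sums that Moran's I needs. The following is a minimal sketch, not a reimplementation of spwmatrix or spatgsa: the function name moran_blockwise and its arguments are hypothetical, it assumes planar (Euclidean) distances rather than the great-circle distances spwmatrix computes with gecon, it assumes no missing values, and it returns only the statistic, with no significance test.

```stata
mata:
// Moran's I = (n / S0) * (z'Wz) / (z'z), with W built blockwise.
// Weights: 1/d^alpha for 0 < d <= band, 0 otherwise, then row-standardized.
// Points at distance 0 (including a point and itself) get weight 0.
real scalar moran_blockwise(real colvector x, real colvector y,
                            real colvector v, real scalar band,
                            real scalar alpha, real scalar bsize)
{
    real scalar    n, i1, i2, b, num, S0
    real colvector z, rs
    real matrix    D, Wb

    n   = rows(v)
    z   = v :- mean(v)            // deviations from the mean
    num = 0                       // running z'Wz
    S0  = 0                       // running sum of all weights
    for (i1 = 1; i1 <= n; i1 = i1 + bsize) {
        i2 = min((i1 + bsize - 1, n))
        b  = i2 - i1 + 1
        // b x n distances from points i1..i2 to every point
        D  = sqrt(((J(b, 1, 1) * x') :- x[|i1 \ i2|]):^2 +
                  ((J(b, 1, 1) * y') :- y[|i1 \ i2|]):^2)
        // inverse-distance weights inside the band; zeros elsewhere
        Wb = ((D :> 0) :& (D :<= band)) :/ ((D + (D :== 0)):^alpha)
        // row-standardize; rows with no neighbours stay all zero
        rs = rowsum(Wb)
        Wb = Wb :/ (rs + (rs :== 0))
        num = num + quadcross(z[|i1 \ i2|], Wb * z)
        S0  = S0 + sum(Wb)
    }
    return((n / S0) * num / quadcross(z, z))
}
end
```

A call might look like the following, assuming coordinate variables named X and Y and an outcome named var as in the first post, with the band and exponent taken from the dband(0 50) and alpha(2) options:

```stata
mata:
x = st_data(., "X")
y = st_data(., "Y")
v = st_data(., "var")
moran_blockwise(x, y, v, 50, 2, 500)
end
```

With bsize = 500 and n = 257,224, each block matrix takes about 1 GB and several exist at the same time, so bsize should be chosen to fit the available memory. The work is still proportional to n squared, so a full run may take a long time.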
