
  • Efficiency of -spmatrix create- in Stata 15

    Hello,

    I have a spatial dataset with about 23,000 observations and I am trying to use the new spmatrix command in Stata 15 MP to create an inverse distance weighting matrix for use in spatial regressions. The following command has been running for over a day:
    Code:
    spset id, coord(cent_long cent_lat) coordsys(latlong, kilometers)
    spmatrix create idistance pirate_mat, vtruncate(1/1000)
    I was very surprised by this because, on this exact same dataset, the user-written command spmat idistance took just 204 seconds in Stata 14 MP to build the matrix. The code is as follows.
    Code:
    spmat idistance pirate_mat cent_long cent_lat, id(id) dfunction(dhaversine) vtruncate(1/1000) replace
    My questions are:

    1. Is there anything I can do to improve the speed of the spmatrix create command? Furthermore, can anyone clarify why spmatrix create takes so much longer than spmat? Is it something about how the matrix is being saved? Or is it the distance formula that is being used? If something very computationally intensive is going on (Vincenty ellipsoid distances, etc.), I would love to have the option to use simple haversine distances instead so that things run faster.

    2. If there is no way to make spmatrix create faster, I was hoping to be able to save the matrix that I produced with spmat and import it using the spmatrix command. After running the spmat command above, I saved the matrix, then tried to import it using spmatrix.
    Code:
    spmat export pirate_mat using "Output Data/pirate_mat.txt", replace
    spmatrix import pirate_mat using "Output Data/pirate_mat.txt", replace
    I received the following error:
    incorrect number of columns in row 1 of weighting matrix
    error reading file Output Data/pirate_mat.txt

    I also tried the same thing again, saving the spmat matrix with no ids in case the id column was screwing things up.
    Code:
    spmat export pirate_mat using "Output Data/pirate_mat_noid.txt", replace noid
    spmatrix import pirate_mat using "Output Data/pirate_mat_noid.txt", replace
    I got the same error using this method.

    I was surprised to see the "incorrect number of columns" error because this matrix by definition has exactly as many columns as there are observations in the dataset (which I confirmed using spmat summarize). Can anyone explain how to resolve this error? I was thinking it might have something to do with the fact that spmat export saves a space-delimited file, but it appears that spmatrix export also creates a space-delimited file, which spmatrix import can read. Is there a different way to export the spmat matrix so that it can be read by spmatrix import?

    I was really hoping to use the new spxtregress and (please correct me if I'm wrong) I don't think I can do that unless I use spmatrix to declare the weights, so any advice would be greatly appreciated. I am using Stata 15 MP. Thanks so much for your help.

    Best,
    Brina
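
    A possible route for question 2, not confirmed anywhere in this thread, is to skip the text file and pass the matrix through Mata. The sketch below assumes that spmat getmatrix (from the user-written sppack) and the official spmatrix spfrommata behave as described in their help files, that both are available in the same Stata 15 session, and that the data are already spset with _ID equal to id. The names W, v, and pirate_sp are illustrative. It is untested on the 23,000-observation dataset, so treat it as a starting point rather than a known fix.

    ```stata
    * Untested sketch: move a -spmat- matrix into -spmatrix- through Mata.
    * Assumes sppack (spmat) is installed and the data are -spset- on id.
    * W, v, and pirate_sp are illustrative names, not from the original post.

    * Copy the weights and the ID vector out of the spmat object into Mata
    spmat getmatrix pirate_mat W, id(v)

    * Build an spmatrix object from the Mata matrix W and ID vector v.
    * normalize(none) keeps the weights as they are; the default is spectral.
    spmatrix spfrommata pirate_sp = W v, normalize(none) replace

    * List the spmatrix objects now in memory to confirm it was created
    spmatrix dir
    ```

    The values in v have to match the _ID variable created by spset, so it is worth checking both before relying on the result.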

  • #2
    Hi Brina,

    Can you attach your dataset so I can test it? Thank you.

    Di


    • #3
      Hi Di,

      I don't seem to be able to attach a .dta file to this post, so I'm using the dataex command recommended in the FAQ to provide 100 observations. However, I believe it might be important to share my whole dataset, given that I only run into this issue with a very large number of observations (everything works quickly on the small subsample). I would be happy to send over my data via email, Dropbox, etc. if that's easier; just let me know.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int id float(cent_lat cent_long)
      20791  22.5  91.5
      14943   6.5   3.5
      14227   4.5   7.5
      10299  -6.5  39.5
      17509  13.5  49.5
       8022 -12.5 -77.5
      10366  -6.5 106.5
      17508  13.5  48.5
      13958   3.5  98.5
      13244   1.5 104.5
      13243   1.5 103.5
      17510  13.5  50.5
      12176  -1.5 116.5
      16487  10.5 107.5
      14226   4.5   6.5
      17507  13.5  47.5
      13545   2.5  45.5
      23268  29.5  48.5
      17145  12.5  45.5
      13965   3.5 105.5
      17148  12.5  48.5
      17940  14.5 120.5
      20431  21.5  91.5
      17147  12.5  47.5
      14585   5.5   5.5
      17146  12.5  46.5
      16315  10.5 -64.5
      14579   5.5   -.5
      19187  18.5 -72.5
      17871  14.5  51.5
      10726  -5.5 106.5
      11456  -3.5 116.5
      10368  -6.5 108.5
      17870  14.5  50.5
      14584   5.5   4.5
      17869  14.5  49.5
      18822  17.5 -77.5
      17143  12.5  43.5
      12897    .5 117.5
      13604   2.5 104.5
       3733 -24.5 -46.5
      10725  -5.5 105.5
      14698   5.5 118.5
      13866   3.5   6.5
      13241   1.5 101.5
      17540  13.5  80.5
      14268   4.5  48.5
      21130  23.5  70.5
      16096   9.5  76.5
       9553  -8.5  13.5
      10633  -5.5  13.5
      17144  12.5  44.5
      16005   9.5 -14.5
      16006   9.5 -13.5
      18823  17.5 -76.5
      13605   2.5 105.5
      14228   4.5   8.5
      17149  12.5  49.5
      14941   6.5   1.5
      13184   1.5  44.5
      14575   5.5  -4.5
      16100   9.5  80.5
      16316  10.5 -63.5
      20428  21.5  88.5
      15646   8.5 -13.5
      16318  10.5 -61.5
      12177  -1.5 117.5
      20770  22.5  70.5
      14319   4.5  99.5
      11807  -2.5 107.5
      13869   3.5   9.5
      15039   6.5  99.5
      13546   2.5  46.5
      14229   4.5   9.5
      17398  13.5 -61.5
      12462   -.5  42.5
      14267   4.5  47.5
      17516  13.5  56.5
      11454  -3.5 114.5
      17511  13.5  51.5
      14225   4.5   5.5
      17151  12.5  51.5
      17150  12.5  50.5
      13908   3.5  48.5
      13190   1.5  50.5
      14318   4.5  98.5
      17506  13.5  46.5
      14580   5.5    .5
      12831    .5  51.5
      17874  14.5  54.5
      13972   3.5 112.5
      14339   4.5 119.5
      16789  11.5  49.5
      18622  16.5  82.5
      13551   2.5  51.5
      12824    .5  44.5
      13960   3.5 100.5
      15241   7.5 -58.5
       4093 -23.5 -46.5
      20086  20.5 106.5
      end
      Thanks,
      Brina


      • #4
        Hi Brina,

        Could you send the dataset to Dropbox, and email the Dropbox location to [email protected]?
        Thanks,

        Di


        • #5
          Hi Brina, I am going through a similar workflow and encountering the same issue when trying to import a large spmat .txt file into spmatrix. Were you ever able to find a work-around?


          • #6
            Hi Chris,

            Unfortunately, I was not, though the spmatrix command did finally run on my machine after four days (!!!) of waiting. My colleague, who has more computing power, ended up running the code for this project. We ultimately re-created all our weighting matrices using spmatrix on his faster computer instead of finding a way to read in the saved spmat matrices.

            For anyone else encountering this post, Di suggested off-line that the reason spmatrix is less efficient than spmat is that spmatrix automatically computes eigenvalues while spmat does not. However, I'm a bit unsatisfied with this answer, because even using the eigenvalues option for spmat, spmat is much much much faster than spmatrix on my computer. I would love to hear any further thoughts on this issue.

            Best,
            Brina
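
            One way to probe the eigenvalue explanation is to time spmatrix create on a subsample with and without the default spectral normalization. The sketch below assumes the data are already spset as in post #1. The subsample size of 2000 and the names W_spec and W_none are arbitrary choices for illustration. Whether normalize(none) actually avoids the expensive step is an assumption to check, not something established in this thread.

            ```stata
            * Sketch: compare run times with the default normalize(spectral)
            * and with normalize(none) on the first 2000 observations.
            * W_spec and W_none are illustrative names.
            preserve
            keep in 1/2000

            timer clear
            timer on 1
            spmatrix create idistance W_spec, vtruncate(1/1000) replace
            timer off 1

            timer on 2
            spmatrix create idistance W_none, vtruncate(1/1000) normalize(none) replace
            timer off 2

            * Timer 1 is the default run, timer 2 the unnormalized run
            timer list
            restore
            ```

            Repeating this at a few subsample sizes would also show how the run time grows with the number of observations before committing to the full dataset.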


            • #7
              Is there any news on this? I have run into the same issue.


              • #8
                I have run into the same issue too. I cannot import the matrix file made by GeoDa (.gal) into Stata. Were you able to resolve this?
