Hello everyone
I have implemented a kd-tree search algorithm in Mata that finds the k nearest neighbours of a p-dimensional point among a set of points. For large data sets, this can be much faster than a 'brute force' search, and it could be useful for researchers doing spatial analysis.
The code is available from my GitHub repository. Simply download and run the file mata_knn.do; this will initialize all the Mata functions. Example usage:
Code:
version 15.1
mata: mata clear
mata: mata set matastrict on
// define the Mata functions
run mata_knn.do
mata:
N = 10000
k = 5
// one point per row, two coordinates per point
query_coords = runiform(N,2)
data_coords = runiform(N,2)
// results are returned in kni and knd
knn(query_coords, data_coords, k, kni=., knd=.)
end
The matrices kni and knd contain the indices of, and the distances to, the k nearest points for each query point. The query points and the data points can be the same, in which case the first nearest neighbour of each point is always the point itself. Duplicate points in data_coords are not allowed and will throw an error.
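To cross-check the kni and knd results on a small sample, a brute-force search can be sketched as below. This is only a sketch under stated assumptions: knn_brute() is a hypothetical helper that is not part of mata_knn.do, it assumes Euclidean distance with one point per row, and it requires k to be no larger than the number of data points.

```mata
mata:
// Hypothetical helper, NOT part of mata_knn.do: brute-force k nearest neighbours.
// Assumes Euclidean distance, one point per row, and k <= rows(data).
void knn_brute(real matrix query, real matrix data, real scalar k,
               real matrix kni, real matrix knd)
{
    real scalar    i, n
    real colvector d, idx

    n   = rows(query)
    kni = J(n, k, .)
    knd = J(n, k, .)
    for (i = 1; i <= n; i++) {
        // distances from query point i to every data point
        d   = sqrt(rowsum((data :- query[i, .]):^2))
        // permutation that sorts the distances in ascending order
        idx = order(d, 1)
        // keep the k smallest
        idx = idx[|1 \ k|]
        kni[i, .] = idx'
        knd[i, .] = d[idx]'
    }
}

// small random example (sizes chosen arbitrarily)
small_query = runiform(50, 2)
small_data  = runiform(200, 2)
knn_brute(small_query, small_data, 5, bi=., bd=.)
end
```

Running knn() on the same inputs and comparing the distance matrices with mreldif() should give a value near zero, provided both routines use Euclidean distance and order the neighbours from nearest to farthest, which is assumed here.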
So far, I have only tested it thoroughly with 2-dimensional points. If you find this useful, or if you find any bugs, kindly let me know! I am also considering uploading it to the SSC archive, but have not found the time to do so yet.
Best
Robert