  • Rounding off district centroid gps coordinates to match rainfall data gps coordinates

    Hello,

    I have a question about rounding district centroid coordinates so that I can merge them with the coordinates in a rainfall dataset. The latitude and longitude in my district centroid file have values such as latitude == 19.2845 and longitude == 78.8132. The coordinates in the rainfall dataset, shown in the image below, only take values ending in __.25 or __.75 (never __.00 or __.50). I would like to round the district latitude and longitude to the nearest .25 or .75 so that I can merge with the rainfall dataset on the coordinates. I'm not sure how to go about this. Please help. Thank you!

    [Attached image: s1.jpg]

  • #2
    Any kind of exact merging/matching based on decimal ("floating point") key values is something I'd avoid. There are a lot of ways to go wrong there that are related to the precision with which computers store decimal values, as you may well know. I suppose you could do things as you want by creating integer versions of your lat/long in both data sets (e.g., 6825 rather than 68.25, truncating or rounding as desired) and merging with those as the key, but I wouldn't advise that.
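
    A minimal sketch of that integer-key idea, for reference only. It assumes the rainfall grid points are the centres of 0.5-degree cells (which would explain the __.25 and __.75 values) and that each centroid should go to the nearest grid point; neither assumption has been confirmed in this thread. The key names latkey and lonkey are made up for illustration.

    Code:
    * Sketch only: assumes grid points at the centres of 0.5-degree cells
    * and nearest-grid-point matching. latkey/lonkey are hypothetical names.
    clear
    input double(lat_11 long_11)
    23.48057203 68.82192301
    20.7144     70.9874
    end

    * Integer keys in hundredths of a degree, e.g. 23.48 -> 2325, 68.82 -> 6875
    gen long latkey = 50*floor(2*lat_11) + 25
    gen long lonkey = 50*floor(2*long_11) + 25
    list, sep(0)

    * In the rainfall file the matching keys would be
    *   gen long latkey = round(100*latitude)
    *   gen long lonkey = round(100*longitude)
    * and the two files could then be merged on latkey lonkey.

    A centroid lying exactly on a cell edge (__.00 or __.50) is equally close to two grid points; with floor() it is assigned to the higher one.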

    When dealing with situations like yours, I have instead used the community-contributed program -geonear- (-ssc describe geonear-), which can find, for each observation in your master file, the observation(s) in the rainfall file closest to it.

    Had you supplied a data example using the -dataex- command, as described in the FAQ for new participants on Statalist, I'd probably have been able to supply code to illustrate doing what you want. However, I don't do this kind of thing often enough to remember how to do it without an example to work with. If you want to supply example data for your "master file" and your "rainfall file," I'd give it a look. Without that, perhaps someone else will jump in or have another approach, but having example data always makes people much more likely to help. And, per the FAQ, screenshots are not convenient for this purpose, so you'll want to use -dataex-.


    • #3
      Thanks for the tip, Mike! I checked geonear but I'm not sure how to use it for merging with the rainfall dataset in my case.
      Here's an example dataset. I would like to round columns lat_11 and long_11 to match the format for variables latitude and longitude respectively or somehow merge the file containing coordinate vars lat_11 and long_11 with the file containing coordinate vars latitude and longitude. You'll notice that the variables latitude and longitude contain coordinates as __.25 or __.75 (it doesn't include __.00 or __.50). Any help will be appreciated. Thank you so much!

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input double(lat_11 long_11 longitude latitude)
      23.48057203 68.82192301 68.25 23.75
      22.41060041 69.62062006 68.25 24.25
      21.55092352 69.71896988 68.25 24.75
      20.92952301  70.5473389 68.25 25.25
      22.40603378 70.83679439 68.25 25.75
          20.7144     70.9874 68.25 26.25
      26.86900839 71.18371877 68.25 26.75
      end


      • #4
        Do you have a formula for the rounding? For example, is 21.26 rounded up to 21.75 or rounded down to 21.25? Assuming you always round up, so that 21.26 becomes 21.75 and 21.76 becomes 22.25, create string identifiers in both datasets:



        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input double(lat_11 long_11 longitude latitude)
        23.48057203 68.82192301 68.25 23.75
        22.41060041 69.62062006 68.25 24.25
        21.55092352 69.71896988 68.25 24.75
        20.92952301  70.5473389 68.25 25.25
        22.40603378 70.83679439 68.25 25.75
            20.7144     70.9874 68.25 26.25
        26.86900839 71.18371877 68.25 26.75
        end
        
        preserve
        *RAINFALL DATASET
        keep lati longi
        gen strlat= string(lat)
        gen strlon= string(lon)
        list, sep(0)
        
        restore
        *CURRENT DATASET
        keep *_11
        * Round the fractional part up to the next .25 or .75 (anything above .75 goes to the next degree's .25)
        gen strlat= string(int(lat)+ cond(lat-int(lat)<=.25, .25, cond(lat-int(lat)<=.75, .75, 1.25)))
        gen strlon= string(int(lon)+ cond(lon-int(lon)<=.25, .25, cond(lon-int(lon)<=.75, .75, 1.25)))
        l, sep(0)
        Results:

        Code:
        *RAINFALL DATASET
        
        . list, sep(0)
        
             +---------------------------------------+
             | longit~e   latitude   strlat   strlon |
             |---------------------------------------|
          1. |    68.25      23.75    23.75    68.25 |
          2. |    68.25      24.25    24.25    68.25 |
          3. |    68.25      24.75    24.75    68.25 |
          4. |    68.25      25.25    25.25    68.25 |
          5. |    68.25      25.75    25.75    68.25 |
          6. |    68.25      26.25    26.25    68.25 |
          7. |    68.25      26.75    26.75    68.25 |
             +---------------------------------------+
        
        .
        .
        .
        . restore
        
        .
        . *CURRENT DATASET
        
        . l, sep(0)
        
             +-----------------------------------------+
             |    lat_11     long_11   strlat   strlon |
             |-----------------------------------------|
          1. | 23.480572   68.821923    23.75    69.25 |
          2. |   22.4106    69.62062    22.75    69.75 |
          3. | 21.550924    69.71897    21.75    69.75 |
          4. | 20.929523   70.547339    21.25    70.75 |
          5. | 22.406034   70.836794    22.75    71.25 |
          6. |   20.7144     70.9874    20.75    71.25 |
          7. | 26.869008   71.183719    27.25    71.25 |
             +-----------------------------------------+
        
        .

        Otherwise, if you cannot determine the rounding formula, as Mike suggests, you will need to rely on the closest coordinate match.
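
        For the closest-match route, a rough sketch with -geonear- (from SSC) might look like the following. The file names districts.dta, rainfall.dta and rainfall_grid.dta and the identifiers dist_id and grid_id are hypothetical; only lat_11, long_11, latitude and longitude come from the example above. It also assumes dist_id uniquely identifies districts.

        Code:
        * Sketch only: file names and ID variables are hypothetical.
        * ssc install geonear

        * Reduce the rainfall data to one observation per grid point
        use rainfall, clear
        keep latitude longitude
        duplicates drop
        gen long grid_id = _n
        save rainfall_grid, replace

        * For each district centroid, find the nearest rainfall grid point
        use districts, clear
        geonear dist_id lat_11 long_11 using "rainfall_grid.dta", neighbors(grid_id latitude longitude)

        * See -help geonear- for the names of the variables it adds and for
        * options such as nearcount() and within().

        The grid-point identifier that geonear attaches to each district can then serve as the key for bringing in the rainfall values, which avoids merging on decimal coordinates altogether.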
        Last edited by Andrew Musau; 11 Apr 2022, 19:53.


        • #5
          You do NOT want to round off anything with spatial data. The command you're interested in, if I understand your issue well, is Robert Picard's geoinpoly so you can just match rainfall data directly with the coordinates.


          • #6
            Thank you all for your help! I took your advice and went with geoinpoly to merge the district centroid file with the rainfall data.
            Here's some sample code for anyone who might need help in the future.

            Code:
            * convert shapefile (.shp) to .dta file

            shp2dta using "2022_dist.shp", genid(_ID) data("dist22.dta") coor("dists22_coor.dta") replace
            * 2022_dist.shp: district centroid shapefile; 2022_dist.dbf must be in the same folder as 2022_dist.shp

            use precip_panel
            * _Y: latitude, _X: longitude
            
            geoinpoly _Y _X using "dists22_coor.dta"
            
            merge m:1 _ID using "dist22.dta", keep(master match) nogen
            Last edited by Ann James; 12 Apr 2022, 02:58.
