Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • geodist - calculating distance

    Hi Statalists,

    I have data with id (i.e. addresses), year (i.e. 2006, 2007 and 2008), devision (i.e. from 1 to 9), latitude and longitude. I would like to calculate as follows:

    - For a given year, for each address, I would like to calculate the sum distance between that address to other addresses but conditioning they are in the same devision (for example with id_000361105, calculate distances from id_000361105 to id_001547108, from id_000361105 to id_002896207, from id_000361105 to id_013078100, since these ids are in the same devision 3, then sum the distances up and divided by the number of pairs to obtain the average distance for id_000361105).

    I have read some people that showed the code for geodist command, but it is a bit complicated for my case (i.e. geodist within groups with conditions). Can anyone help me out with this please?

    I really appreciate your help.

    Thank you


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 id double year byte devision float(latitude longitude)
    "000361105" 2006 3   41.9663  -87.98085
    "000361105" 2007 3   41.9663  -87.98085
    "000361105" 2008 3   41.9663  -87.98085
    "000886309" 2006 4  44.86384  -93.43001
    "000886309" 2007 4  44.86384  -93.43001
    "000886309" 2008 4  44.86384  -93.43001
    "000957100" 2006 2  40.70963  -74.01297
    "000957100" 2007 2  40.70963  -74.01297
    "000957100" 2008 2  40.70963  -74.01297
    "001084102" 2006 5   33.9743  -84.14539
    "001084102" 2007 5   33.9743  -84.14539
    "001084102" 2008 5   33.9743  -84.14539
    "00130H105" 2006 5  38.87373  -77.11727
    "00130H105" 2007 5  38.87373  -77.11727
    "00130H105" 2008 5  38.87373  -77.11727
    "001547108" 2008 3  39.34669  -84.41348
    "001744101" 2006 9 32.945763 -117.21492
    "001744101" 2007 9 32.945763 -117.21492
    "001744101" 2008 9 32.945763 -117.21492
    "00206R102" 2006 7  32.77925  -96.80477
    "00206R102" 2007 7  32.77925  -96.80477
    "00206R102" 2008 7  32.77925  -96.80477
    "002567105" 2008 9 37.603153 -122.01864
    "002824100" 2006 3  42.32403   -87.8567
    "002824100" 2007 3  42.32403   -87.8567
    "002824100" 2008 3  42.32403   -87.8567
    "002896207" 2006 3  40.08307   -82.7989
    "002896207" 2007 3  40.08307   -82.7989
    "002896207" 2008 3  40.08307   -82.7989
    "004498101" 2006 5 26.190046  -81.76478
    "004498101" 2007 5 26.190046  -81.76478
    "004498101" 2008 5 26.190046  -81.76478
    "00508Y102" 2006 5 33.799896   -84.3858
    "00508Y102" 2007 5 33.799896   -84.3858
    "00508Y102" 2008 5 33.799896   -84.3858
    "00724F101" 2006 9  37.34652 -121.90983
    "00724F101" 2007 9  37.34652 -121.90983
    "00724F101" 2008 9  37.34652 -121.90983
    "00738A106" 2006 6   34.7608  -86.68575
    "00738A106" 2007 6   34.7608  -86.68575
    "00738A106" 2008 6   34.7608  -86.68575
    "00751Y106" 2006 5 35.822914  -78.56207
    "00751Y106" 2007 5 35.822914  -78.56207
    "00751Y106" 2008 5 35.822914  -78.56207
    "00770F104" 2006 4  38.64421  -90.65141
    "00770F104" 2007 4  38.64421  -90.65141
    "00770F104" 2008 4  38.64421  -90.65141
    "007800105" 2006 9  33.91714 -118.40427
    "007800105" 2007 9  33.91714 -118.40427
    "007800105" 2008 9  33.91714 -118.40427
    "007903107" 2006 9  37.39349 -121.96467
    "007903107" 2007 9  37.39349 -121.96467
    "007903107" 2008 9  37.39349 -121.96467
    "007973100" 2006 8  40.52986 -105.03664
    "007973100" 2007 8  40.52986 -105.03664
    "007973100" 2008 8  40.52986 -105.03664
    "008190100" 2006 2  40.77724 -74.392944
    "008190100" 2007 2  40.77724 -74.392944
    "008190100" 2008 2  40.77724 -74.392944
    "00846U101" 2006 9  37.34864 -121.98438
    "00846U101" 2007 9  37.34864 -121.98438
    "00846U101" 2008 9  37.34864 -121.98438
    "009158106" 2006 2  40.58427  -75.62479
    "009158106" 2007 2  40.58427  -75.62479
    "009158106" 2008 2  40.58427  -75.62479
    "009363102" 2006 2  40.06187  -75.40246
    "009363102" 2007 2  40.06187  -75.40246
    "009363102" 2008 2  40.06187  -75.40246
    "00971T101" 2007 1    42.362  -71.08159
    "00971T101" 2008 1    42.362  -71.08159
    "011659109" 2006 9  47.44657  -122.2721
    "011659109" 2007 9  47.44657  -122.2721
    "011659109" 2008 9  47.44657  -122.2721
    "012348108" 2006 1  43.30168  -70.99177
    "012348108" 2007 1  43.30168  -70.99177
    "012348108" 2008 1  43.30168  -70.99177
    "012653101" 2006 5 35.178715  -80.85399
    "012653101" 2007 5 35.178715  -80.85399
    "012653101" 2008 5 35.178715  -80.85399
    "013078100" 2008 3  41.90414  -87.86066
    end
    Last edited by Ken Yang; 26 Feb 2019, 14:49.

  • #2
    geodist is from SSC, authored by Robert Picard (you are asked to explain). Note the misspelling of division, corrected in the code.

    Code:
    rename devision division
    egen group= group(year division), label
    preserve
    rename (*) (*2)
    rename group2 group
    tempfile file2
    save `file2'
    restore
    joinby group using `file2'
    bys group: egen count= count(id)
    drop if id==id2 & count>1
    *TO INSTALL TYPE ssc install geodist
    geodist latitude longitude latitude2 longitude2, gen(dist)
    bys id year group: egen tot_dist= total(dist)
    bys id year group: egen pairs= count(id)
    bys id year group: gen avg_dist= tot_dist/pairs

    Partial result:

    Code:
    . l id year division group pairs id2 division2 dist tot_dist avg_dist in 1/30, sepby(id year)
    
         +-------------------------------------------------------------------------------------------------------+
         |        id   year   division    group   pairs         id2   divisi~2        dist   tot_dist   avg_dist |
         |-------------------------------------------------------------------------------------------------------|
      1. | 000361105   2006          3   2006 3       2   002896207          3   483.28973    524.329   262.1645 |
      2. | 000361105   2006          3   2006 3       2   002824100          3   41.039266    524.329   262.1645 |
         |-------------------------------------------------------------------------------------------------------|
      3. | 000361105   2007          3   2007 3       2   002824100          3   41.039266    524.329   262.1645 |
      4. | 000361105   2007          3   2007 3       2   002896207          3   483.28973    524.329   262.1645 |
         |-------------------------------------------------------------------------------------------------------|
      5. | 000361105   2008          3   2008 3       4   002896207          3   483.28973   955.4761    238.869 |
      6. | 000361105   2008          3   2008 3       4   013078100          3   12.125778   955.4761    238.869 |
      7. | 000361105   2008          3   2008 3       4   001547108          3   419.02136   955.4761    238.869 |
      8. | 000361105   2008          3   2008 3       4   002824100          3   41.039266   955.4761    238.869 |
         |-------------------------------------------------------------------------------------------------------|
      9. | 000886309   2006          4   2006 4       1   00770F104          4   728.30868   728.3087   728.3087 |
         |-------------------------------------------------------------------------------------------------------|
     10. | 000886309   2007          4   2007 4       1   00770F104          4   728.30868   728.3087   728.3087 |
         |-------------------------------------------------------------------------------------------------------|
     11. | 000886309   2008          4   2008 4       1   00770F104          4   728.30868   728.3087   728.3087 |
         |-------------------------------------------------------------------------------------------------------|
     12. | 000957100   2006          2   2006 2       3   008190100          2    32.95941   308.1753   102.7251 |
     13. | 000957100   2006          2   2006 2       3   009158106          2   137.03838   308.1753   102.7251 |
     14. | 000957100   2006          2   2006 2       3   009363102          2    138.1775   308.1753   102.7251 |
         |-------------------------------------------------------------------------------------------------------|
     15. | 000957100   2007          2   2007 2       3   009363102          2    138.1775   308.1753   102.7251 |
     16. | 000957100   2007          2   2007 2       3   008190100          2    32.95941   308.1753   102.7251 |
     17. | 000957100   2007          2   2007 2       3   009158106          2   137.03838   308.1753   102.7251 |
         |-------------------------------------------------------------------------------------------------------|
     18. | 000957100   2008          2   2008 2       3   009158106          2   137.03838   308.1753   102.7251 |
     19. | 000957100   2008          2   2008 2       3   009363102          2    138.1775   308.1753   102.7251 |
     20. | 000957100   2008          2   2008 2       3   008190100          2    32.95941   308.1753   102.7251 |
         |-------------------------------------------------------------------------------------------------------|
     21. | 001084102   2006          5   2006 5       5   00751Y106          5   549.87399   2634.266   526.8533 |
     22. | 001084102   2006          5   2006 5       5   00508Y102          5   29.475828   2634.266   526.8533 |
     23. | 001084102   2006          5   2006 5       5   004498101          5   892.82794   2634.266   526.8533 |
     24. | 001084102   2006          5   2006 5       5   012653101          5    330.2164   2634.266   526.8533 |
     25. | 001084102   2006          5   2006 5       5   00130H105          5   831.87218   2634.266   526.8533 |
         |-------------------------------------------------------------------------------------------------------|
     26. | 001084102   2007          5   2007 5       5   00508Y102          5   29.475828   2634.266   526.8533 |
     27. | 001084102   2007          5   2007 5       5   012653101          5    330.2164   2634.266   526.8533 |
     28. | 001084102   2007          5   2007 5       5   004498101          5   892.82794   2634.266   526.8533 |
     29. | 001084102   2007          5   2007 5       5   00130H105          5   831.87218   2634.266   526.8533 |
     30. | 001084102   2007          5   2007 5       5   00751Y106          5   549.87399   2634.266   526.8533 |
         +-------------------------------------------------------------------------------------------------------+
    Last edited by Andrew Musau; 27 Feb 2019, 04:57.

    Comment


    • #3
      Hi Andrew,

      Thank you so much for your help with the code. But the code with count function does not work (type mismatch r(109). I use Stata 14.1 for Mac.

      Code:
      bys group: egen count= count(id)
      bys id year group: egen pairs= count(id)
      I fix it by this way:
      Code:
      egen tag=tag(id group)
      bys group: egen count=total(tag)
      
      egen tag1= tag(id year group)
      bys id year group: egen pairs=total(tag1)
      I got "pairs" all equals to 1, which means there is something wrong here.

      Also, I just like each year, one observation for one address, I still find there are many duplicates, so I drop them out:
      Code:
      egen address= concat(id year),p(_)
      duplicates drop address, force
      Is the way I did correct? Can you help me with this please?

      I really appreciate your help again.

      Thank you.
      Last edited by Ken Yang; 27 Feb 2019, 14:41.

      Comment


      • #4
        Thank you so much for your help with the code. But the code with count function does not work (type mismatch r(109). I use Stata 14.1 for Mac.
        It works with the data in #1. For the second, no need to generate a tag1 variable. Just use

        Code:
        bys id year group: egen pairs= total(1)
        Also, I just like each year, one observation for one address, I still find there are many duplicates, so I drop them out:
        I would suggest collapse

        Code:
        collapse division latitude longitude pairs avg_dist, by(id year)
        replace pairs=0 if avg_dist==0

        Comment


        • #5
          Hi Andrew,

          Everything runs perfectly with my real data now.

          Thank you so much again.

          All the best to you!

          Comment

          Working...
          X