Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging Shapefiles

    Goodmorning to everybody,
    I am working with a dataset that contains the paths of all the tornadoes in the USA from 1950 to 2020 ("slat" and "slon" are the starting points and "elat"-"elon" ending points")

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long om int yr byte(mo dy) str10 date str8 time byte tz str2 st byte stf int stn byte mag int(inj fat) double(loss closs slat slon elat elon len) int wid byte fc long _ID
    192 1950 10  1 "1950-10-01" "21:00:00" 3 "OK" 40 23 1  0  0 4 0 36.73 -102.52              36.88              -102.3 15.8   10 0   1
    193 1950 10  9 "1950-10-09" "02:15:00" 3 "NC" 37  9 3  3  0 5 0 34.17   -78.6 34.170100000000005  -78.59989999999999    2  880 0   2
    195 1950 11 20 "1950-11-20" "02:20:00" 3 "KY" 21  1 2  0  0 5 0 37.37   -87.2            37.3701            -87.1999   .1   10 0   3
    196 1950 11 20 "1950-11-20" "04:00:00" 3 "KY" 21  2 1  0  0 5 0  38.2   -84.5 38.200100000000006            -84.4999   .1   10 0   4
    197 1950 11 20 "1950-11-20" "07:30:00" 3 "MS" 28 14 1  3  0 4 0 32.42  -89.13 32.420100000000005  -89.12989999999999    2   37 0   5
    194 1950 11  4 "1950-11-04" "17:00:00" 3 "PA" 42  5 3  1  0 5 0  40.2  -76.12               40.4              -75.93 15.9  100 0   6
    198 1950 12  2 "1950-12-02" "15:00:00" 3 "IL" 17  7 2  3  1 4 0 38.97  -90.05              39.07              -89.72 18.8   50 0   7
    199 1950 12  2 "1950-12-02" "16:00:00" 3 "IL" 17  8 3 25  2 6 0 38.75  -89.67               38.9              -89.38   18  200 0   8
    200 1950 12  2 "1950-12-02" "16:25:00" 3 "AR"  5 12 3  0  0 1 0 36.12  -91.83              36.18              -91.72  7.8   10 0   9
    201 1950 12  2 "1950-12-02" "17:30:00" 3 "IL" 17  9 1  0  0 4 0 38.17  -89.78              38.22              -89.62  9.6   50 0  10
      4 1950  1 13 "1950-01-13" "05:25:00" 3 "AR"  5  1 3  1  1 3 0  34.4  -94.37            34.4001            -94.3699   .6   17 0  11
      5 1950  1 25 "1950-01-25" "19:30:00" 3 "MO" 29  2 2  5  0 5 0  37.6  -90.68              37.63              -90.65  2.3  300 0  12
      6 1950  1 25 "1950-01-25" "21:00:00" 3 "IL" 17  3 2  0  0 5 0 41.17  -87.33 41.170100000000005            -87.3299   .1  100 0  13
      7 1950  1 26 "1950-01-26" "18:00:00" 3 "TX" 48  1 2  2  0 0 0 26.88  -98.12              26.88              -98.05  4.7  133 0  14
      1 1950  1  3 "1950-01-03" "11:00:00" 3 "MO" 29  1 3  3  0 6 0 38.77  -90.22              38.83              -90.03  9.5  150 0  15
      2 1950  1  3 "1950-01-03" "11:55:00" 3 "IL" 17  2 3  3  0 5 0  39.1   -89.3              39.12              -89.23  3.6  130 0  16
      3 1950  1  3 "1950-01-03" "16:00:00" 3 "OH" 39  1 1  1  0 4 0 40.88  -84.58 40.880100000000006            -84.5799   .1   10 0  17
      8 1950  2 11 "1950-02-11" "13:10:00" 3 "TX" 48  2 2  0  0 4 0 29.42  -95.25              29.52              -95.13  9.9  400 0  18
      9 1950  2 11 "1950-02-11" "13:50:00" 3 "TX" 48  3 3 12  1 4 0 29.67  -95.05              29.83                 -95   12 1000 0  19
     10 1950  2 11 "1950-02-11" "21:00:00" 3 "TX" 48  4 2  5  0 5 0 32.35   -95.2              32.42               -95.2  4.6  100 0  20
     11 1950  2 11 "1950-02-11" "23:55:00" 3 "TX" 48  5 2  6  0 5 0 32.98  -94.63                 33               -94.7  4.5   67 0  21
     12 1950  2 12 "1950-02-12" "00:30:00" 3 "TX" 48  6 2  8  1 4 0 33.33  -94.42              33.45              -94.42    8  833 0  22
     13 1950  2 12 "1950-02-12" "01:15:00" 3 "TX" 48  7 1  0  0 4 0 32.08  -98.35               32.1              -98.33  2.3  233 0  23
     14 1950  2 12 "1950-02-12" "06:10:00" 3 "TX" 48  8 2  0  0 4 0 31.52  -96.55              31.57              -96.55  3.4  100 0  24
     15 1950  2 12 "1950-02-12" "11:57:00" 3 "TX" 48  9 1 32  0 5 0  31.8   -94.2              31.88              -94.12  7.7  100 0  25
     16 1950  2 12 "1950-02-12" "12:00:00" 3 "MS" 28  1 2  2  3 4 0  34.6  -89.12 34.600100000000005            -89.1199   .1   10 0  26
     17 1950  2 12 "1950-02-12" "12:00:00" 3 "MS" 28  2 1  0  0 0 0  34.6  -89.12 34.600100000000005            -89.1199    2   10 0  27
     18 1950  2 12 "1950-02-12" "12:00:00" 3 "TX" 48 10 3 15  3 5 0  31.8   -94.2               31.8              -94.18  1.9   50 0  28
     19 1950  2 12 "1950-02-12" "12:30:00" 3 "AR"  5  2 2  0  0 3 0 34.48   -92.4            34.4801            -92.3999   .1  100 0  29
     20 1950  2 12 "1950-02-12" "13:00:00" 3 "LA" 22  1 4 77 18 5 0 31.97     -94                 33               -93.3 82.6  100 0  30
     21 1950  2 12 "1950-02-12" "13:20:00" 3 "LA" 22  2 2 10  5 5 0  32.2  -93.58              32.97              -93.17 58.4  100 0  31
     22 1950  2 12 "1950-02-12" "14:00:00" 3 "LA" 22  4 3 25  5 5 0 31.63  -93.65              32.55              -93.03 74.5  100 0  32
     23 1950  2 12 "1950-02-12" "15:00:00" 3 "AR"  5  3 2  0  0 4 0 33.27  -92.95              33.35              -92.95  5.7  100 0  33
     24 1950  2 12 "1950-02-12" "23:00:00" 3 "LA" 22  5 1  0  0 4 0  32.6  -91.33 32.600100000000005            -91.3299   .5   33 0  34
     25 1950  2 13 "1950-02-13" "01:00:00" 3 "TN" 47  1 1  8  0 3 0 35.35  -89.77 35.350100000000005  -89.76989999999999   .2   10 0  35
     26 1950  2 13 "1950-02-13" "02:00:00" 3 "TN" 47  2 2  1  9 4 0 35.75  -89.48            35.7501            -89.4799   .2   10 0  36
     27 1950  2 27 "1950-02-27" "10:20:00" 3 "OK" 40  1 2  0  0 4 0 35.55   -97.6            35.5501  -97.59989999999999    2   50 0  37
     29 1950  3 16 "1950-03-16" "09:15:00" 3 "FL" 12  1 2  0  0 3 0 29.65  -81.22            29.6501            -81.2199  1.5  150 0  38
     30 1950  3 19 "1950-03-19" "07:30:00" 3 "LA" 22  6 1  2  0 4 0 30.45  -93.45            30.4501            -93.4499    2   33 0  39
     31 1950  3 19 "1950-03-19" "13:15:00" 3 "LA" 22  7 2  0  0 4 0  30.1     -91            30.1001            -90.9999    1   50 0  40
     32 1950  3 19 "1950-03-19" "13:15:00" 3 "LA" 22  8 0  0  0 4 0  29.7   -90.1              29.67               -89.8 18.1   27 0  41
     33 1950  3 26 "1950-03-26" "19:30:00" 3 "AR"  5  4 2  3  0 4 0 34.12  -93.07              34.32              -92.88 17.4  150 0  42
     34 1950  3 26 "1950-03-26" "19:31:00" 3 "AR"  5  5 3  1  0 5 0 36.15  -91.83               36.2              -91.75  5.7  200 0  43
     35 1950  3 26 "1950-03-26" "20:30:00" 3 "AR"  5  6 2  7  0 5 0  34.7  -92.35               34.8              -92.22 10.4  600 0  44
     36 1950  3 26 "1950-03-26" "21:15:00" 3 "AR"  5  7 3 20  0 5 0 34.98  -91.73              35.08               -91.5 14.9 1760 0  45
     37 1950  3 26 "1950-03-26" "23:30:00" 3 "AR"  5  8 2  2  0 0 0  35.1   -91.4              35.15              -91.33  5.4  833 0  46
     38 1950  3 27 "1950-03-27" "03:00:00" 3 "OK" 40  2 2  0  0 3 0 34.85  -95.75 34.850100000000005            -95.7499   .1   77 0  47
     39 1950  3 27 "1950-03-27" "05:00:00" 3 "MS" 28  6 2  2  1 0 0 33.17  -90.55 33.170100000000005            -90.5499   .1   10 0  48
     40 1950  3 27 "1950-03-27" "06:00:00" 3 "LA" 22  9 1  2  0 4 0 32.63  -91.28              32.73              -91.05 15.3   17 0  49
     41 1950  3 27 "1950-03-27" "07:30:00" 3 "MS" 28  7 2  0  0 0 0 33.13  -89.57              33.25              -89.42 11.9   10 0  50
     42 1950  3 27 "1950-03-27" "07:45:00" 3 "MS" 28  4 2  7  0 5 0 32.28  -90.27              32.98              -89.82 59.3   50 0  51
     43 1950  3 27 "1950-03-27" "07:45:00" 3 "MS" 28  5 2  6  0 5 0 32.28  -90.27              32.48              -89.78 31.9   50 0  52
     44 1950  3 27 "1950-03-27" "08:30:00" 3 "LA" 22 10 1  0  0 4 0 30.92  -91.13            30.9201  -91.12989999999999   .5   27 0  53
     45 1950  3 27 "1950-03-27" "08:45:00" 3 "MS" 28  9 2  0  0 0 0 33.67  -88.78              33.72              -88.73  4.7   10 0  54
     46 1950  3 27 "1950-03-27" "11:00:00" 3 "MS" 28  8 1  2  0 4 0 34.73  -88.27            34.7301  -88.26989999999999   .1   10 0  55
     47 1950  3 27 "1950-03-27" "15:00:00" 3 "TN" 47  3 1  0  0 3 0 35.68  -85.77            35.6801  -85.76989999999999   .2   10 0  56
     48 1950  3 27 "1950-03-27" "16:30:00" 3 "IL" 17  4 2  0  0 3 0 40.57   -88.6               40.6              -88.55    3   50 0  57
     28 1950  3  1 "1950-03-01" "02:30:00" 3 "MS" 28  3 1  0  0 0 0  32.5  -88.85            32.5001  -88.84989999999999   .1   10 0  58
     53 1950  4 18 "1950-04-18" "01:30:00" 3 "AL"  1  1 3 15  0 4 0 30.67   -88.2              30.85               -88.1   14  100 0  59
     54 1950  4 18 "1950-04-18" "01:45:00" 3 "AL"  1  2 2  0  0 3 0  30.7  -87.92            30.7001            -87.9199    2  150 0  60
     55 1950  4 24 "1950-04-24" "07:45:00" 3 "LA" 22 11 1  0  0 4 0  30.1  -90.03            30.1001            -90.0299    2  100 0  61
     56 1950  4 27 "1950-04-27" "16:00:00" 3 "SC" 45  1 1  0  0 4 0 33.07     -80 33.070100000000004            -79.9999   .7   37 0  62
     57 1950  4 28 "1950-04-28" "14:17:00" 3 "OK" 40  5 3  1  1 5 0 34.88  -99.28              35.17               -99.2 20.8  400 0  63
     58 1950  4 28 "1950-04-28" "17:00:00" 3 "KS" 20  1 1  0  0 4 0 38.12  -95.85              38.35              -95.63 19.8   10 0  64
     59 1950  4 28 "1950-04-28" "18:00:00" 3 "TX" 48 11 4  5  5 5 0 32.42   -99.5              32.42              -99.48  1.3  233 0  65
     60 1950  4 28 "1950-04-28" "19:05:00" 3 "OK" 40  6 4 32  5 5 0 35.08   -96.4              35.13              -96.35  4.5  200 0  66
     61 1950  4 28 "1950-04-28" "19:30:00" 3 "OK" 40  7 2  0  0 4 0 34.55   -96.2            34.5501            -96.1999   .8  100 0  67
     62 1950  4 29 "1950-04-29" "12:00:00" 3 "MS" 28 10 2  0  0 5 0 31.27  -89.83              31.38               -89.7 11.3  100 0  68
     63 1950  4 29 "1950-04-29" "15:30:00" 3 "TX" 48 12 1  0  0 4 0  31.9   -98.6              31.73               -98.6 11.5  200 0  69
     49 1950  4  2 "1950-04-02" "15:30:00" 3 "AR"  5  9 3  8  1 3 0 35.75  -91.22            35.7501            -91.2199  2.5   10 0  70
     50 1950  4  2 "1950-04-02" "18:00:00" 3 "OK" 40  3 2  0  0 4 0 35.82  -97.02 35.820100000000004  -97.01989999999999    1  100 0  71
     51 1950  4  2 "1950-04-02" "18:00:00" 3 "AR"  5 10 1  0  0 3 0 35.98  -89.92            35.9801            -89.9199   .8  100 0  72
     52 1950  4  3 "1950-04-03" "16:15:00" 3 "OK" 40  4 1  0  0 3 0 36.13  -95.83 36.130100000000006            -95.8299    1   33 0  73
     95 1950  5 10 "1950-05-10" "13:00:00" 3 "WY" 56  1 2  0  0 1 0 41.18 -110.42            41.1801           -110.4199    2   33 0  74
     96 1950  5 11 "1950-05-11" "18:30:00" 3 "OK" 40 15 1  0  0 2 0 36.82 -100.52 36.820100000000004 -100.51989999999999   .5   77 0  75
     97 1950  5 12 "1950-05-12" "14:00:00" 3 "NC" 37  1 1  0  0 4 0 35.75  -77.42            35.7501            -77.4199    2   10 0  76
     98 1950  5 12 "1950-05-12" "17:00:00" 3 "NC" 37  2 1  0  0 0 0 35.73   -78.4            35.7301            -78.3999    1  200 0  77
     99 1950  5 12 "1950-05-12" "17:00:00" 3 "NC" 37  3 0  0  0 0 0 35.73   -78.4            35.7301            -78.3999    1  200 0  78
    100 1950  5 13 "1950-05-13" "11:30:00" 3 "LA" 22 19 1  0  0 4 0 31.15  -91.95            31.1501            -91.9499    2   27 0  79
    101 1950  5 14 "1950-05-14" "16:00:00" 3 "TX" 48 14 1  0  0 3 0 29.53  -95.45            29.5301            -95.4499    1  100 0  80
    102 1950  5 14 "1950-05-14" "17:00:00" 3 "NC" 37  4 1  0  0 3 0  35.3   -79.8              35.33              -79.75  3.8   50 0  81
    103 1950  5 14 "1950-05-14" "17:30:00" 3 "NC" 37  5 2  5  0 0 0    35  -80.68            35.0001            -80.6799    2   10 0  82
    104 1950  5 15 "1950-05-15" "00:01:00" 3 "TX" 48 15 2  0  0 4 0 29.78  -98.83              29.67              -98.57   18   10 0  83
    105 1950  5 15 "1950-05-15" "11:00:00" 3 "FL" 12  3 1  0  0 4 0 28.58  -81.37 28.580099999999998            -81.3699   .1   10 0  84
    106 1950  5 15 "1950-05-15" "11:00:00" 3 "FL" 12  4 2  0  0 4 0  28.5  -81.37            28.5001            -81.3699   .1   10 0  85
    107 1950  5 16 "1950-05-16" "18:00:00" 3 "KS" 20 11 2  0  0 3 0 37.55  -98.42            37.5501            -98.4199   .2  200 0  86
    108 1950  5 16 "1950-05-16" "19:30:00" 3 "OK" 40 16 1  1  0 3 0 35.82  -99.12              35.87                 -99  7.3  100 0  87
    109 1950  5 18 "1950-05-18" "15:15:00" 3 "NE" 31  2 1  0  0 4 0 40.22 -100.63            40.2201 -100.62989999999999   .7  333 0  88
    110 1950  5 18 "1950-05-18" "19:00:00" 3 "KS" 20 12 3  2  0 5 0 38.12  -96.82              38.33              -96.82 14.9  100 0  89
    111 1950  5 18 "1950-05-18" "19:00:00" 3 "KS" 20 13 0  0  0 3 0 39.85   -98.3 39.850100000000005            -98.2999   .1   10 0  90
    112 1950  5 19 "1950-05-19" "01:30:00" 3 "KS" 20 14 0  0  0 3 0 37.82  -96.85 37.820100000000004  -96.84989999999999   .1   10 0  91
    113 1950  5 22 "1950-05-22" "12:39:00" 3 "OK" 40 17 1  0  0 0 0 35.67  -99.23 35.670100000000005            -99.2299  1.5  100 0  92
    114 1950  5 23 "1950-05-23" "18:00:00" 3 "KS" 20 15 0  0  0 4 0 39.75  -99.32            39.7501  -99.31989999999999   .2  250 0  93
    115 1950  5 24 "1950-05-24" "13:30:00" 3 "KS" 20 16 2  1  0 3 0 37.27  -99.48              37.28              -99.42  3.6   77 0  94
    116 1950  5 24 "1950-05-24" "17:30:00" 3 "KS" 20 17 1  0  0 4 0 39.38  -96.97              39.45              -96.78 10.7   40 0  95
    117 1950  5 24 "1950-05-24" "21:30:00" 3 "OK" 40 18 2  2  0 4 0 36.45  -99.03 36.450100000000006            -99.0299    1   33 0  96
    118 1950  5 25 "1950-05-25" "16:30:00" 3 "TX" 48 16 1  0  0 3 0 32.67 -101.88              32.65             -101.85    3  880 0  97
    119 1950  5 29 "1950-05-29" "14:48:00" 3 "OK" 40 19 1  0  0 2 0 36.38  -95.27 36.380100000000006  -95.26989999999999    1   33 0  98
    120 1950  5 29 "1950-05-29" "17:00:00" 3 "TX" 48 17 1  0  0 0 0 33.35  -96.92               33.3              -96.92  3.4  267 0  99
    121 1950  5 29 "1950-05-29" "23:00:00" 3 "LA" 22 20 1 13  0 4 0  32.1   -93.9 32.100100000000005            -93.8999    1   27 0 100
    end
    It sould allow to obtain a map like that:


    Click image for larger version

Name:	tornado.PNG
Views:	1
Size:	525.4 KB
ID:	1655827


    I would like to overlap this to a CENSUS Shapefile (https://www.census.gov/cgi-bin/geo/shapefiles/index.php), for example the ZCTA shapefile, in order to build a variable that count the number of tornadoes that have hitten a ZCTA. I would like to have a measure of exposure to tornadoes over the 1950-2020 period in the end.
    Could you suggest me how to process using STATA 17\MP?
    Thank you!

  • #2
    Robert Picard's geoinpoly command is your friend here. You can just spatially merge the location of the tornado to your master shapefile.


    If you've issues beyond this, there may be another command that does what you seek, and if not, buckle up, because then you'll need to take a walk in Python-ville.

    Comment


    • #3
      Hello Jared,
      I have use geoinpoly with other datasets, but I fear in this case it will not work. Geoinpoly does fit points into polygons, hence it should depict just the starrting and ending point of each tornado, while I am interested in the path between the two points (and theorically also the width of the tornado). Am I wrong?

      Comment


      • #4
        So let me make sure I understand you, you don't just want where the tornado happened, you want to know... all the places that the tornado hit? Is that about right?


        If so, assuming each tornado has a unique ID that tracks its path in the shapefile........... I imagine that geoinpoly will match the tornado to everywhere it followed too, right? Quite an interesting question, I must say.

        Comment


        • #5
          exactly, I want to have a variable, say "tornado_hit==1" if a tornado has passed by that ZIP, not just if a tornado started\ended in that ZIP.
          Each tornado has a unique ID in the shapefile, but there is no a series of point through which its paths goes, just the starting and ending point, so STATA should recreate the paths itself and than match it with the hitten ZIP. Hence I am not sure whether geoinpoly can help me

          Comment


          • #6
            Here is oneway. Ipolating 18 points between the start and end of each tornado path and then use Robert Picard's -geoinpoly- :

            Code:
             
            copy https://www2.census.gov/geo/tiger/GENZ2020/shp/cb_2020_us_zcta520_500k.zip  zcta.zip, replace
            unzipfile  zcta.zip, replace
            spshape2dta    cb_2020_us_zcta520_500k, saving(zcta) replace
            use zcta, clear
            keep _ID -ZCTA
            rename ZCTA zipcode
            save,replace
            
            //Input tornado data
            // clear
            // input long om int yr ...
            //
            // end
            
            keep om- time st stf slat- elon _ID
            rename (_ID slat elat slon elon) (id lat1 lat2 lon1 lon2)
            reshape long lat lon , i(id)
            //ipolate 19 points between start and end of each tornado path
            expand 18 if _j == 1, gen(expand_data)
            sort id _j exp
            replace lat = . if expand_data == 1
            replace lon = . if expand_data == 1
            bys id: gen order = _n
            l in 1/10
            by id : ipolate lat order, gen(lat_ip)
            by id : ipolate lon order, gen(lon_ip)
            
            geoinpoly lat_ip lon_ip using zcta_shp, noprojection
            merge m:1 _ID using zcta
            keep if _m ==3
            sort id order
            l in 1/10
            
            save tornado_zip, replace
            //Example where tornado cross multiple zip codes
            keep if id == 6
            keep _ID zipcode
            duplicates drop
            spmap using zcta_shp, id(_ID) osize(0.05 ..)   /// 
              point(data(tornado_zip.dta) x(lon_ip) y(lat_ip) select(keep if id == 6)) ///
              label(data(tornado_zip.dta) x(_CX) y(_CY) label(zipcode) select(keep if id == 6) size(*.9))
            Click image for larger version

Name:	Graph.png
Views:	1
Size:	72.7 KB
ID:	1655964

            Comment


            • #7
              Thank you very much Scott you are really helpful! I am going to work with your code for sure!
              Theorically, if I increase the number of points I can approximate better a line, am I right?

              Comment


              • #8
                Yes, given how oddly shaped and small zip code areas can be, a larger number of points may be a good idea.

                Comment


                • #9
                  As a rule, the less missing data tu have, the better the interpolation

                  Comment

                  Working...
                  X