Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Counting occurances of dummy's in specific sub-groups to create count variable for a different group - Panel Data

    Stata 14.0 SE on Mac - Data contais 90,000 obervations

    Hi. I am using panel data on US counties (and States) and Medical Marijuana Laws. I am trying to create a count variable which increases by 1 in one state, when a dummy variable is activated in any of the bordering states. This would mean that the count variable would increase by 1 any time a neighbouring state dummy is activated, and decreases by one any time the neighbour dummy is deactivated.

    Below, s_id is the stae indicator - with multiple counties per state (although the example below only shows one county per state); fips is the unique county code; c_id is the county within state code, MML is an indicator = 1 if state i has introduced a medical marijuana law in year t.

    The Neighbour_MML variable is what I am trying to create. If we assume that s_id 1 is neighboured by s_id's 3 and 4; s_id 2 has no nieghbours; s_id 3 neighbours s_id 4; s_id 4 neighbour s_id's 3 and 5; and s_id 5 only neighbour s_id 4:
    Dependent Neighbour
    1 3
    1 4
    3 1
    4 1
    4 5
    5 4
    With this neighbour pairing combination the Neighbour_MML count variable would look as it does below:
    year fips c_id s_id MML Neighbour_MML
    1990 1001 1 1 0 0
    1991 1001 1 1 0 0
    1992 1001 1 1 1 1
    1993 1001 1 1 1 2
    1994 1001 1 1 1 2
    1990 2001 1 2 0 0
    1991 2001 1 2 0 0
    1992 2001 1 2 0 0
    1993 2001 1 2 1 0
    1994 2001 1 2 1 0
    1990 3001 1 3 0 0
    1991 3001 1 3 0 0
    1992 3001 1 3 0 1
    1993 3001 1 3 1 1
    1994 3001 1 3 1 1
    1990 4001 1 4 0 0
    1991 4001 1 4 0 0
    1992 4001 1 4 1 0
    1993 4001 1 4 1 1
    1994 4001 1 4 1 2
    1990 5001 1 5 0 0
    1991 5001 1 5 0 0
    1992 5001 1 5 0 1
    1993 5001 1 5 0 1
    1994 5001 1 5 1 1
    I have all the state pairing but need some assistance in understaing how I can generate the Neighbour_MML count variable for all states based on all pairings?


    Any help greatly appreciated,
    Thanks,
    Mike
    Last edited by Mike McRae; 02 Jun 2021, 09:54.

  • #2
    Your description and table do not match. Here is one way:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(dependent neighbour)
    1 3
    1 4
    3 4
    4 3
    4 5
    5 4
    end
    rename neighbour s_id
    tempfile neighbors
    save `neighbors'
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(year fips) byte(c_id s_id mml neighbour_mml)
    1990 1001 1 1 0 0
    1991 1001 1 1 0 0
    1992 1001 1 1 1 1
    1993 1001 1 1 1 2
    1994 1001 1 1 1 2
    1990 2001 1 2 0 0
    1991 2001 1 2 0 0
    1992 2001 1 2 0 0
    1993 2001 1 2 1 0
    1994 2001 1 2 1 0
    1990 3001 1 3 0 0
    1991 3001 1 3 0 0
    1992 3001 1 3 0 1
    1993 3001 1 3 1 1
    1994 3001 1 3 1 1
    1990 4001 1 4 0 0
    1991 4001 1 4 0 0
    1992 4001 1 4 1 0
    1993 4001 1 4 1 1
    1994 4001 1 4 1 2
    1990 5001 1 5 0 0
    1991 5001 1 5 0 0
    1992 5001 1 5 0 1
    1993 5001 1 5 0 1
    1994 5001 1 5 1 1
    end
    preserve
    keep year s_id mml
    joinby s_id using `neighbors'
    collapse (sum) mml, by(year dependent)
    rename (mml dependent) (wanted s_id)
    save `neighbors', replace
    restore
    merge 1:1 year s_id using `neighbors', nogen
    replace wanted=0 if missing(wanted)
    assert wanted== neighbour_mml
    Res.:

    Code:
    . sort s_id year
    
    . l, sepby(s_id)
    
         +-----------------------------------------------------+
         | year   fips   c_id   s_id   mml   neighb~l   wanted |
         |-----------------------------------------------------|
      1. | 1990   1001      1      1     0          0        0 |
      2. | 1991   1001      1      1     0          0        0 |
      3. | 1992   1001      1      1     1          1        1 |
      4. | 1993   1001      1      1     1          2        2 |
      5. | 1994   1001      1      1     1          2        2 |
         |-----------------------------------------------------|
      6. | 1990   2001      1      2     0          0        0 |
      7. | 1991   2001      1      2     0          0        0 |
      8. | 1992   2001      1      2     0          0        0 |
      9. | 1993   2001      1      2     1          0        0 |
     10. | 1994   2001      1      2     1          0        0 |
         |-----------------------------------------------------|
     11. | 1990   3001      1      3     0          0        0 |
     12. | 1991   3001      1      3     0          0        0 |
     13. | 1992   3001      1      3     0          1        1 |
     14. | 1993   3001      1      3     1          1        1 |
     15. | 1994   3001      1      3     1          1        1 |
         |-----------------------------------------------------|
     16. | 1990   4001      1      4     0          0        0 |
     17. | 1991   4001      1      4     0          0        0 |
     18. | 1992   4001      1      4     1          0        0 |
     19. | 1993   4001      1      4     1          1        1 |
     20. | 1994   4001      1      4     1          2        2 |
         |-----------------------------------------------------|
     21. | 1990   5001      1      5     0          0        0 |
     22. | 1991   5001      1      5     0          0        0 |
     23. | 1992   5001      1      5     0          1        1 |
     24. | 1993   5001      1      5     0          1        1 |
     25. | 1994   5001      1      5     1          1        1 |
         +-----------------------------------------------------+
    
    .

    Comment


    • #3
      Thanks for pointing out the error in the description.

      This worked perfectly, although I should mention that since there were multiple counties per s_id, I collapsed mml by year and s_id first, and then merged many to one once the new variable was created.

      code:
      clear
      input byte(dependent neighbour)
      1 3
      1 4
      3 4
      4 3
      4 5
      5 4
      end
      rename neighbour s_id
      tempfile neighbors
      save `neighbors'
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int(year fips) byte(c_id s_id mml)
      1990 1001 1 1 0
      1991 1001 1 1 0
      1992 1001 1 1 1
      1993 1001 1 1 1
      1994 1001 1 1 1
      1990 1002 1 1 0
      1991 1002 1 1 0
      1992 1002 1 1 1
      1993 1002 1 1 1
      1994 1002 1 1 1
      1990 2001 1 2 0
      1991 2001 1 2 0
      1992 2001 1 2 0
      1993 2001 1 2 1
      1994 2001 1 2 1
      1990 2001 1 2 0
      1991 2001 1 2 0
      1992 2001 1 2 0
      1993 2001 1 2 1
      1994 2001 1 2 1
      1990 3001 1 3 0
      1991 3001 1 3 0
      1992 3001 1 3 0
      1993 3001 1 3 1
      1994 3001 1 3 1
      1990 4001 1 4 0
      1991 4001 1 4 0
      1992 4001 1 4 1
      1993 4001 1 4 1
      1994 4001 1 4 1
      1990 5001 1 5 0
      1991 5001 1 5 0
      1992 5001 1 5 0
      1993 5001 1 5 0
      1994 5001 1 5 1
      end
      preserve
      collapse mml, by(year s_id)
      joinby s_id using `neighbors'
      collapse (sum) mml, by(year dependent)
      rename (mml dependent) (wanted s_id)
      save `neighbors', replace
      restore
      merge m:1 year s_id using `neighbors', nogen
      replace wanted=0 if missing(wanted)

      Just a quick question, what does saving a file using the `filename' code do differently do saving the file to a standard filepath, i.e. save neighbours.dta?

      Comment


      • #4
        Just a quick question, what does saving a file using the `filename' code do differently do saving the file to a standard filepath, i.e. save neighbours.dta?
        It creates a temporary file holding the dataset, which disappears once you end your Stata session. As these are intermediate datasets that you use in the process of creating your final dataset, there is no point of storing them in your computer. If you do a bit of programming, you will realize that you have a lot of unnecessary files littering your computer if you do not get into the habit of using temporary files. Better still, frames (Stata 16+) to a large extent replace temporary files, so you may need to familiarize yourself with them too. For more, see

        Code:
        help tempfile
        help frames

        Comment

        Working...
        X