
  • #16
    Thank you very much for answering.
    I also tried using -rangejoin-, adapting the version above, but I still did not get what I want.
    I ran:
    Code:
    clear all
    
    use ill2, clear
    gen `c(obs_t)' obs_no = _n
    frame put _all if !missing(illdate), into(working)
    frame working {
        gen lower = illdate + 1
        gen upper = illdate + 2900
        replace lower = 1 if missing(illdate)
        replace upper = 0 if missing(illdate)
        format lower upper %td
        rangejoin crimedate lower upper using crimes2, by(id)
        by obs_no, sort: egen crimes_365 = total(crimedate > illdate + 1)
        by obs_no: egen crimes_2900 = total(crimedate > illdate + 2900)
        gen byte any_crime_365 = crimes_365 > 0 & !missing(crimes_365)
        gen byte any_crime_2900 = crimes_2900 > 0 & !missing(crimes_2900)
        by obs_no, sort: keep if _n == 1
        
    }
    frlink 1:1 obs_no, frame(working)
    frget crimes_* any_crime_*, from(working)
    and I got:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id yearexam yearill monthill dayill score illdate) byte(obs_no working) float(crimes_365 crimes_2190) byte(any_crime_365 any_crime_2190)
    1 2000 2000 11 3 100 14917 1 1 3 0 1 0
    1 2001    .  . . 150     . 2 . . . . .
    1 2002    .  . . 111     . 3 . . . . .
    2 2000    .  . . 123     . 4 . . . . .
    2 2001 2001  5 2 200 15097 5 2 2 0 1 0
    2 2002    .  . . 214     . 6 . . . . .
    2 2003    .  . . 203     . 7 . . . . .
    2 2004    .  . . 302     . 8 . . . . .
    2 2005    .  . . 136     . 9 . . . . .
    end
    format %td illdate
    But that is not what I really want. For example, person ID 1 should receive a count of crimes committed after the illness in the year 2000 = 0. ID 1 should receive:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id yearexam yearill monthill dayill score illdate ncrimesinyear criminal) byte totalcrimes
    1 2000 2000 11 3 100 14917 1 1 3
    1 2001    .  . . 150     . 0 0 3
    1 2002    .  . . 111     . 1 1 3
    2 2000    .  . . 123     . 0 0 2
    2 2001 2001  5 2 200 15097 1 1 2
    2 2002    .  . . 214     . 1 1 2
    2 2003    .  . . 203     . 0 0 2
    2 2004    .  . . 302     . 0 0 2
    2 2005    .  . . 136     . 0 0 2
    end
    format %td illdate
    So, I want to create a variable that counts the number of crimes committed after illdate, per person per year. If the person never got ill but appears in ill2.dta, then I count the number of crimes in that yearexam anyway. That is, my focus is to exclude crimes that happened before the illdate (per person); I want to look only at crimes after the illdate. Another variable, criminal, takes the value 1 if the person committed any crime after the illdate (or, if never ill, any crime in that yearexam), and 0 otherwise. A third variable holds the total number of crimes committed after the illdate across all years. If a person appears as committing two crimes on exactly the same date, I count that as just 1 crime. I would also like to create these counts per type of crime (and in this case, I would count all crimes, even if they happened on the same crimedate). Could you please help me?



    • #17
      I have also tried this:
      Code:
      clear all
      use ill2, clear
      
      * Generate a unique observation number for each row
      gen obs_no = _n
      
      * Create a new frame and put all data into it (regardless of illdate being missing)
      frame put _all illdate, into(working)
      
      * Work within the 'working' frame
      frame working {
          * Adjust the lower and upper bounds for the range join
          gen lower = cond(missing(illdate), date("01jan1960", "DMY"), illdate + 1)
          gen upper = cond(missing(illdate), date("31dec2050", "DMY"), illdate + 2190)
          format lower upper %td
      
          * Perform the range join to link with crimes that occurred within the specified period
          rangejoin crimedate lower upper using crimes2, by(id)
      
          * Count crimes that occurred within 365 days after illdate, considering missing illdate
          egen crimes_365 = total(!missing(crimedate) & (crimedate <= illdate + 365 | missing(illdate)))
          
          * Count crimes that occurred within 2190 days after illdate, considering missing illdate
          egen crimes_2190 = total(!missing(crimedate) & (crimedate <= illdate + 2190 | missing(illdate)))
          
          * Generate flags for any crime occurrence within the specified periods
          gen any_crime_365 = (crimes_365 > 0)
          gen any_crime_2190 = (crimes_2190 > 0)
      }
      It runs and I get this:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(id yearexam yearill monthill dayill score illdate obs_no)
      1 2000 2000 11 3 100 14917 1
      1 2001    .  . . 150     . 2
      1 2002    .  . . 111     . 3
      2 2000    .  . . 123     . 4
      2 2001 2001  5 2 200 15097 5
      2 2002    .  . . 214     . 6
      2 2003    .  . . 203     . 7
      2 2004    .  . . 302     . 8
      2 2005    .  . . 136     . 9
      end
      format %td illdate
      But when I run this:
      Code:
      frlink 1:1 obs_no, frame(working)
      I get this error message:
      invalid match variables for 1:1 or m:1 match
      The variable you specified for matching does not uniquely identify the
      observations in frame working. Each observation in the current frame
      default must link to one observation in working.

      and this:

      Code:
      frget crimes_365 crimes_2190 any_crime_365 any_crime_2190, from(working)
      variable working not found
      (error in option from())
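
      A possible reading of both errors, assuming -rangejoin- matched at least one observation to more than one crime: after the join, the working frame holds one row per observation-crime pair, so obs_no no longer uniquely identifies rows there and -frlink 1:1- refuses the link. The second error then follows from the first, because -frget- expects the name of a link variable in from(), and the failed -frlink- never created a variable called working. The -egen- calls above also have no by-prefix, so they total over the whole frame rather than per observation. The sketch below is one way to repair the linking only, reusing the by obs_no pattern from #16; the counting expressions are carried over unchanged from the attempt above, and it does not produce the per-year counts requested in #16.
      Code:
      clear all
      use ill2, clear
      gen `c(obs_t)' obs_no = _n
      frame put _all, into(working)
      frame working {
          gen lower = cond(missing(illdate), mdy(1, 1, 1960), illdate + 1)
          gen upper = cond(missing(illdate), mdy(12, 31, 2050), illdate + 2190)
          format lower upper %td
          rangejoin crimedate lower upper using crimes2, by(id)
      
          * Total within each original observation, not over the whole frame
          by obs_no, sort: egen crimes_365 = total(!missing(crimedate) ///
              & (crimedate <= illdate + 365 | missing(illdate)))
          by obs_no: egen crimes_2190 = total(!missing(crimedate) ///
              & (crimedate <= illdate + 2190 | missing(illdate)))
          gen byte any_crime_365 = crimes_365 > 0
          gen byte any_crime_2190 = crimes_2190 > 0
      
          * Reduce to one row per obs_no so that a 1:1 link is possible
          by obs_no: keep if _n == 1
      }
      frlink 1:1 obs_no, frame(working)
      frget crimes_365 crimes_2190 any_crime_365 any_crime_2190, from(working)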



      • #18
        I'm still not sure I understand exactly what you want to do here. But perhaps it is this:
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float(id yearcrime monthcrime daycrime typecrime crimedate)
        1 2000  8 15 1 14837
        1 2000  8 15 2 14837
        1 2000 12 25 3 14969
        1 2002  3  3 1 15402
        1 2003  5 31 4 15856
        2 2001  7  8 2 15164
        2 2002  9 12 1 15595
        3 2000  6 24 4 14785
        4 2002  3 17 1 15416
        4 2002  4 15 2 15445
        5 2003  2  2 3 15738
        6 2000  1  9 4 14618
        6 2003 10 10 1 15988
        6 2004  2  3 3 16104
        6 2005  9  9 2 16688
        9 2002  3  6 3 15405
        end
        format %td crimedate
        assert yearcrime == year(crimedate)
        keep id yearcrime crimedate
        rename yearcrime year
        duplicates drop
        tempfile crimes
        save `crimes'
        
        
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float(id yearexam yearill monthill dayill score illdate)
         1 2000 2000 11  3 100 14917
         1 2001    .  .  . 150     .
         1 2002    .  .  . 111     .
         2 2000    .  .  . 123     .
         2 2001 2001  5  2 200 15097
         2 2002    .  .  . 214     .
         2 2003    .  .  . 203     .
         2 2004    .  .  . 302     .
         2 2005    .  .  . 136     .
         3 2001 2001  7  6 222 15162
         4 2001    .  .  . 158     .
         4 2002    .  .  . 178     .
         4 2003    .  .  . 228     .
         4 2004    .  .  . 311     .
         5 2000    .  .  . 197     .
         5 2001 2001  2 17 106 15023
         5 2002    .  .  . 147     .
         5 2003    .  .  . 241     .
         5 2004    .  .  . 299     .
         6 2002 2002  3  3 321 15402
         6 2003    .  .  . 139     .
         6 2004    .  .  . 284     .
        13 2002 2002  5 23 123 15483
        13 2005    .  .  . 214     .
        end
        format %td illdate
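        * Spread each person's (earliest non-missing) illness date to all of that person's observations
        by id (illdate), sort: replace illdate = illdate[1]
        * Window start: the later of the day after illness and 1 Jan of the exam year
        * (max() ignores a missing illdate, so never-ill persons start on 1 Jan)
        gen lower = max(illdate+1, mdy(1, 1, yearexam))
        * Window end: 31 Dec of the exam year
        gen upper = mdy(12, 31, yearexam)
        rename yearexam year
        
        * -rangejoin- is a community-contributed command (SSC)
        rangejoin crimedate lower upper using `crimes', by(id year)
        
        * Count matched crimes per person-year; person-years with no match get 0
        collapse (count) n_crimes_this_year_post_illness = crimedate ///
            (first) illdate score, by(id year)
        format n_crimes_this_year_post_illness %1.0f
        * Total over all years for each person
        by id (year), sort: egen n_crimes_post_illness = ///
            total(n_crimes_this_year_post_illness)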
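
        The per-type counts requested in #16 are not covered by the code above, which drops typecrime and removes same-day duplicates. A rough sketch of one way to get them follows. It assumes the crime data from the first -input- block are in memory for the first fragment, that the illness data have been prepared through -rename yearexam year- for the second, and that crime types run from 1 to 4 as in the example data. The names crimes_by_type, type1-type4 and n_type1-n_type4 are made up for illustration.
        Code:
        * Fragment 1: crimes file that keeps typecrime and every record
        keep id yearcrime typecrime crimedate
        rename yearcrime year
        tempfile crimes_by_type
        save `crimes_by_type'
        
        * Fragment 2: run in place of the -rangejoin- and -collapse- steps above
        rangejoin crimedate lower upper using `crimes_by_type', by(id year)
        forvalues t = 1/4 {
            * 1 if this matched crime is of type t; 0 otherwise, including no match
            gen byte type`t' = typecrime == `t'
        }
        collapse (sum) n_type1 = type1 n_type2 = type2 ///
            n_type3 = type3 n_type4 = type4 ///
            (first) illdate score, by(id year)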



        • #19
          Thank you, Prof. Clyde. It worked. The problem is that I want to exclude obs. that committed crimes before getting ill, and I also have to deal with the fact that some people committed crimes but never got ill. So I am trying to work around it somehow. Many thanks!



          • #20
            This code does exclude any crimes committed before illness. Do you mean you want to exclude id's entirely if they have any crimes prior to illness?

            The following code will do that. I don't know if it is properly handling those id's who appear only in one of the data sets and not the other. The code keeps those observations in the data set. If that's not what you want, change the specification of the -unmatched()- option in the -joinby- command. See -help joinby- for how to do that for the way that you want the result to be.

            Code:
            by id (illdate), sort: replace illdate = illdate[1]
            gen lower = max(illdate+1, mdy(1, 1, yearexam))
            gen upper = mdy(12, 31, yearexam)
            rename yearexam year
            
            joinby id year using `crimes', unmatched(both)
            by id (year), sort: egen to_drop = max(crimedate <= illdate & !missing(illdate))
            drop if to_drop
            
            collapse (count) n_crimes_this_year_post_illness = crimedate ///
                (first) illdate score, by(id year)
            format n_crimes_this_year_post_illness %1.0f
            by id (year), sort: egen n_crimes_post_illness = ///
                total(n_crimes_this_year_post_illness)
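
            For reference, the -unmatched()- option of -joinby- accepts none, both, master, or using. Which one is right depends on what the analysis needs, and the thread does not settle that. As an illustration only, the following variant of the -joinby- line keeps every person-year from the illness data (the data in memory) and discards crime records that have no matching id and year there:
            Code:
            * Keep unmatched observations from the data in memory only
            joinby id year using `crimes', unmatched(master)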



            • #21
              Thank you very much!
              Sorry, I didn't mean excluding obs, but just not counting the crimes before getting ill.
              Thank you so much!!
