  • calculating cumulative exposure

    Hi Stata users,

    I would appreciate your help with the problem below.

    Data set 1: life history of occupations
    id (individual)    occupation category    year start    year end
    C1                 1                      1945          1950
    C1                 2                      1950          1962
    C2                 1                      1965          1970
    C2                 2                      1971          1978
    C2                 3                      1979          1989

    Data set 2: occupations and gas
    occupation category    probability of exposure (%)    mean level of exposure    period
    1                      70                             0.1                       1945-1959
    1                      68                             0.5                       1960-1974
    1                      68                             0.5                       1975-1984
    2                      40                             0.5                       1945-1959
    2                      65                             0.5                       1960-1974
    2                      70                             0.1                       1975-1984
    3                      40                             0.5                       1945-1959
    3                      55                             0.2                       1960-1974
    3                      24                             0.3                       1975-1984

    I want to merge data set 1 with data set 2 and then write commands that generate an ever/never exposed to gas variable and a lifetime cumulative exposure to gas variable.


    Being ever exposed to gas is defined as having held at least one occupation with a probability of exposure of more than 50%. Individuals who worked only in occupations with a probability of exposure of less than 50% are considered unexposed.

    Lifetime exposure should be calculated for each time period as the product of the probability of exposure, the mean level of exposure, and the duration of the occupation within that time period. These per-period exposures are then summed over all time periods for each person.
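
    Written out as a formula, the definition above might look like the sketch below. The symbols are introduced here purely for illustration, and two assumptions are made that the description does not fix: duration is counted in calendar years, inclusive of both the start and the end year, and the probability is expressed as a proportion (70% = 0.70).

```latex
% E_i      : lifetime cumulative exposure of person i
% J_i      : the job spells of person i
% o(j)     : occupation category of spell j
% P, M     : probability and mean level of exposure for that occupation in period p
% d_{j,p}  : number of years that spell j overlaps period p
\[
E_i \;=\; \sum_{j \in J_i} \sum_{p} P_{o(j),p} \, M_{o(j),p} \, d_{j,p}
\]

% Worked example for C1 under the assumptions stated above:
% occupation 1, 1945 to 1950 (6 years, period 1945-1959)
% occupation 2, 1950 to 1959 (10 years, period 1945-1959)
% occupation 2, 1960 to 1962 (3 years, period 1960-1974)
\[
E_{\mathrm{C1}} = 0.70 \times 0.1 \times 6 \;+\; 0.40 \times 0.5 \times 10 \;+\; 0.65 \times 0.5 \times 3
               = 0.42 + 2.00 + 0.975 = 3.395
\]
```

    Note that 1950 is counted in both of C1's jobs under the inclusive convention. With the probability left on the 0 to 100 scale, the same sum is 100 times larger.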

    Many thanks for your help.

  • #2
    I think this does it:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str2 id byte occupation_category int(year_start year_end)
    "C1" 1 1945 1950
    "C1" 2 1950 1962
    "C2" 1 1965 1970
    "C2" 2 1971 1978
    "C2" 3 1979 1989
    end
    
    //    PREPARE THIS DATA FOR JOINING WITH GAS DATA
    gen obs_no = _n
    expand year_end - year_start + 1
    by obs_no, sort: gen year = year_start + _n  - 1
    by obs_no: assert year_end == year[_N]
    drop year_start year_end 
    duplicates drop
    sort id occupation_category year
    tempfile occ_histories
    save `occ_histories'
    
    //    NOW PREPARE THE OCCUPATION-GAS DATA FOR JOINING WITH OCCUPATIONAL HISTORIES
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(occupation_category prob_exposure) float mean_exposure str9 period
    1 70 .1 "1945-1959"
    1 68 .5 "1960-1974"
    1 68 .5 "1975-1984"
    2 40 .5 "1945-1959"
    2 65 .5 "1960-1974"
    2 70 .1 "1975-1984"
    3 40 .5 "1945-1959"
    3 55 .2 "1960-1974"
    3 24 .3 "1975-1984"
    end
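    //    SPLIT "1945-1959" INTO NUMERIC year1 (START) AND year2 (END)
    split period, gen(year) parse("-") destring
    drop period
    
    //    NOW JOIN THE TWO DATA SETS
    //    EACH OCCUPATION-PERIOD IS PAIRED WITH THE PERSON-YEARS IN [year1, year2]
    rangejoin  year year1 year2 using `occ_histories', by(occupation_category)
    //    DROP OCCUPATION-PERIODS THAT MATCHED NO PERSON-YEAR
    drop if missing(year)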
    
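    //    NOW CALCULATE DESIRED RESULTS
    //    prob_exposure IS A PERCENTAGE (24 TO 70), SO THE THRESHOLD IS 50, NOT 0.5
    by id, sort: egen ever_exposed = max(prob_exposure >= 50)
    //    ONE OBSERVATION PER PERSON-YEAR, SO THE TOTAL IS WEIGHTED BY DURATION
    //    (DIVIDE BY 100 IF YOU WANT THE PROBABILITY ON THE 0-1 SCALE)
    by id: egen lifetime_exposure = total(prob_exposure*mean_exposure)
    
    //    IF YOU WANT TO GO BACK TO THE ORIGINAL DATA LAYOUT WITH
    //    ONE OBSERVATION PER INTERVAL, THEN DO THIS:
    //    (id AND occupation_category GO IN by() SO THAT -collapse- KEEPS THEM)
    collapse (min) year_start = year (max) year_end = year (first) ever_exposed ///
        lifetime_exposure, by(id occupation_category obs_no)
    Notes:
    1. Requires Robert Picard's -rangejoin- command, which you can get by running -ssc install rangejoin-.
    2. You defined ever-exposed as ever working in an occupation category with a probability of exposure of more than 50%. In your example data the probability is recorded as a percentage, so the code compares it with 50 rather than 0.5. Your definition does not say how to classify a probability of exactly 50%; none occurs in the example data, and I took the liberty of treating it as exposed (greater than or equal to 50). If that's not right, change the code accordingly.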
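
    If installing -rangejoin- is not an option, the same interval join can be sketched with the built-in -joinby- command followed by an -inrange()- filter. This is a possible alternative for illustration, not what the code above does: it forms every occupation-period by person-year pair within an occupation category before filtering, so it is only reasonable for small data sets. The tempfile name person_years is made up for this sketch, and the periods are typed in as two numeric variables rather than split from a string.

```stata
* Sketch: interval join with built-in commands only (no -rangejoin-)
clear
input str2 id byte occupation_category int(year_start year_end)
"C1" 1 1945 1950
"C1" 2 1950 1962
"C2" 1 1965 1970
"C2" 2 1971 1978
"C2" 3 1979 1989
end

* One observation per person-year, as in the code above
gen long obs_no = _n
expand year_end - year_start + 1
by obs_no, sort: gen int year = year_start + _n - 1
drop year_start year_end
* person_years is a hypothetical name used only in this sketch
tempfile person_years
save `person_years'

clear
input byte(occupation_category prob_exposure) float mean_exposure int(year1 year2)
1 70 .1 1945 1959
1 68 .5 1960 1974
1 68 .5 1975 1984
2 40 .5 1945 1959
2 65 .5 1960 1974
2 70 .1 1975 1984
3 40 .5 1945 1959
3 55 .2 1960 1974
3 24 .3 1975 1984
end

* Form all pairs within occupation category, then keep only the
* person-years that fall inside the period [year1, year2]
joinby occupation_category using `person_years'
keep if inrange(year, year1, year2)

* Person-years outside every period (here 1985 to 1989) drop out
sort id year occupation_category
list id occupation_category year prob_exposure mean_exposure in 1/5, noobs
```

    Because the periods in the example do not overlap, each surviving person-year is matched to exactly one occupation-period row, so the exposure calculations can be run on the result unchanged.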

    In the future, please do not use HTML tables to show example data: it can be very difficult to import these into Stata. Moreover, your table column headers are not even legal Stata variable names. If you are going to work in Stata, you must bring your data into Stata data sets first. So you should show us the data as you have it in Stata. The best way to do that is with the -dataex- command, which you can install by running -ssc install dataex-. It is very easy to use: for instructions, run -help dataex-. This is what I have done with your example data: notice that with a simple copy and paste operation you can immediately and faithfully replicate the data exactly as I had them in my Stata.
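
    As a minimal illustration of that workflow, assuming -dataex- is installed and the occupation data are in memory under the variable names used in the code above, the call could look like this:

```stata
* Recreate the example occupation data, then ask -dataex- to print it
* in a form that can be pasted straight into a forum post
clear
input str2 id byte occupation_category int(year_start year_end)
"C1" 1 1945 1950
"C1" 2 1950 1962
"C2" 1 1965 1970
"C2" 2 1971 1978
"C2" 3 1979 1989
end

* List the variables to share; -dataex- writes the listing to the Results window
dataex id occupation_category year_start year_end
```

    The printed block starts with the same -clear- and -input- lines seen at the top of the code above, so anyone can copy it and recreate the data.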


    • #3
      Thanks a lot for your feedback.
