Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating measures of alcohol consumption (drinks/week, g of ethanol per day etc)

    Hello Statalisters,

    I am trying to calculate cumulative measures of alcohol consumption in my l dataset but its getting a bit complicated than smoking measures calculation. I have 9 episodes of drinking habit captured on all participants over their whole life in my data set (Sample data set given below) , with data on

    a) Type of beverage they have used (1=wine, 2=beer, 3= hard liq, 4= Aperitif),
    b) Start age of consumption in each episode, stop age of consumption in each episode,
    c) Unit of alcohol consumed (1= 50 ml, 2= 100ml, 3=250ml, 4=330 ml, 5=725ml),
    d) Consumption (how many units) ,
    e) Frequency (1= per day, 2=per week, 3= per month).

    Finally I want to calculate multiple measures of alcohol from this data. I want Number of Standard drinks of ethanol a) per day, b) per week, c) whole life , all in ml. I want number of grams of ethanol a) per day, b) per week, c) whole life. 1 Standard drink = 17.05 ml of ethanol or 13.5 g of ethanol. There is 0.789 g of ethanol per ml. It will be great if I could get these measures for all a) types of beverages combined b) each beverage separately (wine, beer, hard liq, Aperitif).

    Codes that I tried out and Sample dataset with data on 11 participants are given below.

    Two the scenarios that can complicate this calculation are:
    a) A person might use multiple types of beverages and hence might have overlapping age periods (from and to ages) among any of the 9 episodes (e.g., 001-2, 1st episode is charecterised by this person drinking 2 medium glasses of wine/ month from 20 to 46 yrs of age. 2nd episode is characterised by 3 medium glasses of wine/week between 47 to 62 yrs. 3rd episode, 5 medium glasses of wine/week from 63 to 65 years. Now in between, this person has used 5 medium glasses of Aperitif per week between 60 and 65 years as well (4th episode).

    b) A person might have discontinuous episodes of drinking over life time (e.g., 004-1, episode 1 is from 18 to 29 years. then this person doesn't drink between 30 and 34 years. restarts drinking from 35 years. Such periods of abstinence (30-34 years) has to be deducted from total years of usage .

    (The original data was in wide format. I have converted into long format for easier calculation)

    Code:
    /RESHAPING WIDE TO LONG
    reshape long @_bev_type @_from @_to @_unit @_Howmanyunit @_per, i(id) j(episode) // converting wide data to long
    mvdecode bev_type from to unit Howmanyunit per, mv (88 888)  // converting 88 and 888 to missing values
    recode bev_type from to unit Howmanyunit per(. = 0) // converting all missing values to 0's
    
    assert inlist(bev_type, 1, 2, 3, 4) if !missing(bev_type)
    gen bev_type
    label define type    1    "Wine" ///
                        2    "Beer" ///
                        3    "Hard liquor" ///
                        4    "Aperitif" ///
           
    label values type type
    drop _type
    rename _* *
    
    /  EXPANDING DATA TO ONE OBSERVATION PER YEAR
    assert missing(from, to) if unit == 0
    drop if unit== 0 // These have no information on duration
    gen long obs_no = _n
    expand to - from + 1
    by obs_no, sort: gen current_age = from + _n - 1
    by obs_no (current_age), sort: assert current_age[_N] == to
    I am stuck after this how to calculate alcohol equivalency, convert per day, per week per month to per day, collapsing summary record to each person etc

    Any help is appreciated.



    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str5 id byte(episode bev_type) int(from to) byte(unit Howmanyunits per)
    "001-2" 1 1 20 46 2  2 3
    "001-2" 2 1 47 62 2  3 2
    "001-2" 3 1 63 65 2  5 2
    "001-2" 4 4 60 65 2  5 2
    "001-2" 5 0  0  0 0  0 0
    "001-2" 6 0  0  0 0  0 0
    "001-2" 7 0  0  0 0  0 0
    "001-2" 8 0  0  0 0  0 0
    "001-2" 9 0  0  0 0  0 0
    "002-2" 1 0  0  0 0  0 0
    "002-2" 2 0  0  0 0  0 0
    "002-2" 3 0  0  0 0  0 0
    "002-2" 4 0  0  0 0  0 0
    "002-2" 5 0  0  0 0  0 0
    "002-2" 6 0  0  0 0  0 0
    "002-2" 7 0  0  0 0  0 0
    "002-2" 8 0  0  0 0  0 0
    "002-2" 9 0  0  0 0  0 0
    "003-2" 1 2 20 43 4  6 1
    "003-2" 2 0  0  0 0  0 0
    "003-2" 3 0  0  0 0  0 0
    "003-2" 4 0  0  0 0  0 0
    "003-2" 5 0  0  0 0  0 0
    "003-2" 6 0  0  0 0  0 0
    "003-2" 7 0  0  0 0  0 0
    "003-2" 8 0  0  0 0  0 0
    "003-2" 9 0  0  0 0  0 0
    "004-1" 1 1 18 29 2  3 2
    "004-1" 2 1 35 72 2  4 2
    "004-1" 3 3 40 72 2  2 2
    "004-1" 4 0  0  0 0  0 0
    "004-1" 5 0  0  0 0  0 0
    "004-1" 6 0  0  0 0  0 0
    "004-1" 7 0  0  0 0  0 0
    "004-1" 8 0  0  0 0  0 0
    "004-1" 9 0  0  0 0  0 0
    "004-2" 1 1 25 56 2  1 1
    "004-2" 2 2 25 56 4  1 2
    "004-2" 3 0  0  0 0  0 0
    "004-2" 4 0  0  0 0  0 0
    "004-2" 5 0  0  0 0  0 0
    "004-2" 6 0  0  0 0  0 0
    "004-2" 7 0  0  0 0  0 0
    "004-2" 8 0  0  0 0  0 0
    "004-2" 9 0  0  0 0  0 0
    "007-2" 1 1 28 33 2  5 2
    "007-2" 2 1 34 50 2 10 2
    "007-2" 3 1 51 67 2 15 2
    "007-2" 4 1 68 69 2  5 2
    "007-2" 5 2 28 58 4  1 1
    "007-2" 6 2 59 69 4  2 2
    "007-2" 7 0  0  0 0  0 0
    "007-2" 8 0  0  0 0  0 0
    "007-2" 9 0  0  0 0  0 0
    "008-1" 1 2 34 55 4  1 1
    "008-1" 2 2 56 59 4  3 1
    "008-1" 3 2 20 33 4 15 2
    "008-1" 4 1 26 30 2  2 3
    "008-1" 5 0  0  0 0  0 0
    "008-1" 6 0  0  0 0  0 0
    "008-1" 7 0  0  0 0  0 0
    "008-1" 8 0  0  0 0  0 0
    "008-1" 9 0  0  0 0  0 0
    "008-2" 1 0  0  0 0  0 0
    "008-2" 2 0  0  0 0  0 0
    "008-2" 3 0  0  0 0  0 0
    "008-2" 4 0  0  0 0  0 0
    "008-2" 5 0  0  0 0  0 0
    "008-2" 6 0  0  0 0  0 0
    "008-2" 7 0  0  0 0  0 0
    "008-2" 8 0  0  0 0  0 0
    "008-2" 9 0  0  0 0  0 0
    "009-2" 1 1 20 40 2  4 2
    "009-2" 2 2 20 49 4  4 2
    "009-2" 3 1 41 72 2  1 1
    "009-2" 4 3 50 65 1  1 1
    "009-2" 5 0  0  0 0  0 0
    "009-2" 6 0  0  0 0  0 0
    "009-2" 7 0  0  0 0  0 0
    "009-2" 8 0  0  0 0  0 0
    "009-2" 9 0  0  0 0  0 0
    "010-1" 1 2 16 18 4  8 2
    "010-1" 2 2 19 41 4  2 2
    "010-1" 3 0  0  0 0  0 0
    "010-1" 4 0  0  0 0  0 0
    "010-1" 5 0  0  0 0  0 0
    "010-1" 6 0  0  0 0  0 0
    "010-1" 7 0  0  0 0  0 0
    "010-1" 8 0  0  0 0  0 0
    "010-1" 9 0  0  0 0  0 0
    "010-2" 1 1 22 76 3  2 2
    "010-2" 2 0  0  0 0  0 0
    "010-2" 3 0  0  0 0  0 0
    "010-2" 4 0  0  0 0  0 0
    "010-2" 5 0  0  0 0  0 0
    "010-2" 6 0  0  0 0  0 0
    "010-2" 7 0  0  0 0  0 0
    "010-2" 8 0  0  0 0  0 0
    "010-2" 9 0  0  0 0  0 0
    "011-2" 1 2 15 18 4  5 2
    "011-2" 2 2 19 50 4  3 1
    "011-2" 3 1 21 50 2  2 1
    "011-2" 4 0  0  0 0  0 0
    "011-2" 5 0  0  0 0  0 0
    "011-2" 6 0  0  0 0  0 0
    "011-2" 7 0  0  0 0  0 0
    "011-2" 8 0  0  0 0  0 0
    "011-2" 9 0  0  0 0  0 0
    end

  • #2
    So I think what you need is to do is first convert all of the values of Howmanyunits to their daily equivalents:

    Code:
    replace Howmanyunits = Howmanyunits * 7 if per == 2
    replace Howmanyunits = Howmanyunits * 30 if per == 3
    drop per
    Next you need to convert all of the units to a single denomination, probably 50 ml.

    Code:
    replace Howmanyunits = Howmanyunits * 2 if unit == 2
    replace Howmanyunits = Howmanyunits * 5 if unit == 3
    replace Howmanyunits = Howmanyunits * 330/50 if unit == 4
    replace Howmanyunits = Howmanyunits * 725/50 if unit == 5
    drop unit
    At this point you have calculated how many 50ml aliquots of beverage are consumed per day.

    The final step is to calculate grams of alcohol. That cannot be done with the information provided in your post. You need conversion factors to say how many grams of alcohol are in 50 ml of wine, how many in 50 ml of beer, etc. Just for the sake of example, if 50 ml of wine contains 5.2 g of alcohol (just a made up number):

    Code:
    gen gm_etoh_per_day = 5.2*Howmanyunits if bev_type == "Wine":type
    and a similar command for each beverage type using the appropriate number of grams of alcohol per 50 ml. If you don't know what those conversion factors are, I'm confident you can find them somewhere on the internet.

    Comment


    • #3
      Hi Clyde,
      Thank you for the quick reply. I was thinking of using the conversion factor based on number of standard drinks. As i mentioned in #1, 1 standard drink is 17.05 ml of ethanol =13.5g of ethanol. So I can calculate number of standard drinks of ethanol / day or per week, right?
      Also how can i calculate total duration of active alcohol consumption over lifetime? (taking into consideration there will be over laps in episodes, periods of abstinence etc) ?

      Comment


      • #4
        So I can calculate number of standard drinks of ethanol / day or per week, right?
        Yes, it's just a matter of finding the appropriate conversion factors between 50ml of wine, beer, etc. and standard drinks.

        Also how can i calculate total duration of active alcohol consumption over lifetime? (taking into consideration there will be over laps in episodes, periods of abstinence etc) ?
        To simplify it, I would first reduce my file to one containing the id, episode number, and to and from. From there it's a matter of organizing the data chronologically and creating spells of drinking, then adding up the durations of those spells.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str5 id byte episode int(from to)
        "001-2" 1 20 46
        "001-2" 2 47 62
        "001-2" 3 63 65
        "001-2" 4 60 65
        "001-2" 5  0  0
        "001-2" 6  0  0
        "001-2" 7  0  0
        "001-2" 8  0  0
        "001-2" 9  0  0
        "002-2" 1  0  0
        "002-2" 2  0  0
        "002-2" 3  0  0
        "002-2" 4  0  0
        "002-2" 5  0  0
        "002-2" 6  0  0
        "002-2" 7  0  0
        "002-2" 8  0  0
        "002-2" 9  0  0
        "003-2" 1 20 43
        "003-2" 2  0  0
        "003-2" 3  0  0
        "003-2" 4  0  0
        "003-2" 5  0  0
        "003-2" 6  0  0
        "003-2" 7  0  0
        "003-2" 8  0  0
        "003-2" 9  0  0
        "004-1" 1 18 29
        "004-1" 2 35 72
        "004-1" 3 40 72
        "004-1" 4  0  0
        "004-1" 5  0  0
        "004-1" 6  0  0
        "004-1" 7  0  0
        "004-1" 8  0  0
        "004-1" 9  0  0
        "004-2" 1 25 56
        "004-2" 2 25 56
        "004-2" 3  0  0
        "004-2" 4  0  0
        "004-2" 5  0  0
        "004-2" 6  0  0
        "004-2" 7  0  0
        "004-2" 8  0  0
        "004-2" 9  0  0
        "007-2" 1 28 33
        "007-2" 2 34 50
        "007-2" 3 51 67
        "007-2" 4 68 69
        "007-2" 5 28 58
        "007-2" 6 59 69
        "007-2" 7  0  0
        "007-2" 8  0  0
        "007-2" 9  0  0
        "008-1" 1 34 55
        "008-1" 2 56 59
        "008-1" 3 20 33
        "008-1" 4 26 30
        "008-1" 5  0  0
        "008-1" 6  0  0
        "008-1" 7  0  0
        "008-1" 8  0  0
        "008-1" 9  0  0
        "008-2" 1  0  0
        "008-2" 2  0  0
        "008-2" 3  0  0
        "008-2" 4  0  0
        "008-2" 5  0  0
        "008-2" 6  0  0
        "008-2" 7  0  0
        "008-2" 8  0  0
        "008-2" 9  0  0
        "009-2" 1 20 40
        "009-2" 2 20 49
        "009-2" 3 41 72
        "009-2" 4 50 65
        "009-2" 5  0  0
        "009-2" 6  0  0
        "009-2" 7  0  0
        "009-2" 8  0  0
        "009-2" 9  0  0
        "010-1" 1 16 18
        "010-1" 2 19 41
        "010-1" 3  0  0
        "010-1" 4  0  0
        "010-1" 5  0  0
        "010-1" 6  0  0
        "010-1" 7  0  0
        "010-1" 8  0  0
        "010-1" 9  0  0
        "010-2" 1 22 76
        "010-2" 2  0  0
        "010-2" 3  0  0
        "010-2" 4  0  0
        "010-2" 5  0  0
        "010-2" 6  0  0
        "010-2" 7  0  0
        "010-2" 8  0  0
        "010-2" 9  0  0
        "011-2" 1 15 18
        end
        
        drop if from == 0 & to == 0
        assert from <= to
        rename from timefrom
        rename to timeto
        reshape long time, i(id episode) j(fromto) string
        sort id time
        by id (time): gen inspell = sum(fromto == "from") - sum(fromto == "to")
        replace inspell = 1 if inspell > 1
        by id (time): assert inspell == 0 if _n == _N
        
        by id (time): gen spell_num = sum(inspell == 1 & inspell[_n-1] != 1)
        by id spell_num (time), sort: gen spell_duration = time[_N] - time[1] if _n == 1
        collapse (sum) total_duration = spell_duration, by(id)

        Comment


        • #5
          Thankyou Clyde!

          Also there was an issue of ethanol equivalency I mentioned in #1. What I forgot to mention is the ethanol equivalency for each type of alcohol. Ethanol concentration is 5% for beer, 10% for wine and aperitif and 50% for hardliquor. My apologies for that but its an important point I missed out for converting each type to ethanol. And I am stuck.

          In total what I have ran till now ( using what you suggested) on the data in #1 is (excluding calculation of duration)

          Code:
          reshape long bev_type from to unit Howmanyunits per, i(id) j(episode)  // converting wide data to long
          mvdecode bev_type from to unit Howmanyunits per,, mv (88 888)  // converting 88 and 888 to missing values
          recode bev_type from to unit Howmanyunits per,(. = 0)
          
          *First convert all of the values of Howmanyunits to their daily equivalents:
          replace Howmanyunits = Howmanyunits * 7 if per == 2
          replace Howmanyunits = Howmanyunits * 30 if per == 3
          drop per
          
          *Need to convert all of the units to a single denomination, probably 50 ml.
          replace Howmanyunits = Howmanyunits * 2 if unit == 2
          replace Howmanyunits = Howmanyunits * 5 if unit == 3
          replace Howmanyunits = Howmanyunits * 330/50 if unit == 4
          replace Howmanyunits = Howmanyunits * 725/50 if unit == 5
          drop unit
          
          *At this point you have calculated how many 50ml aliquots of beverage are consumed per day.
          
          gen total_ml_perday =g03alc_many*50 //  total ml consumed per day=

          But in between I have to fit in the ethanol equivalency, as I mentioned above. Otherwise my cumulative measures will be wrong. Can you help me with this?

          I gave a try with the following code with the assert inlist (for creating equivalency) but its giving me errors.


          Code:
          assert inlist(bev_type, 1, 2, 3, 4) if !missing(bev_type)
          gen bev= bev_type
          label define type    1    "Wine" ///
                              2    "Beer" ///
                              3    "Hard liquor" ///
                              4    "Aperitif" ///
                 
          label values bev _bev

          Also,finally, after I calculate the cumulative measures (e.g., using appropriate ethanol in g, number of standard drinks / day or week conversions) I have to collapse data to each observation to merge the data set with the original wide data.

          Comment


          • #6
            Hi Clyde,
            I think I fixed the issues mentioned in #5. But I ran into some issues with the codes you suggested in #4 for duration.

            For 001-2, comparing all episodes,this person used alcohol continuously from 20 to 65 yrs of age (Epi1= 20,46, Epi2=47-62, Epi3=63-65, Epi4=60-65). Episode 4 is within Episode 2 and 3. My hand calculations gives the duration to be 65-20=45 yrs. Your codes gives me 44. Issue

            For 003-2, hand calculation gives me 43-20=23. yours gives me 23. No issues.

            For 004-1: (29-18) + (72-35)=48. Yours gives me 48. No issues.

            004-2: 56-25=31. Yours = 31. No issues
            007-2: This person, irrespective of beverage type, used alcohol continuously from 28 to 69 years. So 69-28=41 years. Yours = 41 yrs. No issue
            008-1: 34-55, 56 to 59, 20-33, 26 - 30. So effectively 59-20= 39 yrs. yours gives 37. Issue
            009-2 : 52 yrs and 52 years. no issues
            010-1: 16-18, 19-41, 22-76. So effectively from 16 to 76=60 yrs. your calculation gives me 54 yrs. Issue
            011-2 3 yrs vs 3 yrs: no issues

            Whats causing this inconsistency between hand calculation and your suggested codes for some but not all observations?

            Comment


            • #7
              The difference is in the interpretation of from and to. Let's look at 001-2 for an example. You are saying this person used alcohol continuously from 20 to 65, which you then call a total of 45 years. But that is not what I see in the data. I see a person who uses alcohol continuously from 20 up to 46 (26 years), then abstains for a year, begins again at 47 and continues up to 65 (18 years), for a total of 26 + 18 = 44. All of our discrepancies can be explained in this way.

              Now, you generated this data (or got it from somebody who did) so you would be best positioned to interpret these issues. But let me point out that if 20-46 followed by 47-65 does not mean that at age 46 the person was abstinent (i.e., if the from-to pair includes both the from year and the to year), then calculating the duration by subtracting to - from is incorrect. If both endpoints are included, then drinking from 20 through 65 is (65-20) + 1 = 46 years. You can decide that "to" is included, or you can decide that it isn't, but you can't have it both ways.
              Last edited by Clyde Schechter; 14 Jul 2016, 07:42.

              Comment


              • #8
                By the way, if, in the end you decide that from and to are both included, then the fix to my code is simple. After the command -assert from <= to-, just insert

                Code:
                replace to = to + 1

                Comment


                • #9
                  Thanks a lot for the explanation Clyde. From the way the data was collected. I would want the end points to be included. So it will be duration = to-from+1, right ? Where can we make this change in the code? At present, in total what I am doing to get the total ethanol and duration of consumption (starting from converting wide to long data is calculate total ethanol (as in #2) per observation, and duration per observation (as in #4) separately and finally merge both Stata files.Although it works, its a longer coding and I guess there will be a way we can do this both in a single file (than merge both)..

                  Comment


                  • #10
                    Sorry Clyde, I didn't see you message #8. Thanks

                    Comment

                    Working...
                    X