  • Making a health index with probit models and normalization

    Hi.

    I am trying to construct a health index. My data, shown below, are panel data sorted by id. I have one ordered variable taking values 1-5, where 1 is "excellent" health and 5 is "poor" health.

    The rest of the variables are binary, taking the value 1 if the respondent has the condition and 0 if the respondent does not. I am trying to do the following:

    1. Estimate ten probit models, each with a different health condition on the left-hand side and the remaining nine health conditions on the right-hand side.

    2. For each probit model and each observation, generate a prediction of the dependent variable.

    3. For each probit model, normalize the prediction to lie between zero and one, where outcomes closer to one indicate better health.

    4. For each observation, average across all ten predictions to calculate the observation's final health index.
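
    The four steps above could be sketched roughly as follows for the nine binary conditions. This is only an illustration under assumptions: the variable names are taken from the data example in this post, the names p_*, n_* and health_index are hypothetical, and entering Self_reported_health as a factor variable on the right-hand side is one possible choice, not a requirement. It presumes the full dataset, where every condition varies. In the 50-row excerpt, Cancer and Lung_problems are always 0, so probit would stop with an error for those two.

    ```stata
    * Hypothetical sketch: binary conditions only; p_*, n_* and health_index are illustrative names
    local conds High_blood_pressure Diabetes Cancer Lung_problems Heart_problems ///
        Stroke Psychological_problems Arthritis obese

    foreach y of local conds {
        * Step 1: probit of one condition on the remaining health variables
        local rhs : list conds - y
        probit `y' `rhs' i.Self_reported_health

        * Step 2: predicted probability of having the condition
        predict double p_`y', pr

        * Step 3: min-max scaling, reversed so that 1 = lowest predicted risk
        * (the result is missing if the prediction is constant, because r(max) = r(min))
        summarize p_`y', meanonly
        generate double n_`y' = (r(max) - p_`y') / (r(max) - r(min))
    }

    * Step 4: row average of every variable named n_*; a normalized prediction for
    * Self_reported_health would have to be created separately to reach ten components
    egen double health_index = rowmean(n_*)
    ```

    Because each n_* variable is one minus the usual min-max value, a respondent with the lowest predicted probability of a condition gets 1 for that component.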

    What I do is use
    Code:
    oprobit
    where the dependent variable is the 1-5 indicator, and
    Code:
    probit
    where the dependent variable is one of the binary variables. After each model I run the predict command to obtain the predicted probabilities.

    I then normalize the predictions using

    Code:
    su p_Self_reported_health, meanonly 
    gen normal_p_Self_reported_health = (p_Self_reported_health - r(min)) / (r(max) - r(min))
    I do this for all the probit models and then take the average of all 10 normalized predictions. My first question is whether this approach is correct.

    My second problem is that the normalization should be done so that values closer to one mean better health. Right now the ordered variable runs from 1 to 5, where 1 is excellent health and 5 is poor health, and each binary variable is 0 if the respondent does not have the condition and 1 if the respondent does. How do I make sure that the values closest to one correspond to the healthiest respondents?
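
    For the 1-5 variable, one possible way to get this orientation is sketched below. It is an assumption for illustration, not an established recipe: it predicts the linear index after oprobit instead of a single outcome probability, and the names xb_srh and n_srh are hypothetical. Because 5 is "poor", a larger linear index points to worse health, so the scaling subtracts from the maximum.

    ```stata
    * Hypothetical sketch for the ordered variable; xb_srh and n_srh are illustrative names
    oprobit Self_reported_health High_blood_pressure Diabetes Cancer Lung_problems ///
        Heart_problems Stroke Psychological_problems Arthritis obese

    * Linear index: larger values mean a higher (worse) predicted category, since 5 = poor
    predict double xb_srh, xb

    * Reversed min-max scaling: the lowest index maps to 1, the highest to 0
    summarize xb_srh, meanonly
    generate double n_srh = (r(max) - xb_srh) / (r(max) - r(min))
    ```

    The reversed formula equals one minus the usual (x - min) / (max - min) value, so the same trick applies to any prediction where a higher number means worse health.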


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(HHIDPN Self_reported_health High_blood_pressure Diabetes Cancer Lung_problems Heart_problems Stroke Psychological_problems Arthritis) float obese
        3010 4 0 0 0 0 1 0 0 0 0
        3010 4 0 0 0 0 0 0 0 0 0
        3010 3 0 0 0 0 1 0 0 0 0
        3010 3 0 0 0 0 1 0 0 0 0
        3010 4 0 0 0 0 1 0 0 0 0
        3010 3 0 0 0 0 1 0 0 0 0
        3010 3 0 0 0 0 1 0 0 0 0
        3010 3 0 0 0 0 1 0 0 0 0
        3010 3 0 0 0 0 1 0 0 0 0
        3010 3 0 0 0 0 1 0 0 0 0
    10013010 1 0 0 0 0 0 0 0 0 0
    10013010 3 1 0 0 0 1 0 0 0 0
    10013010 3 1 0 0 0 1 0 1 0 0
    10013010 5 1 0 0 0 1 0 1 0 1
    10013010 4 1 0 0 0 1 0 1 0 1
    10013010 4 1 1 0 0 1 0 1 0 0
    10013010 3 1 1 0 0 1 0 1 0 1
    10013010 4 1 1 0 0 1 0 1 0 1
    10013010 4 1 1 0 0 1 0 1 0 1
    10038010 1 1 0 0 0 0 0 0 0 0
    10038010 2 1 0 0 0 0 0 0 0 0
    10038010 2 1 0 0 0 0 0 0 1 0
    10038010 2 1 0 0 0 0 0 0 1 0
    10038010 2 1 0 0 0 0 0 0 1 0
    10038010 2 1 0 0 0 0 0 0 1 0
    10038010 2 1 0 0 0 1 0 0 1 0
    10038010 2 1 0 0 0 1 0 0 1 0
    10038010 2 1 0 0 0 1 0 0 1 0
    10038010 2 1 0 0 0 1 0 0 1 0
    10063010 4 1 1 0 0 0 0 1 1 0
    10063010 5 1 1 0 0 0 0 1 1 0
    10063010 4 1 1 0 0 0 0 0 0 0
    10063010 4 1 1 0 0 0 0 0 0 0
    10063010 4 1 1 0 0 0 0 0 0 0
    10097010 3 1 0 0 0 0 0 0 0 0
    10097010 3 1 0 0 0 0 0 0 0 0
    10114010 4 1 0 0 0 0 0 0 1 0
    10114010 2 1 0 0 0 0 0 0 1 0
    10114010 3 1 0 0 0 0 0 0 0 0
    10210010 5 1 1 0 0 0 0 0 0 0
    10210010 5 1 1 0 0 0 1 0 0 0
    10210010 5 1 1 0 0 0 1 0 1 0
    10210010 5 1 1 0 0 0 1 1 1 0
    10210010 5 1 1 0 0 0 1 1 1 0
    10210010 5 1 1 0 0 1 1 1 1 0
    10225010 2 0 0 0 0 0 0 0 0 1
    10225010 5 1 0 0 0 1 0 0 0 1
    10225010 4 1 0 0 0 1 0 0 0 1
    10225010 3 1 0 0 0 1 0 1 0 1
    10225010 3 1 0 0 0 1 0 0 1 1
    end
    I hope you can help me.

    Kind regards

    Mads


  • #2
    Originally posted by Mads Funder Berg

    I would also like to know how else this statistical information can help in nursing. At https://papersowl.com/examples/nursing/ I read a lot of information about nursing and how the latest technologies, in particular in statistics, affect the development of this area.
    I am also interested in solving the problem. Thank you for your inquiry.
