  • Dummy variable with missing values

    Hi Statalist,

    As predicted by Carlo Lazzaro in a previous post, I have some difficulties in dealing with missing values.

    For example, the variable money_for_medicine records the answer to the question of whether the person has money to buy medicine. The answers are 1 = yes (8000 observations), 2 = no (70000 observations), and missing (.) for 1/3 of the observations.

    I wish to create a dependent variable from it, to see what determines whether people answer 'no, there is not enough money'. If I were to use it as an independent variable, I have seen that one could use factor variables.

    The problem is that I don't want to drop the missing values from my data set. How can I make it a dummy so that it will only compare the answers yes and no?


    Many thanks,

    Rogier
    Last edited by Rogier Jansen; 25 Jul 2018, 06:26.

  • #2
    Hi Rogier,
    If you want to use money_for_medicine as your dependent variable and do not want to drop your missing values, you can recode the missing values to another value (say, 3) and use a multinomial logit to compare the effects of your independent variables on the likelihood of observing an outcome of "yes", "no", or "missing". You might have some hypotheses about why those values are missing (maybe older people are less likely to want to answer the question). Or you might find that the predictors of the missing-value cases are more similar to those of people who say "no" than to those of people who say "yes", because those who do not answer don't have the money and are embarrassed to say so.

    In any case, a simple recode command will allow you to do this:
    Code:
    recode money_for_medicine (missing=3), gen(new_money_med)
    Then you can run the mlogit command to analyze your data.
    Code:
    mlogit new_money_med iv1 iv2...
    The -baseoutcome()- option in mlogit will allow you to specify the comparison group of the dependent variable.
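
    As a minimal sketch of that option, assuming the recoded variable new_money_med from above and treating iv1 and iv2 as placeholders for your own predictors (the label name med_lbl is also just an arbitrary choice), the following makes "yes" (coded 1) the comparison group:
    Code:
    * label the three outcomes so the output is easier to read
    label define med_lbl 1 "yes" 2 "no" 3 "missing"
    label values new_money_med med_lbl
    * multinomial logit with "yes" (coded 1) as the base outcome
    mlogit new_money_med iv1 iv2, baseoutcome(1)
    The coefficients for "no" and "missing" are then read relative to "yes".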


    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1



    • #3
      I interpret Rogier's question a little more simply. I think he wants to omit observations with missing values for money_for_medicine from the model, but not by using drop to eliminate them from the dataset.

      Stata will do that automatically. Any observation with a missing value for any variable in the model will be ignored.
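
      If you want to see this for yourself, here is a rough sketch. The names iv1 and iv2 are placeholders for your own predictors, in_model is an arbitrary name, and the regress call is only there to produce an estimation sample, not a suggested model:
      Code:
      * how many observations are missing on the outcome?
      count if missing(money_for_medicine)
      * fit any model, then flag the observations that were actually used
      regress money_for_medicine iv1 iv2
      generate byte in_model = e(sample)
      tabulate in_model
      Observations with a missing value on the outcome or on any predictor get in_model = 0, yet they all remain in the dataset.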

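      But you do have a problem: for a logistic regression, you need a dependent variable that takes the value 0 for the "negative outcome" (in your case "yes", which is coded as 1 rather than 0). Stata's logit treats every nonzero, nonmissing value as a positive outcome, so with the 1/2 coding both answers would count as positive.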

      I would create a new variable as Carole did, but define it a little differently.
      Code:
      recode money_for_medicine (1=0) (2=1) (.=.), generate(new_money_med)
      Then if new_money_med is the dependent variable, observations with a missing value will be omitted, and the "positive outcome" will be "no money for medication".
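
      A minimal sketch of the next steps, assuming the recode above has been run and using iv1 and iv2 as placeholders for your own predictors:
      Code:
      * check the recode: 1 (yes) -> 0, 2 (no) -> 1, missing stays missing
      tabulate money_for_medicine new_money_med, missing
      * logistic regression; observations with missing values are left out of the estimation only
      logit new_money_med iv1 iv2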

      With that said, I think Carole's approach is a better statistical analysis than ignoring 1/3 of your observations.



      • #4
        Rogier:
        I would first investigate the mechanism underlying the missingness of 1/3 of your cases for that variable.
        Probably the persons with missing values are those who do not need drugs (if included in the sample) or those who cannot afford them.
        Hence, I would try to track down, with the support of the observed values for other covariates, whether your data are missing at random or not and deal with them accordingly.
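        One possible first step, sketched under the assumption that iv1 and iv2 stand in for covariates you actually observe (miss_med is an arbitrary name), is to model the missingness itself:
        Code:
        * 1 if the answer is missing, 0 otherwise
        generate byte miss_med = missing(money_for_medicine)
        * do the observed covariates predict who has a missing answer?
        logit miss_med iv1 iv2
        Clear associations would argue against missingness that is completely at random, although a check like this cannot by itself rule out missingness that depends on the unobserved answer.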
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Thanks for your help!

          I did indeed use the recode command in that manner.

          mlogit is a really useful command.

          And thanks for the added insight for my analysis.

          Most of the missing values are because the household head was not interviewed in the individual questionnaire - which is where most of the socio-demographic variables come from.
          Last edited by Rogier Jansen; 27 Jul 2018, 05:29.

