  • Creating Dummy for Imputed Values

    Dear Statalisters,

    my control variables (CV) contain a large number of missing values. For this reason, I will create a table reporting three columns: coefficients without CV, with CV (and a low number of observations), and with imputed values for the CV.
    For the last column, I will impute the school-level mean for these controls and include a dummy variable for imputation in the model.

    I have already imputed the values with the command below. Note that IDescola is the school ID.

    Code:
    foreach var of varlist $controlvar2 {
          * school-level mean of the observed values
          egen mean = mean(`var'), by(IDescola)
          * keep the observed value; fill in the school mean where it is missing
          gen `var'imp = cond(missing(`var'), mean, `var')
          drop mean
    }
    Now I would like to create a dummy indicating whether the observation (row) contains any imputed value for the CV. In other words: my command should look at all the variables in $controlvar2 and report 1 if it finds any imputed value in that observation/row.

    Does anyone have any idea how I can create this dummy?
    Any advice would be highly appreciated!
    Thanks in advance.

  • #2
    Hi Tharcisio, you could try

    Code:
    egen nmiss = rowmiss($controlvar2)
    gen anymiss = nmiss > 0
    Best wishes

    (Stata 16.1 MP)
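
    A minimal sketch of the two steps on toy data, to show what the flag picks up. The names school, x1, x2 and anyimp are made up for illustration; only the egen functions mean() and rowmiss() come from the posts above. Note that rowmiss() has to be run on the original variables, not on the *imp copies, because the copies no longer carry the missing values.

    ```stata
    clear
    input school x1 x2
    1 10 .
    1 20 4
    1 .  6
    2 5  8
    2 7  2
    end

    * hypothetical stand-in for the poster's global
    global controlvar2 x1 x2

    * mean imputation within school, as in #1 (tempvar avoids a name clash)
    foreach var of varlist $controlvar2 {
        tempvar m
        egen `m' = mean(`var'), by(school)
        gen `var'imp = cond(missing(`var'), `m', `var')
        drop `m'
    }

    * count missing controls per row on the ORIGINAL variables, then flag
    egen nmiss = rowmiss($controlvar2)
    gen byte anyimp = nmiss > 0

    list, sepby(school)
    ```

    The flag can then go into the third-column model alongside the *imp variables. If a school has no observed value at all for a control, the school mean is itself missing, so that row stays missing in the *imp variable even though it is flagged.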


    • #3
      Tharcisio:
      as an aside to Felix's helpful code, do you think that imputing the mean of the observed values of the CV is the most reliable strategy to deal with missing data?
      Have you performed a diagnostic check on the underlying missing mechanism?
      Is your missingness ignorable or not?
      Kind regards,
      Carlo
      (StataNow 18.5)


      • #4
        Hello Everybody,

        @Felix: Thanks! Simple and effective code.

        @Carlo: Thanks for your feedback.
        1. Do you think that imputing the mean of the observed values of the CV is the most reliable strategy to deal with missing data?
        No, I would prefer to use multiple imputation (MI). But the reviewer of my paper asked me for the mean. For this reason, I am using mean imputation.
        2. Have you performed a diagnostic check on the underlying missing mechanism?
        Not yet, but I will do that. I will include a table in the appendix comparing the final sample (with CV and missing) with the whole sample (without CV and missing). Is that what you mean by "diagnostic check"?

        Thanks.


        • #5
          Tharcisio:
          1) just to save (as per a famous Italian saying) the goat (reviewers' recommendation that should obviously be satisfied) and the cabbages (the methodological standing of your research), can't you present both mean imputation (that, as we know, underestimates the variance) and a (simple) -mi- (provided that your data are MAR)?
          2) no, I meant diagnosing whether your data are missing completely at random (MCAR), missing at random (MAR) or missing not at random (MNAR).
          Kind regards,
          Carlo
          (StataNow 18.5)
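
          A rough sketch of presenting both approaches side by side, using the auto dataset shipped with Stata purely as a stand-in (rep78 has a few missing values). The choice of variables, the grouping by foreign, the 20 imputations and the seed are all arbitrary assumptions for illustration, not a recommendation for the poster's data.

          ```stata
          * (a) mean imputation plus a missingness dummy
          sysuse auto, clear
          egen rep78_mean = mean(rep78), by(foreign)
          gen rep78imp = cond(missing(rep78), rep78_mean, rep78)
          gen byte rep78_flag = missing(rep78)
          regress price mpg rep78imp rep78_flag

          * (b) a simple -mi- on the same model (assumes MAR)
          sysuse auto, clear
          mi set wide
          mi register imputed rep78
          * the outcome (price) is included among the predictors of the imputation model
          mi impute regress rep78 mpg price foreign, add(20) rseed(12345)
          mi estimate: regress price mpg rep78
          ```

          The standard errors from (b) combine within- and between-imputation variability, which is what single mean imputation leaves out.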


          • #6
            Carlo Lazzaro ,
            Yes, I will check this assumption later using -mcartest-. Would you recommend this command as well?


            • #7
              Tharcisio:
              yes, I use it from time to time.
              Kind regards,
              Carlo
              (StataNow 18.5)
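
              For reference, a hedged sketch of what a call might look like. -mcartest- is a community-contributed command, not part of official Stata, so it has to be installed first (-search mcartest- should locate it). The plain "mcartest varlist" call below reflects my understanding of its basic syntax and should be checked against -help mcartest- after installing. The auto dataset and the variable list are stand-ins only.

              ```stata
              sysuse auto, clear

              * run the test only if the command is installed
              capture which mcartest
              if _rc {
                  display as text "mcartest is not installed; type -search mcartest- to locate it"
              }
              else {
                  * assumed basic syntax: variables whose joint missingness pattern is tested
                  mcartest rep78 mpg weight
              }
              ```

              A test of this kind can only speak against MCAR; it cannot tell MAR and MNAR apart, which remains a judgment about how the data were collected.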
