Testing whether attrition is informative

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#16

19 Apr 2017, 03:58

Rose:
Q1: well, your missing data may (if MNAR) or may not (if MAR) be informative.
Q2: -mi- is usually recommended for MAR; I agree about the bias of listwise deletion (but see: http://statisticalhorizons.com/listw...n-its-not-evil) and of dummy variable adjustment.

Kind regards,
Carlo
(Stata 19.0)
Comment
Maarten Buis

Join Date: Mar 2014

Posts: 3426
#17

19 Apr 2017, 04:07

Bias only occurs if the probability of having a missing value is dependent on your dependent/explained/left-hand-side/y-variable. Otherwise list-wise deletion will lead to unbiased (but inefficient) estimates.

Dummy variable adjustment is only applicable in a very special situation, where the values are not really missing but simply don't exist.
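
The point about listwise deletion can be illustrated with a small simulation. This is only a sketch under assumed values: the sample size, the coefficients, and the logistic missingness mechanisms are made up for illustration and are not taken from the data discussed in this thread.

```stata
* Simulation sketch: listwise deletion under two missingness mechanisms.
* All names and parameter values are illustrative assumptions.
clear
set seed 12345
set obs 20000
generate x = rnormal()
generate y = 1 + 2*x + rnormal()    // true slope is 2

* (a) the probability that y is missing depends on x only
generate y_a = y if runiform() > invlogit(x)

* (b) the probability that y is missing depends on y itself
generate y_b = y if runiform() > invlogit(y)

regress y x      // benchmark: complete data
regress y_a x    // complete cases under (a)
regress y_b x    // complete cases under (b)
```

Under these assumptions one would expect the slope from (a) to stay close to the benchmark with a larger standard error, and the slope from (b) to drift away from 2.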

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Comment
Rose Simmons

Join Date: Feb 2017

Posts: 114
#18

19 Apr 2017, 06:57

Carlo Lazzaro, thanks for the link on listwise deletion; it was an interesting read.
Question 3: As it is not possible to test whether my data is MAR or NMAR, I don't see how I could use -mi- (which is for MAR)?
Question 4: If I do not use any methods to deal with missing values, and just leave them as "." in my dataset, is this acceptable?

Last edited by Rose Simmons; 19 Apr 2017, 07:00.
Comment
Rose Simmons

Join Date: Feb 2017

Posts: 114
#19

19 Apr 2017, 07:02

Originally posted by Maarten Buis

Bias only occurs if the probability of having a missing value is dependent on your dependent/explained/left-hand-side/y-variable. Otherwise list-wise deletion will lead to unbiased (but inefficient) estimates.

Dummy variable adjustment is only applicable in a very special situation, where the values are not really missing but simply don't exist.

Maarten Buis,
With -mcartest-, I have established that my data are not MCAR, so they are either:
- MAR (missingness in y depends on x and not on y itself), or
- NMAR (the probability of a value being missing depends on that variable's own value).
So does this mean that list-wise deletion will only lead to bias if the data are NMAR, that is, if missing y-values depend on the y-variable?

Also, what is meant by the missing values "just don't exist"?
Would an example of this be if a non-saver is asked how much they saved and the value there simply doesn't exist as they do not save?

Thanks
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#20

19 Apr 2017, 07:09

Rose:
Q3: your statement, which I share, works in the opposite direction as well: as you cannot be sure that your data are not MAR, why not use -mi- and do some sensitivity analysis, that is, fit different imputation models and present their results? By the way, sensitivity analysis is especially relevant when you suspect that your data are NMAR.
Q4: you declare you have an unbalanced panel; nothing sinister, it happens.
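
A minimal sketch of what running -mi- under two different imputation models might look like. Only saving is a variable named in this thread; income, age, and educ are hypothetical, assumed to be fully observed, and the panel structure is ignored for brevity.

```stata
* Sketch only: income, age, and educ are hypothetical, fully observed
* variables; the panel structure of the data is ignored here.

* Imputation model A: linear regression
preserve
mi set mlong
mi register imputed saving
mi impute regress saving income age educ, add(20) rseed(2017)
mi estimate: regress saving income age
restore

* Imputation model B: predictive mean matching, same predictors
preserve
mi set mlong
mi register imputed saving
mi impute pmm saving income age educ, add(20) knn(5) rseed(2017)
mi estimate: regress saving income age
restore
```

Comparing the two sets of estimates shows how much the results respond to the choice of imputation model. Neither variant by itself addresses data that are NMAR.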

Kind regards,
Carlo
(Stata 19.0)
Comment
Rose Simmons

Join Date: Feb 2017

Posts: 114
#21

19 Apr 2017, 07:26

Thank you, Carlo Lazzaro. I will do some reading up on using -mi- and sensitivity analysis. If, ultimately, I do not use these methods, then I will declare that I have an unbalanced panel.

To summarise what we have discussed, would you agree with the following:
I have missing values in my dataset that are not MCAR (as shown by the mcartest).
Missing values cause bias in results when the data is actually missing (e.g. due to non-response), particularly if they lead to under-representation of groups in the sample.
To correct for this, -mi- or sensitivity analysis may be used, and will reduce bias caused by missingness.
However, if these methods are not used, then the panel is simply unbalanced (with bias caused by missing values) and analysis can still go ahead as Stata can handle unbalanced panels.

Many thanks
Comment
Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#22

19 Apr 2017, 07:37

I disagree with Maarten Buis (#17); the missing-indicator method can also be applicable in randomized trials; see Groenwold, RHH, et al. (2012), "Missing covariate data in clinical research: when and when not to use the missing-indicator method for analysis," Canadian Medical Association Journal, 184(11): 1265-1269. I do agree that it is not appropriate here.
1 like
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#23

19 Apr 2017, 07:46

Rose:
I agree with your wrap-up.
I would only turn the sentence

...-mi- or sensitivity analysis...

to

...-mi- and sensitivity analysis...

Kind regards,
Carlo
(Stata 19.0)
1 like
Comment
Maarten Buis

Join Date: Mar 2014

Posts: 3426
#24

19 Apr 2017, 08:14

Originally posted by Rose Simmons

Missing values cause bias in results when the data is actually missing (e.g. due to non-response), particularly if they lead to under-representation of groups in the sample.

Not necessarily, the bias will only occur when the probability of missingness depends on the dependent variable.

Originally posted by Rose Simmons

To correct for this, -mi- or sensitivity analysis may be used, and will reduce bias caused by missingness.

-mi- may reduce bias but, like any other modeling step, can easily increase bias if you make an error. Sensitivity analysis won't reduce or increase bias, it will just let you know how much your estimates respond to different choices made by you.

1 like
Comment
Rose Simmons

Join Date: Feb 2017

Posts: 114
#25

19 Apr 2017, 08:31

Thank you all very much for your inputs

Originally posted by Maarten Buis

Not necessarily, the bias will only occur when the probability of missingness depends on the dependent variable.

Just to confirm, is it possible to test whether this bias has occurred? Or is this not possible because the MAR vs NMAR test is not statistically possible?
Comment
Maarten Buis

Join Date: Mar 2014

Posts: 3426
#26

19 Apr 2017, 08:38

If your y-variable is fully observed, yes, otherwise no.
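
One way such a check might look when the y-variable is fully observed: model an indicator for missingness in a covariate as a function of y and the other covariates. This is a sketch, not a definitive procedure. The variables income and age are hypothetical, and saving is treated here as completely observed.

```stata
* Sketch only: income and age are hypothetical; saving is assumed complete.
generate byte miss_income = missing(income)

* Does the probability that income is missing depend on the y-variable?
logit miss_income saving age
test saving
```

A clearly non-zero coefficient on saving would suggest that missingness depends on the dependent variable, which is the situation in which listwise deletion is expected to be biased.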

Comment

Rose Simmons

Join Date: Feb 2017
Posts: 114

#27

19 Apr 2017, 08:43

Thank you, Maarten Buis. Unfortunately I have missing values for my y-variable, so it will not be possible to test this.

Code:

. mdesc saving

    Variable    |     Missing          Total     Percent Missing
----------------+-----------------------------------------------
         saving |         266         13,217           2.01
----------------+-----------------------------------------------
