In the survey data I am using for my regression, most of the variables include one or more of the following responses: Not applicable, Don't know, Refusal. These are coded with numeric values that fall outside the substantive response scale, so they look like outliers.
For example, one of my variables, self-reported job satisfaction, is coded 1=very satisfied; 2=satisfied; 3=not very satisfied; 4=not at all satisfied; 8=DK/no opinion (spontaneous); 9=Refusal (spontaneous).
My question is: how should I deal with such observations?
Should I leave them as they are, so that they are included in the regression? Should I drop them from my dataset? Or does it depend on how large a proportion of the observations for that variable such responses make up?
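
To make the options concrete, here is a minimal sketch in plain Python of what each choice would do to the job satisfaction variable. The toy responses and the names `responses`, `NONSUBSTANTIVE`, and `valid` are hypothetical and not taken from my actual dataset. It is only meant to illustrate the question, not to suggest which option is correct.

```python
from statistics import mean

# Hypothetical toy responses using the coding described above:
# 1-4 are substantive answers, 8 = DK/no opinion, 9 = Refusal.
responses = [1, 2, 2, 3, 4, 8, 9, 1, 2, 8]
NONSUBSTANTIVE = {8, 9}

# Share of this variable's observations that are DK/Refusal.
share_nonsubstantive = sum(r in NONSUBSTANTIVE for r in responses) / len(responses)

# Option 1: leave the codes as they are.
# 8 and 9 are then treated as if they were points on the satisfaction scale.
mean_as_is = mean(responses)

# Option 2: recode DK/Refusal to missing (None) and drop those observations.
recoded = [None if r in NONSUBSTANTIVE else r for r in responses]
valid = [r for r in recoded if r is not None]
mean_valid = mean(valid)

print(f"share DK/Refusal: {share_nonsubstantive:.2f}")
print(f"mean with codes left in: {mean_as_is:.2f}")
print(f"mean after dropping: {mean_valid:.2f} (n = {len(valid)})")
```

In this toy example the two options give quite different summaries of the same variable, which is why I am unsure how to proceed.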