The number of observations used in a regression analysis depends on which of the included variables have missing values. To maximize the number of non-missing observations, one approach is to rerun the analysis adding or dropping one variable at a time to see which variables cause a substantial drop in the sample size. However, the combination of variables also matters, because the estimation sample contains only the observations that are non-missing on every included variable. Can a subset of variables be identified a priori from a list of candidate predictors that would retain a relatively high number or percentage of non-missing observations when included in a regression analysis? The target number or percentage could be specified in advance.
Or, more generally, is a command or procedure available that lists the number of non-missing observations for each combination of variables from a list?
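To make the request concrete, here is a minimal sketch in Python (standard library only) of the kind of listing I have in mind. It is only an illustration under stated assumptions: the data are a list of dictionaries, missing values are coded as `None`, and the function names, variable names, and toy data are all hypothetical, not taken from any existing command.

```python
from itertools import combinations


def complete_case_counts(rows, variables, min_size=1):
    """Count rows that are non-missing on every variable, for each subset.

    Assumption: `rows` is a list of dicts and a missing value is None.
    Returns a dict mapping each subset (a tuple of names) to its count.
    """
    # For each variable, record the indices of rows where it is observed.
    observed = {
        v: {i for i, row in enumerate(rows) if row.get(v) is not None}
        for v in variables
    }
    counts = {}
    for k in range(min_size, len(variables) + 1):
        for subset in combinations(variables, k):
            # Rows usable in a regression are those observed on all variables.
            complete = set.intersection(*(observed[v] for v in subset))
            counts[subset] = len(complete)
    return counts


def subsets_meeting_target(counts, n_rows, min_share):
    """Keep subsets whose complete-case share is at least `min_share`.

    Results are sorted with larger subsets first, then by higher count.
    """
    keep = [(s, n) for s, n in counts.items() if n / n_rows >= min_share]
    return sorted(keep, key=lambda item: (-len(item[0]), -item[1]))


# Hypothetical toy data: five observations, three candidate predictors.
rows = [
    {"age": 34, "income": 52000, "educ": 16},
    {"age": 41, "income": None, "educ": 12},
    {"age": None, "income": 38000, "educ": 14},
    {"age": 29, "income": 45000, "educ": None},
    {"age": 55, "income": 61000, "educ": 18},
]
variables = ["age", "income", "educ"]

counts = complete_case_counts(rows, variables)
for subset, n in subsets_meeting_target(counts, len(rows), min_share=0.5):
    print(", ".join(subset), n)
```

Two caveats with this sketch: the outcome variable would need to be in the list too, since it also restricts the sample, and the number of subsets doubles with each added variable, so a full listing is only practical for a modest number of candidates.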