  • How to deal with missing data?

    I've got annual country-level panel data for the years 1980-2019, covering 90 countries.

    It effectively looks like this,
    Code:
    CTRY1 Y1 β1 β2
    CTRY1 Y2 β1 β2
    CTRY1 Y3 β1 β2
    CTRY2 Y1 β1 β2
    CTRY2 Y2 β1 β2
    CTRY2 Y3 β1 β2
    The issue is that for many countries data are available only from some later year onwards, say 2000. For example, Afghanistan only has data from 2008 onwards, whilst Angola has data from 1992 onwards.
    Having run "misstable summarize [explanatory var]", I found that 15% of the data is missing.

    Is it wise to just drop the years for the specific countries where there is no data (i.e., drop Afghanistan pre-2008 and leave Afghanistan for 2008+), or would this create further bias?
    What is the suggested way to deal with this issue?
    Thanks in advance!

  • #2
    Max:
    welcome to this forum.
    Stata's estimation commands apply listwise deletion: any observation with a missing value in any of the variables in the model is dropped.
    Otherwise, you can consider multiple imputation (-mi-) if MAR (missing at random) is a plausible mechanism for your unobserved values.
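    A minimal sketch of what the -mi- route might look like. The variable names (country, year, y, x1, x2) are hypothetical and do not come from the original post, and the sketch assumes x1 is the only variable with missing values. The imputation model is purely illustrative; whether MAR is plausible when whole early periods are absent for a country is a judgment the code cannot make.

    ```stata
    * Hypothetical names: country (string), year, y (outcome), x1 x2 (regressors)
    * Assumption: only x1 has missing values; y, x2 and year are complete
    encode country, generate(ctry_id)

    * Declare the mi data structure and the panel structure
    mi set mlong
    mi xtset ctry_id year
    mi register imputed x1
    mi register regular y x2

    * Impute x1 by linear regression on the complete variables,
    * a linear year trend and country dummies (20 imputations)
    mi impute regress x1 y x2 year i.ctry_id, add(20) rseed(12345)

    * Fit the fixed-effects model on each imputed dataset and combine the results
    mi estimate: xtreg y x1 x2, fe
    ```

    The number of imputations and the seed are arbitrary choices here; a different imputation method (for example -mi impute chained- when several variables have missing values) may suit the data better.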
    More substantively, I would double-check whether panels observed over such different periods can sensibly be pooled in the same dataset without producing methodologically unexpected findings.
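    One possible way to carry out that check, sketched with hypothetical variable names (country, year, y, x1, x2) that are not taken from the original post. It shows which country-years survive listwise deletion, when each country enters the estimation sample, and how the estimates compare with a common time window; the year 2000 cutoff is an arbitrary assumption.

    ```stata
    * Hypothetical names: country (string), year, y (outcome), x1 x2 (regressors)
    encode country, generate(ctry_id)
    xtset ctry_id year

    * Participation patterns, counting only country-years with complete data
    xtdescribe if !missing(y, x1, x2), patterns(20)

    * Fit on all available observations; listwise deletion drops incomplete rows
    xtreg y x1 x2, fe
    estimates store all_obs
    generate byte used = e(sample)

    * First year in which each country enters the estimation sample
    bysort ctry_id: egen first_year = min(cond(used, year, .))
    egen byte one_per_ctry = tag(ctry_id)
    tabulate first_year if one_per_ctry

    * Sensitivity check: refit on a common window (2000 is an arbitrary cutoff)
    xtreg y x1 x2 if year >= 2000, fe
    estimates store from_2000

    * Compare coefficients and standard errors side by side
    estimates table all_obs from_2000, b se
    ```

    If the two sets of estimates differ markedly, that would suggest the late-entering countries or the early years are driving the results, which is worth investigating before settling on either sample.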
    Kind regards,
    Carlo
    (StataNow 18.5)
