  • Stata commands to evaluate interpolated variables

    Hi all. I have quarterly time series data (from 1999 to 2023) on two variables: the percentage of members of a Brazilian political party and the percentage of non-party members in the population. Because the original series for both variables had missing values in some quarters, I used interpolation to fill in the missing observations, using Nick Cox's "mipolate" command. The methods I used were linear, cubic, natural cubic spline and piecewise cubic Hermite. Now I want to compare the four interpolated versions of each variable to see which is most appropriate for time series analysis. Are there established methods for comparing interpolated series? Are there Stata commands for doing so? Any suggestion would be much appreciated. Bruno H.

  • #2
    I think there's something called "leave-one-out cross-validation". You set known values to missing, one at a time, interpolate, and then check which method produces the best estimate of each known value (judged by RMSE or some such).
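The leave-one-out idea can be sketched as follows. This is a language-neutral illustration in Python, not Stata code: the function names (`linear_interp`, `nearest_interp`, `loo_rmse`) are hypothetical, the data are made-up toy numbers, and only two simple methods are compared. In Stata the same loop would presumably wrap whichever interpolation command is in use.

```python
def linear_interp(xs, ys, x):
    """Linearly interpolate at x from known points (xs sorted ascending)."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the range of the known points")


def nearest_interp(xs, ys, x):
    """Use the value of the closest known point (ties go to the earlier one)."""
    best = min(range(len(xs)), key=lambda i: (abs(xs[i] - x), xs[i]))
    return ys[best]


def loo_rmse(xs, ys, method):
    """Hide each interior known point in turn, predict it, and return the RMSE."""
    sq_errors = []
    # Endpoints are skipped: predicting them would be extrapolation.
    for i in range(1, len(xs) - 1):
        xs_train = xs[:i] + xs[i + 1:]
        ys_train = ys[:i] + ys[i + 1:]
        pred = method(xs_train, ys_train, xs[i])
        sq_errors.append((pred - ys[i]) ** 2)
    return (sum(sq_errors) / len(sq_errors)) ** 0.5


# Made-up toy series: quarter index and a percentage, observed with gaps.
quarters = [0, 1, 3, 4, 7, 8, 10]
values = [40.0, 42.5, 45.0, 48.5, 53.0, 56.5, 59.0]

scores = {
    "linear": loo_rmse(quarters, values, linear_interp),
    "nearest": loo_rmse(quarters, values, nearest_interp),
}
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: leave-one-out RMSE = {score:.3f}")
```

The method with the smallest leave-one-out RMSE reproduces the known values best. That says nothing directly about how well it fills the gaps that are genuinely missing, especially if those gaps are longer than the ones created by hiding a single point.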

    • #3
      Interpolation is an entertaining programming problem, but its statistical uses are to me more problematic. You can't easily add back information that has been subtracted by loss of data, and it's worse if data are not available for specific economic or other reasons.

      Interpolated data are usually smoother than the original data would have been, but that's not guaranteed either.

      I suppose one test of any method is to delete "known" values arbitrarily and see which interpolation method comes closest in predicting them, and there again you need to decide on a criterion of closeness.
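The choice of closeness criterion matters because different criteria can rank the same methods differently. A minimal sketch in Python, assuming two hypothetical methods whose held-out prediction errors are the made-up numbers below:

```python
def rmse(errors):
    """Root mean squared error: penalises large misses heavily."""
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


def mae(errors):
    """Mean absolute error: every unit of error counts equally."""
    return sum(abs(e) for e in errors) / len(errors)


def max_abs(errors):
    """Worst-case absolute error."""
    return max(abs(e) for e in errors)


# Made-up held-out errors (prediction minus known value) for two
# hypothetical interpolation methods.
errors_a = [1.0, -1.0, 1.0, -1.0]   # consistently a little off
errors_b = [0.0, 0.0, 0.0, 2.4]     # usually exact, one large miss

for label, errs in [("A", errors_a), ("B", errors_b)]:
    print(f"method {label}: RMSE={rmse(errs):.2f} "
          f"MAE={mae(errs):.2f} max={max_abs(errs):.2f}")
```

With these numbers, mean absolute error favours method B while RMSE and the worst-case error favour method A, so the criterion should be chosen according to whether occasional large misses are more damaging than many small ones.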

      • #4
        Thank you George and Nick for the suggestions. I will follow them.

        The polling firm that the data come from (known as "Datafolha") conducts polls infrequently in any given year. In addition, it conducts more polls in general election years (i.e., 2002, 2006, 2010, 2014, 2018, 2022).

        For the original variable measuring non-partisanship in Brazil, out of a total of 98 time points, 63 are measured observations and 35 are missing.
