  • #16
    That's now really quite a different topic. You need to attract more people than are intrigued by interpolation.

    I'd start a new thread on fixed and random effects. I have never run a Hausman test in my life, so will pass on that. Expanding the question to include the gist of this thread would be a good idea.


    • #17
      Thanks for the input, I just did that! Cheers


      • #18
        Back to the interpolation: is it then safe to say that interpolation on a logarithmic scale followed by back-transformation is an advisable method given the observed process (deforestation)?
        I compared it with linear ipolate and there are some differences. Is the advantage a more realistic fit than a straight line?


        • #19
          Hi Oded and Nick!
          I have quinquennial data (one observation every 5 years) on the GP index for 88 countries over 1970-2015. It is a panel dataset. Some countries have zero values of the index for some years, while others have missing data in some years (usually 1970-1990). I want to interpolate between two observed years, say 1970 and 1975, to get values for 1971, 1972, 1973 and 1974, and similarly for later years.
          I have tried the following interpolation methods:
          (1)
          bysort country: ipolate GP Year, generate(GP1) epolate

          This code works fine but it gives negative values for some country-year observations. (The index in my case cannot be negative; it lies in the range [0,5].) I understand that linear interpolation with the epolate option may sometimes generate negative values: when the trend in the data is towards zero, the extrapolation extends that trend beyond zero into the negative range. I am stuck on how to address this issue.
          I also tried nearest-neighbour interpolation for the countries that initially gave negative values and linear interpolation for the others, and then appended the two files. But I am not sure whether that is the right approach.
          (2) I also tried using mipolate as suggested by Nick above:

          gen logGP = log(GP)
          ssc inst mipolate
          mipolate logGP Year, by(country) gen(loglin_GP)


          However, this does not do any interpolation. It only changes the scale of the variable GP to ln(GP).

          (3) I also tried other interpolation commands:

          bysort ISO_Code: ipolate GP Year, generate(GP1) epolate

          bysort ISO_Code: mipolate GP Year, generate(GP1) spline epolate

          ssc inst csipolate
          bysort ISO_Code: csipolate GP Year, generate(GP1)

          ssc inst pchipolate
          bysort ISO_Code: pchipolate GP Year, generate(GP1)


          However, all of these gave me negative values for the index.

          Please advise how I should proceed with the interpolation given the kind of data I have.
          I shall be very thankful.


          • #20
            I have also done the back-transformation after interpolation; see method (2):

            replace loglin_GP = exp(loglin_GP)

            But it was not useful in my case, as it produces the same values as the original GP without actually interpolating anything.


            • #21
              Ridwan Sheikh

              It is hard to follow your question -- partly because there is no data example.

              (2) in #10 is likely to be confusing. With your code mipolate does not "change the scale". It works with the data you give it, which is on log scale. Whether that makes any difference to the data depends on whether there are gaps.

              (3) just repeats (1) insofar as the epolate option of mipolate is the same code as that option in ipolate. I am not surprised that cubic splines can produce negative values with your data, but the so-called pchip method should not produce values outside the range of the data.

              If the data are from [0,5] then interpolating on a logarithmic scale is unlikely to be a serious method, and not only because zeros are present. The main reason for working on a logarithmic scale is that, to a good first approximation, the mode of change is exponential, whether growth or decline. I don't know what the GP index is, but if it is defined as bounded then logarithms are unlikely to be useful -- unless the variable is very skewed. If it is, then interpolation using cube roots or log(GP + 1) might help.

              Despite being the author of mipolate I don't endorse all the uses of this command, or of ipolate. Interpolation works best when you are filling in a small gap in a well-behaved series. When used to bulk out seriously gappy or problematic data it necessarily can't add new information and there is a danger that modelling on such data is misleading as the modelled data are smoother than they deserve to be and the number of degrees of freedom that commands use is far more than is justified. That can have consequences all the way to all inferences, including measures of misfit and P-values.

              In this case your dataset is 4/5 gaps, so those problems are serious. I guess the need arises because elsewhere you have annual data.
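
              As a minimal sketch of the log(GP + 1) and cube-root suggestions, assuming a made-up toy series (country A and its values are hypothetical) and using official ipolate without epolate, so that only interior gaps are filled:

              ```stata
              * Hypothetical toy data: one country, observed every 5 years
              clear
              set obs 11
              generate str1 country = "A"
              generate int Year = 1989 + _n
              generate double GP = .
              replace GP = 0   if Year == 1990
              replace GP = 1   if Year == 1995
              replace GP = 1.5 if Year == 2000

              * interpolate on the log(GP + 1) scale, then back-transform
              generate double lg1 = ln(GP + 1)
              bysort country (Year): ipolate lg1 Year, generate(lg1_i)
              generate double GP_log1 = exp(lg1_i) - 1

              * interpolate on the cube-root scale, then back-transform
              generate double croot = GP^(1/3)
              bysort country (Year): ipolate croot Year, generate(croot_i)
              generate double GP_cube = croot_i^3

              list Year GP GP_log1 GP_cube, noobs
              ```

              Because both transformations are monotone and only interior gaps are filled, each back-transformed value lies between the two observed values on either side of it.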



              • #22
                Thanks, Nick Cox, for getting back to me.

                The GP index is the Ginarte-Park index of intellectual property protection.
                I am posting here the data for some countries and years for which the interpolation methods generated negative values (values outside the range [0,5]).

                The sample data look like this:

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input str23 country str3 ISO_Code int Year double GP
                "Angola"   "AGO" 1970                  0
                "Angola"   "AGO" 1971                  .
                "Angola"   "AGO" 1972                  .
                "Angola"   "AGO" 1973                  .
                "Angola"   "AGO" 1974                  .
                "Angola"   "AGO" 1975                  0
                "Angola"   "AGO" 1976                  .
                "Angola"   "AGO" 1977                  .
                "Angola"   "AGO" 1978                  .
                "Angola"   "AGO" 1979                  .
                "Angola"   "AGO" 1980                  0
                "Angola"   "AGO" 1981                  .
                "Angola"   "AGO" 1982                  .
                "Angola"   "AGO" 1983                  .
                "Angola"   "AGO" 1984                  .
                "Angola"   "AGO" 1985                  0
                "Angola"   "AGO" 1986                  .
                "Angola"   "AGO" 1987                  .
                "Angola"   "AGO" 1988                  .
                "Angola"   "AGO" 1989                  .
                "Angola"   "AGO" 1990                  0
                "Angola"   "AGO" 1991                  .
                "Angola"   "AGO" 1992                  .
                "Angola"   "AGO" 1993                  .
                "Angola"   "AGO" 1994                  .
                "Angola"   "AGO" 1995               .875
                "Angola"   "AGO" 1996                  .
                "Angola"   "AGO" 1997                  .
                "Angola"   "AGO" 1998                  .
                "Angola"   "AGO" 1999                  .
                "Angola"   "AGO" 2000              1.075
                "Angola"   "AGO" 2001                  .
                "Angola"   "AGO" 2002                  .
                "Angola"   "AGO" 2003                  .
                "Angola"   "AGO" 2004                  .
                "Angola"   "AGO" 2005                1.2
                "Angola"   "AGO" 2006                  .
                "Angola"   "AGO" 2007                  .
                "Angola"   "AGO" 2008                  .
                "Angola"   "AGO" 2009                  .
                "Angola"   "AGO" 2010                1.6
                "Angola"   "AGO" 2011                  .
                "Angola"   "AGO" 2012                  .
                "Angola"   "AGO" 2013                  .
                "Angola"   "AGO" 2014                  .
                "Angola"   "AGO" 2015                1.6
                "Angola"   "AGO" 2016                  .
                "Angola"   "AGO" 2017                  .
                "Angola"   "AGO" 2018                  .
                "Angola"   "AGO" 2019                  .
                "Bulgaria" "BGR" 1970                  .
                "Bulgaria" "BGR" 1971                  .
                "Bulgaria" "BGR" 1972                  .
                "Bulgaria" "BGR" 1973                  .
                "Bulgaria" "BGR" 1974                  .
                "Bulgaria" "BGR" 1975                  .
                "Bulgaria" "BGR" 1976                  .
                "Bulgaria" "BGR" 1977                  .
                "Bulgaria" "BGR" 1978                  .
                "Bulgaria" "BGR" 1979                  .
                "Bulgaria" "BGR" 1980                  .
                "Bulgaria" "BGR" 1981                  .
                "Bulgaria" "BGR" 1982                  .
                "Bulgaria" "BGR" 1983                  .
                "Bulgaria" "BGR" 1984                  .
                "Bulgaria" "BGR" 1985                  .
                "Bulgaria" "BGR" 1986                  .
                "Bulgaria" "BGR" 1987                  .
                "Bulgaria" "BGR" 1988                  .
                "Bulgaria" "BGR" 1989                  .
                "Bulgaria" "BGR" 1990 1.6166666666666665
                "Bulgaria" "BGR" 1991                  .
                "Bulgaria" "BGR" 1992                  .
                "Bulgaria" "BGR" 1993                  .
                "Bulgaria" "BGR" 1994                  .
                "Bulgaria" "BGR" 1995 2.7666666666666666
                "Bulgaria" "BGR" 1996                  .
                "Bulgaria" "BGR" 1997                  .
                "Bulgaria" "BGR" 1998                  .
                "Bulgaria" "BGR" 1999                  .
                "Bulgaria" "BGR" 2000                3.5
                "Bulgaria" "BGR" 2001                  .
                "Bulgaria" "BGR" 2002                  .
                "Bulgaria" "BGR" 2003                  .
                "Bulgaria" "BGR" 2004                  .
                "Bulgaria" "BGR" 2005              3.625
                "Bulgaria" "BGR" 2006                  .
                "Bulgaria" "BGR" 2007                  .
                "Bulgaria" "BGR" 2008                  .
                "Bulgaria" "BGR" 2009                  .
                "Bulgaria" "BGR" 2010              3.875
                "Bulgaria" "BGR" 2011                  .
                "Bulgaria" "BGR" 2012                  .
                "Bulgaria" "BGR" 2013                  .
                "Bulgaria" "BGR" 2014                  .
                "Bulgaria" "BGR" 2015  4.541666666666666
                "Bulgaria" "BGR" 2016                  .
                "Bulgaria" "BGR" 2017                  .
                "Bulgaria" "BGR" 2018                  .
                "Bulgaria" "BGR" 2019                  .
                "Russian Federation" "RUS" 1970                  .
                "Russian Federation" "RUS" 1971                  .
                "Russian Federation" "RUS" 1972                  .
                "Russian Federation" "RUS" 1973                  .
                "Russian Federation" "RUS" 1974                  .
                "Russian Federation" "RUS" 1975                  .
                "Russian Federation" "RUS" 1976                  .
                "Russian Federation" "RUS" 1977                  .
                "Russian Federation" "RUS" 1978                  .
                "Russian Federation" "RUS" 1979                  .
                "Russian Federation" "RUS" 1980 1.0833333333333333
                "Russian Federation" "RUS" 1981                  .
                "Russian Federation" "RUS" 1982                  .
                "Russian Federation" "RUS" 1983                  .
                "Russian Federation" "RUS" 1984                  .
                "Russian Federation" "RUS" 1985 1.2833333333333332
                "Russian Federation" "RUS" 1986                  .
                "Russian Federation" "RUS" 1987                  .
                "Russian Federation" "RUS" 1988                  .
                "Russian Federation" "RUS" 1989                  .
                "Russian Federation" "RUS" 1990 1.2833333333333332
                "Russian Federation" "RUS" 1991                  .
                "Russian Federation" "RUS" 1992                  .
                "Russian Federation" "RUS" 1993                  .
                "Russian Federation" "RUS" 1994                  .
                "Russian Federation" "RUS" 1995 3.4749999999999996
                "Russian Federation" "RUS" 1996                  .
                "Russian Federation" "RUS" 1997                  .
                "Russian Federation" "RUS" 1998                  .
                "Russian Federation" "RUS" 1999                  .
                "Russian Federation" "RUS" 2000              3.675
                "Russian Federation" "RUS" 2001                  .
                "Russian Federation" "RUS" 2002                  .
                "Russian Federation" "RUS" 2003                  .
                "Russian Federation" "RUS" 2004                  .
                "Russian Federation" "RUS" 2005              3.675
                "Russian Federation" "RUS" 2006                  .
                "Russian Federation" "RUS" 2007                  .
                "Russian Federation" "RUS" 2008                  .
                "Russian Federation" "RUS" 2009                  .
                "Russian Federation" "RUS" 2010              3.675
                "Russian Federation" "RUS" 2011                  .
                "Russian Federation" "RUS" 2012                  .
                "Russian Federation" "RUS" 2013                  .
                "Russian Federation" "RUS" 2014                  .
                "Russian Federation" "RUS" 2015                3.8
                "Russian Federation" "RUS" 2016                  .
                "Russian Federation" "RUS" 2017                  .
                "Russian Federation" "RUS" 2018                  .
                "Russian Federation" "RUS" 2019                  .
                end
                The interpolation methods I have tried are the following:
                (1) Linear interpolation

                Code:
                 bysort ISO_Code: ipolate GP Year, generate(GP1) epolate
                (2) Spline Interpolation

                Code:
                bysort ISO_Code: mipolate GP Year, generate(GP1) spline epolate
                (3) Cubic Spline Interpolation

                Code:
                ssc inst csipolate
                bysort ISO_Code: csipolate GP Year, generate(GP1)
                (4) PCHIPOLATE

                Code:
                ssc inst pchipolate
                bysort ISO_Code: pchipolate GP Year, generate(GP1)
                All of methods 1-4 gave me negative values.

                I also tried what you suggested above (interpolate on a logarithmic scale and then back-transform):

                Code:
                gen logGP = log(GP)
                ssc inst mipolate 
                mipolate logGP Year, by(country) gen(loglin_GP)
                replace loglin_GP = exp(loglin_GP)
                But this does not fill in the missing values of the GP index before the first or after the last observed year (1970-1990 or 2015 onwards) for some countries.

                However, I have slightly modified this code by specifying epolate:
                Code:
                gen logGP = log(GP)
                ssc inst mipolate 
                mipolate logGP Year, by(country) gen(loglin_GP) epolate
                replace loglin_GP = exp(loglin_GP)
                In this way I got interpolated values for all country-year pairs strictly within the range [0,5], with no negative values.

                Is this modification acceptable? Or is there any other approach I should proceed with?

                Thanks.
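
                Two quick checks may help explain these results. They are only a sketch, assuming that the epolate option extends the straight line through the two nearest observed points; the numbers are Bulgaria's 1990 and 1995 values from the data example above.

                ```stata
                * ln(0) is missing in Stata, so observed zeros drop out on the log scale
                display ln(0)

                * Bulgaria: line through the 1990 and 1995 values, extended back to 1970
                display 1.6166666666666665 + (1970 - 1990) * (2.7666666666666666 - 1.6166666666666665) / (1995 - 1990)
                ```

                The second result is about -2.98, which is how a rising series can yield negative values before its first observation. On the log scale, exp() guarantees positive values but not values below 5, and because ln(0) is missing, observed zeros such as Angola's 1970-1990 values are treated as gaps and filled in with positive numbers.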





                • #23
                  Thanks for the data example. I find that the nonlinear methods I tried all produced some improvements but also some artefacts that were dubious. When there are double bounds that are attainable you need to be very careful about transformations.

                  I'd recommend the simplest method that can be explained most easily: linear interpolation with truncation. The function clip() makes this easy.

                  Code:
                  * linear interpolation and extrapolation within each country
                  mipolate GP Year, by(country) epolate gen(linear)
                  scatter linear GP Year, ms(Oh O) mc(orange_red blue) by(country, col(1) yrescale)
                  * truncate to the attainable range [0, 5]
                  gen linear2 = clip(linear, 0, 5)
                  scatter linear2 GP Year, ms(Oh O) mc(orange_red blue) by(country, col(1) yrescale)
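
                  A self-contained sketch of the same idea, with two assumptions: official ipolate stands in for mipolate, and only Bulgaria's 1990-2000 values from the data example are used, placed in a 1980-2000 window. The assert lines are illustrative checks, not part of the method.

                  ```stata
                  * Bulgaria values from the data example, in a 1980-2000 window
                  clear
                  set obs 21
                  generate str8 country = "Bulgaria"
                  generate int Year = 1979 + _n
                  generate double GP = .
                  replace GP = 1.6166666666666665 if Year == 1990
                  replace GP = 2.7666666666666666 if Year == 1995
                  replace GP = 3.5                if Year == 2000

                  * linear interpolation plus extrapolation with official ipolate
                  bysort country (Year): ipolate GP Year, generate(linear) epolate

                  * truncate to the attainable range [0, 5]
                  generate double linear2 = clip(linear, 0, 5)

                  * checks: nothing outside [0, 5]; observed values are reproduced
                  assert inrange(linear2, 0, 5)
                  assert reldif(linear2, GP) < 1e-8 if !missing(GP)

                  list Year GP linear linear2, noobs
                  ```

                  In this window the extrapolated values for the earliest years fall below zero, so linear2 is set to 0 there while every observed value is left unchanged.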


                  • #24
                    Thanks Nick Cox !

                    Code:
                    mipolate GP Year, by(country) epolate gen(linear)
                    gen linear2 = clip(linear, 0, 5)
                    This code works fine with the sample data provided above, but when I run it on my whole dataset (88 countries, 1970-2019) it does not work as well as it did with the sample data. Any further suggestions on how to improve on that?
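
                    The thread does not say what went wrong with the full data. As a generic sketch only, these are checks one might run on a panel before interpolating; the toy panel, its values and the variable n_obs are hypothetical.

                    ```stata
                    * Hypothetical toy panel: country B has only one observed value
                    clear
                    set obs 6
                    generate str1 country = cond(_n <= 3, "A", "B")
                    generate int Year = 1990 + 5 * mod(_n - 1, 3)
                    generate double GP = .
                    replace GP = 1 if country == "A" & Year == 1990
                    replace GP = 2 if country == "A" & Year == 2000
                    replace GP = 3 if country == "B" & Year == 1995

                    * each country-year pair should identify exactly one row
                    isid country Year

                    * count observed values per country; with fewer than 2
                    * there is no pair of points to interpolate between
                    egen n_obs = count(GP), by(country)
                    tabulate country if n_obs < 2
                    ```

                    Countries listed by the last command have too few observed values for any line to be drawn through them.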


                    • #25
                      The image below shows how this code works with the whole data:

                      [Image: Capture1.PNG]


                      However, when I run this code on the sample data I provided above, the interpolation works fine. Here is the image:

                      [Image: Capture2.PNG]


                      Please help me modify this linear interpolation with truncation so that it works with the full data exactly as it did with the sample data.
                      Thanks and regards,


                      • #26
                        Sorry, but I don’t have any specific reactions to add.


                        • #27
                          Thank you very much, Nick Cox!
                          The code worked fine; I had made a minor error earlier.
