  • Interpolation by subset

    Dear all,

    I have the graph below, where most of the fluctuation in the lines comes from missing data. I would like to interpolate the variable "share_emp_high", but only for those regions where I see major fluctuations due to data limitations (or for all regions). Is there a quick way of doing this with the ipolate command? How should I interpolate "share_emp_high" for some regions but not all? Thanks!
    The sectors are classified as low (1), medium (2), and high (3); regions run from 1 to 7.

    Code:
     * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year float(region tech_intensity) double(Employment Wages ValueAdded OutputINDSTAT4) float(lval_worker share_emp_high share_emp_low share_emp_medium)
    1963 1 1             493500  2275511486.857143  4751685260.642858 11767174291.928572 10.245505         . .34153825         .
    1963 1 3  580833.3333333334      3705797966.75  7585475254.166667     14920308639.25 10.415336  .4019793         .         .
    1963 1 2             370600         2269636574       4595306102.7      10601274844.8 10.526263         .         . .25648242
    1964 1 1 499285.71428571426  2402710237.785714  5054871625.714286 12647022913.142857  10.29206         . .33905375         .
    1964 1 3             589500 3905977840.3333335  8161580496.333333 15954389449.666666 10.472116  .4003162         .         .
    1964 1 2             383800       2445347215.5         4983442634      11524027559.8 10.585746         .         . .26062998
    1965 1 1 509785.71428571426  2526453019.142857  5329570478.642858 13095049298.428572  10.31873         .  .3300255         .
    1965 1 3             630500       4316867634.5      9201080659.25 18249242610.833332  10.50286 .40817365         .         .
    1965 1 2             404400       2645709413.2       5548220500.5      12681155579.2 10.617845         .         . .26180083
    1966 1 1  519928.5714285714  2688631384.357143         5729918409 14094076320.714285  10.34352         .  .3186432         .
    1966 1 3  689166.6666666666  4882070114.833333     10399938510.75      20572018922.5 10.511442  .4223624         .         .
    1966 1 2             422600       2868696469.6       6066350876.6      13797133510.1 10.626655         .         . .25899443
    1967 1 1 519785.71428571426  2830515986.642857  6074288985.428572 14399366081.428572 10.382154         .  .3137693         .
    1967 1 3             714000  5217274645.916667 10906199511.416666        21402144784  10.54316   .431007         .         .
    1967 1 2             422800       2948609987.8       6209665554.2      13894481176.6  10.66303         .         . .25522372
    1968 1 1 521142.85714285716  3021800487.214286  6564508647.642858 15663674925.571428  10.44223         .  .3117052         .
    1968 1 3  723666.6666666666  5611820882.666667     11939819313.75 23426322592.416668  10.59476  .4328384         .         .
    1968 1 2             427100       3172576540.8       6712120436.7      15023067247.3   10.7164         .         . .25545642
    1969 1 1  531571.4285714285  3238962439.928571  7017815837.928572      16766593625.5 10.465424         .  .3100196         .
    1969 1 3  743166.6666666666  6043861186.666667     12763490771.75     24758198770.25  10.61421  .4334248         .         .
    1969 1 2             439900       3441266356.9       7197381770.3      16212003188.2  10.72662         .         .  .2565556
    1970 1 1  515928.5714285714         3324005182  7217895476.285714 17088816142.571428 10.478456         .  .3147803         .
    1970 1 3  698583.3333333334  5930006657.416667 12332770693.583334 23886611075.666668 10.618324  .4262223         .         .
    1970 1 2             424500       3464853746.2       7080681461.5      16148418686.1 10.733733         .         .  .2589975
    1971 1 1  503571.4285714286  3479359498.214286  7662572281.928572  18138305839.92857 10.554976         .  .3214823         .
    1971 1 3  654333.3333333334  5924091415.833333      12882823768.5 25255415335.083332 10.695177  .4177294         .         .
    1971 1 2             408500       3573399941.2       7436380593.9        16932109165 10.807438         .         .  .2607883
    1972 1 1 515142.85714285716  3824123777.285714  8463848270.571428        20500461897 10.614766         .  .3181119         .
    1972 1 3  687833.3333333334  6735715885.916667 14862186563.416666 29161224104.416668  10.76089   .424752         .         .
    1972 1 2             416400       3911410345.3       8216979410.7      18723185854.3  10.85179         .         . .25713605
    1973 1 1  524071.4285714286         4108959658  9479274472.428572 23592577325.285713 10.606277         .  .3087668         .
    1973 1 3  732333.3333333334      7601876972.75 17100836791.416666        33804044961 10.717712  .4314684         .         .
    1973 1 2             440900         4416967153       9680534701.9      22211338599.5 10.850745         .         .  .2597648
    1974 1 1  512071.4285714286  4359603257.357142 10457445146.285715  27107229102.92857 10.557927         .  .3028089         .
    1974 1 3             733000  8170677208.083333     18898116883.75 38130268106.833336  10.67978   .433453         .         .
    1974 1 2             446000       4869801776.7      11599765677.2      29253186671.5  10.87597         .         . .26373813
    1975 1 1 484285.71428571426  4478843594.071428 10549643474.071428 27664631502.357143  10.51608         .  .3109474         .
    1975 1 3  669666.6666666666  8166992842.333333     18584940236.25 38380828292.416664  10.63936  .4299757         .         .
    1975 1 2             403500       4795413961.8      10808909367.2      28943555137.4 10.803363         .         . .25907695
    1976 1 1 498142.85714285716  5039208898.071428 12025089719.214285 30786837580.357143 10.608042         .  .3114737         .
    1976 1 3  685666.6666666666  9146540787.916666     21686430429.25 45047181148.333336  10.71659  .4287267         .         .
    1976 1 2             415500         5401047743      12385451155.6      33252940727.9 10.878743         .         . .25979963
    1977 1 1  503571.4285714286         5473534602 13193986757.714285 33557743803.285713 10.615614         .  .3021269         .
    1977 1 3  727583.3333333334 10463212173.333334 24932087470.166668 52316822920.333336 10.710715  .4365269         .         .
    1977 1 2             435600       6101841689.5      14276281961.7      38387838998.3  10.89564         .         .  .2613462
    1978 1 1 512357.14285714284  5943695500.857142 14626376694.642857      37177093022.5 10.603712         . .29504475         .
    1978 1 3  771083.3333333334     11868902733.75 28217641707.583332 59257989277.916664 10.688796  .4440342         .         .
    1978 1 2             453100       6824498241.8      15874828858.6      42790501972.9 10.842086         .         . .26092106
    1979 1 1 518142.85714285716  6380482050.642858 15912149101.071428  40575451806.64286  10.57242         . .29029897         .
    1979 1 3  800416.6666666666     13181171588.25 32204823394.416668 66903168109.916664 10.673372   .448448         .         .
    1979 1 2             466300       7557380473.7      19004956319.9      52085259649.8 10.867426         .         . .26125306
    1980 1 1             513500  6856981079.357142 17177110856.142857  43825813715.35714 10.520764         . .29451126         .
    1980 1 3  786666.6666666666 14173264680.333334 33676409133.333332  69681703930.66667 10.622858 .45118245         .         .
    1980 1 2             443400       7742061428.6      18509517853.3      57453698922.5 10.797417         .         . .25430632
    1981 1 1             500950  7282709746.785714      18384927078.5  46948506080.78571 10.507384         . .29264373         .
    1981 1 3  778658.3333333334 15480223093.583334 37066956822.166664  76599976332.41667 10.634363  .4548747         .         .
    1981 1 2             432200       8323011655.8        19725468580      63078167599.9  10.81553         .         . .25248152
    1982 1 1 479335.71428571426  7499540972.857142 19255509124.857143  47670695923.92857 10.533024         .  .2986632         .
    1982 1 3  735491.6666666666 15603059037.583334 36675616772.166664     74292724362.75  10.63753  .4582681         .         .
    1982 1 2             390110       7847654261.2      17140097589.9      56030068126.5  10.74045         .         .  .2430687
    1983 1 1             480850  7972022326.928572 20982005335.785713  50578654679.85714 10.641305         .  .3059686         .
    1983 1 3  715416.6666666666 16245149214.166666      39309237273.5  80164366570.08333 10.725178  .4552252         .         .
    1983 1 2             375300       7859768843.5      18024950123.9      56201025728.4  10.84752         .         .  .2388063
    1984 1 1 476814.28571428574         8284774873  22593198445.92857  53807731652.21429  10.68035         .  .2960756         .
    1984 1 3  745033.3333333334 18003248923.166668 45210282192.833336      92951590652.5 10.786222   .462625         .         .
    1984 1 2             388600       8504578997.4      19440906010.9      60036255171.5 10.860976         .         . .24129938
    1985 1 1             464750  8467207772.285714 23323106927.142857        54099408111  10.72432         . .29319453         .
    1985 1 3             739225 18728581171.666668 45694231138.416664        94642140233 10.799863  .4663512         .         .
    1985 1 2             381150       8575483051.4      19582189219.2        58469635857 10.895268         .         . .24045423
    1986 1 1 461864.28571428574  8762734618.357143      25041957740.5        56253706590 10.824414         . .29737276         .
    1986 1 3             720475      18986599345.5 46656734555.166664        95726320704 10.854326  .4638801         .         .
    1986 1 2             370810       8574781824.8      19995132050.9      52013297018.8 10.945887         .         .  .2387472
    1987 1 3             741025 20096779166.333332        52511091291 104787279751.41667 10.909956    .46079         .         .
    1987 1 1 482707.14285714284         9550209317 28215706551.285713  62003679178.92857  10.92263         .  .3001607         .
    1987 1 2             384430       9118445599.9      22594959121.6      57764565805.2  11.02473         .         .  .2390493
    1988 1 1 484735.71428571426 10056428855.214285 29448156291.714287  66567813227.71429  10.95743         . .29681924         .
    1988 1 3             753525 21377143927.166668        57401336751 115598818004.33333  10.97409  .4614075         .         .
    1988 1 2             394840       9805276475.8        25510261668      63340538449.3 11.135797         .         .  .2417732
    1989 1 3             751775 22137882261.166668     60264662953.75       120830321891 10.990957  .4621783         .         .
    1989 1 1 479085.71428571426 10386695894.357143 31320412340.857143  69338334292.14285 10.996034         . .29453364         .
    1989 1 2             395730        10136138733      25660091488.6      66260701131.4 11.054836         .         .   .243288
    1990 1 1 440392.93333333335  9862153595.066668      29950083480.4  66207620101.26667 10.922278         .  .3108025         .
    1990 1 3  588982.9333333333      17962799977.8 48342140743.666664  98087327531.53334 10.979763  .4156683         .         .
    1990 1 2           387578.4      10334152248.9      25620505491.1        68674349701 11.059524         .         . .27352923
    1991 1 1           426271.2  9872087455.466667        30175835678      66010992850.6 10.944402         .  .3157738         .
    1991 1 3             555581      17655017069.2  47963113900.86667        97245801863  11.00156  .4115641         .         .
    1991 1 2           368073.6      10129210481.4        23969402618      64467527391.4 11.033327         .         .  .2726621
    1992 1 1  430436.5333333333      10420958096.2      32554049695.4  69510779875.86667  10.99168         .  .3172982         .
    1992 1 3           555464.6 18433920847.066666 51239957422.666664     103947657831.6 11.018333  .4094632         .         .
    1992 1 2           370666.7      10624613700.1      26060811953.6      66063486953.1 11.059158         .         .  .2732386
    1993 1 1           432187.4 10612008733.333334 33341081663.133335      71552465314.4  10.96999         .  .3187601         .
    1993 1 3           550349.6      18592135422.6 53927033765.933334 109344155490.26666 11.060923  .4059107         .         .
    1993 1 2           373302.1      10876632261.3      27301119640.8      67787471139.2 11.078773         .         .  .2753292
    1994 1 1  431558.6666666667      10813339503.8  34820166132.46667  74347917452.06667  11.02734         .  .3153401         .
    1994 1 3  553736.5333333333      19266811081.6  58304387751.13333 118838937144.73334 11.111186  .4046155         .         .
    1994 1 2           383254.8      11373193616.7      30539121918.6      72665876687.1 11.160436         .         . .28004444
    1995 1 1 434118.13333333336 11112644126.466667  37288752275.13333      79296774352.8  11.05942         .  .3100677         .
    1995 1 3  567631.6666666666 20119234687.866665 62583887055.933334 128468835971.73334 11.130182  .4054294         .         .
    1995 1 2           398325.5      11988967154.9      32842911788.1      78944094596.2 11.193743         .         . .28450292
    1996 1 1  434753.6666666667      11347097673.8  38132901415.26667  81413029748.93333  11.03524         .  .3097191         .
    end

  • #2
    Code:
    h ipolate
    shows that the command allows the -if- qualifier. You have to define the -if- conditions yourself, naturally.


    ipolate yvar xvar [if] [in] , generate(newvar) [epolate]


    • #3
      Does that mean I can use "ipolate share_emp_high year if region==3" and have separate variables interpolated by region?


      • #4
        You can interpolate by region and, at the same time, restrict which regions are included in the interpolation, e.g.,

        Code:
        bys region: ipolate share_emp_high year if region==3, g(ishare_emp_high)
        However, if you want the interpolated variable to keep the original values in the regions excluded from the interpolation, you then need

        Code:
        replace ishare_emp_high= share_emp_high if region!= 3
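
        The same pattern extends to several regions at once with inlist(). The following is a minimal sketch on toy data: the values and the choice of regions 1 and 3 are hypothetical and are not taken from the poster's dataset.

        ```stata
        * Toy data (hypothetical values): 3 regions, 4 years, gaps in the outcome
        clear
        input int year float(region share_emp_high)
        2000 1 .40
        2001 1 .
        2002 1 .
        2003 1 .43
        2000 2 .30
        2001 2 .
        2002 2 .32
        2003 2 .33
        2000 3 .20
        2001 3 .
        2002 3 .24
        2003 3 .25
        end

        * Interpolate within regions 1 and 3 only; region 2 is left out
        bysort region (year): ipolate share_emp_high year if inlist(region, 1, 3), generate(ishare_emp_high)

        * Carry over the original values (gaps included) for the excluded region
        replace ishare_emp_high = share_emp_high if !inlist(region, 1, 3)

        list, sepby(region)
        ```

        Dropping the -if- qualifier altogether interpolates within every region.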


        • #5
          Thank you very much!


          • #6
            I am not following this. If there are gaps in the data then by default a line graph just jumps over them and joins known points with straight line segments. Interpolation with ipolate will give you numeric equivalents for observations with missing values, but it won't improve the graph.

            Here is a trivial example that can be run.

            Code:
             
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float(t y)
            1 1
            2 .
            3 .
            4 4
            end
            
            line y t
            
            ipolate y t, gen(Y)
            
            line Y t
            The graph is just the same.
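
            To make the point visible, one possible sketch is to list the interpolated values and to draw the original series with the cmissing(n) option, which stops the line from bridging missing values. The graph names "gaps" and "filled" are arbitrary labels chosen for this illustration.

            ```stata
            clear
            input float(t y)
            1 1
            2 .
            3 .
            4 4
            end

            ipolate y t, gen(Y)

            * Y holds the linearly interpolated values 1, 2, 3, 4
            list t y Y

            * With cmissing(n) the gap in y is shown instead of being bridged
            twoway connected y t, cmissing(n) name(gaps, replace)

            * The interpolated series has no gap to show
            twoway connected Y t, name(filled, replace)
            ```

            In the first graph only the two known points appear as markers; in the second, the straight line between them is the same one that line y t draws by default.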

            The main historical use of interpolation was in "reading between the lines", especially going beyond published tables of logarithmic, trigonometric, statistical and other functions where the underlying behaviour is smooth.

            Its use for data seems relatively straightforward when filling very small gaps in series that vary very smoothly, but otherwise it seems a distraction, all too likely to give researchers and their readers the impression that a problem has somehow been solved. When the interpolated values are just deterministic functions of the known values, you don't make the sample larger in any real sense, or increase how many degrees of freedom can be claimed.

            I am conflicted here because I have found interpolation a congenial programming problem, but consider its use for data series to be often problematic.
