Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • FIlling in missing data

    Hello everyone. I have a monthly panel data set ranginh over 10 years for 36 banks, and 18 variables. However, for some of the banks data are being reported only every quarter. I have a few questions that I would like to and i hope i won't be too much bother.

    First i would like to know if it is possible to fill in the missing data for these banks using interpolation and if yes, how should i decide which method is best for my data? e.g is linear interpolation the best or should i use some other method?

    Secondly, i have used this command ipolate
    Code:
    Lending_HH month, gen(new_lend_hh) epolate by(bank_id)
    and it seemed to work, but i would like to ensure that this is the correct command for linear interpolation.

    Lastly, i would like to know if for example 10 out of the 36 banks that i have report only quarterly data and i use interpolation in order to fill in the monthly data, if this would cause a problem for my analysis and if it would be better to just let the missing values be or to remove these banks completely. I would like to note that i do not know the reason why for these banks there are only quarterly data available while for the rest of the banks there are monthly data available.

    Here is a sample of my data from a bank that has only quarterly data availabe:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte bank_id str48 banks long(month Balances_at_Riks Treasury_bills Lending_MFI Lending_to_Riks Lending_to_banks Lending_NFC Lend_Entr_HH Lending_HH Assets Dep_Riks Dep_NFC Dep_Entr_HH Dep_HH Dep_other_HH Debt_securities)
    4 "Falkenbergs Sparbank" 612  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 613  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 614 22  0 191 0 191 1249 1336 3553 5996 0 1058 1026 3681 2655 16
    4 "Falkenbergs Sparbank" 615  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 616  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 617 21  0 325 0 325 1262 1364 3600 6178 0 1109 1049 3771 2722 15
    4 "Falkenbergs Sparbank" 618  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 619  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 620 26  0 493 0 493 1246 1371 3608 6246 0 1141 1053 3877 2823 14
    4 "Falkenbergs Sparbank" 621  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 622  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 623 24  0 312 0 312 1312 1391 3645 6515 0 1209 1135 4020 2886 13
    4 "Falkenbergs Sparbank" 624  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 625  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 626 24  0 265 0 265 1372 1381 3659 6594 0 1174 1086 4024 2938 13
    4 "Falkenbergs Sparbank" 627  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 628  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 629 22  0 453 0 453 1423 1409 3714 6868 0 1211 1123 4207 3084 12
    4 "Falkenbergs Sparbank" 630  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 631  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 632 24  0 453 0 453 1404 1418 3747 7082 0 1207 1126 4291 3164 12
    4 "Falkenbergs Sparbank" 633  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 634  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 635 28  0 467 0 467 1386 1429 3743 7066 0 1199 1167 4320 3153 15
    4 "Falkenbergs Sparbank" 636  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 637  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 638 27  0 390 0 390 1400 1453 3780 7167 0 1173 1130 4290 3160 12
    4 "Falkenbergs Sparbank" 639  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 640  .  .   . .   .    .    .    .    . .    .    .    .    .  .
    4 "Falkenbergs Sparbank" 641 16 24 436 0 436 1434 1471 3827 7367 0 1221 1102 4364 3262 12
    end
    format %tm month
    Thank you for your help in advance.

  • #2
    Vasilis:
    I do share your last statement: it's better to leave things as they are instead of making-up data.
    Stata can handle both balanced and unbalanced panels with no problems at all.
    Kind regards,
    Carlo
    (StataNow 18.5)

    Comment


    • #3
      Dear Carlo Lazzaro, i am sorry for my late reply. Thank you very much for your advice.

      I would also like to ask one more question. It was suggested to me to transform the monthly data into trimester data so that all banks can have the same frequency. Any help is much appreciated.

      Comment


      • #4
        Vasilis:
        provided that your research field is far from mine, I think that trasforming the -timevar- into quarters may make sense if monthly variations are negligible.
        Before collapsing data from monthly to quaterly, I would also consider which estimator you want to use -fe- (that focuses on within panel variation) or -re- (that focuses more on both between and within panel variation).
        Kind regards,
        Carlo
        (StataNow 18.5)

        Comment


        • #5
          Dear Carlo Lazzaro thank you again for your answer and your advice!

          Comment


          • #6
            I have managed to transform manually the data into quarterly data for all banks in excel but when i import them into stata it appears as stata is reading them as monthly data when i xtset my data. So my data now look like this:

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input byte bankid int months long(Balances_at_CB Treasury_bills EL lending_HH deposits_HH total_assets)
            1 614  509 19229 21211  97441 27396 642533
            1 617  349 23495 26952  99486 28549 578695
            1 620  500 44850 45350 100663 29242 667647
            1 623  323 18310 18633 101346 30473 619463
            1 626 1004 18074 21235 102166 31127 683493
            1 629  262 27749 28011 103017 31908 656213
            1 632  452 32882 33334 104347 31883 725778
            1 635  272 33461 47363 104284 32299 627173
            1 638  310 30955 31265 103839 32479 672088
            1 641   70 24721 29409 102950 33525 639705
            1 644  238 12022 12260 101329 33002 682424
            1 647   85 21492 29649 100068 33315 655350
            1 650  102 33910 35238  98459 31525 745059
            1 653 1057 15485 16542  97133 31107 687975
            1 656 1147 14117 15264  97105 30351 745496
            1 659  666 19200 23668  96541 30073 765074
            end
            format %tm months
            [/CODE]

            Even though my months variable is every third month, March, June, September ... when i xtset my data i get this result

            Code:
            . xtset bankid months
            
            Panel variable: bankid (strongly balanced)
             Time variable: months, 2011m3 to 2014m12, but with gaps
                     Delta: 1 month
            And if i use xtset bankid months, quarterly the months become a total mess. Is there a way to register these dates as quarterly?

            Comment


            • #7
              Vasilis:
              see -help datetime-.
              Kind regards,
              Carlo
              (StataNow 18.5)

              Comment


              • #8
                Dear Carlo Lazzaro thank you again very much for your help! I used

                Code:
                // Formating date
                gen quorter = qofd(dofm(months))
                format quorter %tq
                and it seems to have produced the desired results.

                Comment


                • #9
                  Dear Carlo Lazzaro i saw that you have responded in many threads about dynamic panel data and i would like to ask for your advice too, if it is possible. I am still working on the data that i have posted earlier in this thread, but i have transformed them into quarterly data, and i have 36 quorters from 2011 to 2019 and 32 banks, so it is a long panel. I am using the lagged value of the dependent variable Lending_ratio as an regressor, and i am also lagging all the other regressors one time period. From the posts i have read on the forum i understand that i should use
                  Code:
                  xtregar, fe
                  since T>N, is that correct? And if i want to include time fixed effects do i just have to include i.quorter ?

                  Also, if it is possible, could you tell me what are the tests for autocorrelation and heteroskedasticity in a dynamic panel with T>N? I have read through http://fmwww.bc.edu/EC-C/S2013/823/E...n05.slides.pdf which i saw you had posted in a similar question but from what i understood this is for the case where N>T?

                  Comment


                  • #10
                    Vasilis:
                    sorry to let you down, but I'm pretty familiar with static panel datasets only.
                    That said, I think you have to consider -xtabond-.
                    Kind regards,
                    Carlo
                    (StataNow 18.5)

                    Comment


                    • #11
                      Dear Carlo Lazzaro thank you very much for your immediate reply. I will have a look at xtabond

                      Comment

                      Working...
                      X