Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • RMSE Computation Using Rangestat in Panel Data

    Hello Statalist community,

    I am working with a panel dataset for U.S. banks and estimating the following model for different profitability measures:

    〖profit〗_(i,t)^j=β_t 〖profit〗_(i,t-1)^j+γ_t 〖log⁡(assets〗_(i,t-1))+ϵ_(i,t)^j , from t-5 to t.
    for each bank from t-5 to t, for each profitability variable j. The profitability measures include:
    • Net interest income
    • Net operating income (net interest income + net other operating income)
    • Profit before tax
    • Net profit
    • ROAA
    I want to compute the root mean square error (RMSE) of the rolling regression as a measure of profitability uncertainty. I have implemented the following Stata code using rangestat, but since I am new to rangestat, I would like to confirm if this is the correct approach. Stata Code Used:


    foreach var in NetInterestIncome NetOperatingIncome PretaxProfit NetProfit ROAA_w {

    * Run rolling regression for each bank
    rangestat (reg) `var' L_`var' L_Assets, interval(year -5 0) by(FitchID)

    * Compute residuals
    gen residuals_`var' = `var' - b_cons - b_L_`var'*L_`var' - b_L_Assets*L_Assets

    * Compute squared residuals
    gen sq_res_`var' = residuals_`var'^2

    * Compute RMSE for each bank
    bysort FitchID (year): egen mean_res_`var' = mean(sq_res_`var')
    gen RMSE_`var' = sqrt(mean_res_`var')

    * Drop intermediate variables to avoid conflicts in the next iteration
    drop mean_res_`var' sq_res_`var' residuals_`var'
    drop reg_nobs reg_r2 reg_adj_r2 b_L_`var' b_L_Assets b_cons se_L_`var' se_L_Assets se_cons
    }


    Question:

    Is this the correct way to compute RMSE for a rolling regression window of t-5 to t using rangestat?Are there any improvements or alternative approaches to ensure correct RMSE estimation?

    (Attaching a snapshot of my dataset for reference.)

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long FitchID int year double NetInterestIncome float NetOperatingIncome double PretaxProfit float NetProfit double ROAA_w
    856 1994  6.812e+09  6.123e+09  2.115e+09  1.863e+09  .85
    856 1995  7.592e+09  7.242e+09  3.804e+09  2.356e+09 1.06
    856 1996  8.376e+09  8.148e+09  4.383e+09   2.75e+09 1.17
    856 1997  8.538e+09   7.70e+09  4.181e+09  2.634e+09 1.01
    856 1998  9.329e+09  7.517e+09  2.684e+09   1.70e+09  .56
    856 1999 1.0232e+10 9709000704  4.890e+09  3.079e+09  .98
    856 2000 1.0898e+10 1.2263e+10  7.755e+09  4.923e+09 1.32
    856 2001 1.3822e+10 1.3471e+10  8.083e+09   5.27e+09 1.17
    856 2002 2.0715e+10 2.1064e+10  9.682e+09  6.356e+09 1.31
    856 2003 2.0471e+10 2.1598e+10 1.1143e+10  7.919e+09 1.43
    856 2004 2.2416e+10 2.4038e+10 1.3103e+10  9.413e+09 1.39
    856 2005 2.1173e+10 2.4507e+10 1.2439e+10   8.83e+09 1.25
    856 2006 2.3896e+10  2.486e+10 1.2487e+10  9.338e+09  .94
    856 2007 3.0803e+10 2.8686e+10  4.070e+08  2.304e+09  .18
    856 2008 3.6235e+10 2.6373e+10 -9.529e+09 -2.101e+09 -.17
    856 2009 3.2182e+10 2.6253e+10 -7.951e+09 -2.794e+09 -.23
    856 2010 3.0282e+10 2.6624e+10  9.859e+09  7.904e+09  .65
    856 2011 4.3306e+10 3.1554e+10 1.3423e+10 1.0509e+10  .81
    856 2012 4.2283e+10 3.4775e+10 1.5251e+10 1.1845e+10  .88
    856 2013 4.0076e+10 3.2493e+10 2.0493e+10   1.43e+10 1.06
    end
    Thanks so much!

  • #2
    I don't think this would usually be regarded as correct. The divisor in the mean part of the RMSE calculation is customarily adjusted for the number of parameters estimated.

    Comment


    • #3
      Originally posted by Nick Cox View Post
      I don't think this would usually be regarded as correct. The divisor in the mean part of the RMSE calculation is customarily adjusted for the number of parameters estimated.
      Thanks Nick for your response. Is there a way to estimate this regression from t-5 to t, without using rangestat? That way, I can ask Stata to calculate residuals rather than calculating them manually. Is there an inbuilt stata command to calculate the RMSE?

      Comment


      • #4
        asreg can help here, see this example
        Code:
        ssc install asreg
        
        webuse grunfeld, clear
        
        * Rolling regression - reporting RMSE and residuals
        bys company: asreg invest mvalue kstock, w(year 10) rmse fit
        
        list _* in 1/20
        
             +----------------------------------------------------------------------------------------------------------+
             |     _rmse   _Nobs         _R2      _adjR2   _b_mvalue    _b_kstock      _b_cons     _fitted   _residuals |
             |----------------------------------------------------------------------------------------------------------|
          1. |         .       .           .           .           .            .            .           .            . |
          2. |         .       .           .           .           .            .            .           .            . |
          3. |         .       .           .           .           .            .            .           .            . |
          4. | 1.1514366       4   .99991081   .99973242   .05436847   -.21134289    150.31552   257.91024   -.21023193 |
          5. | 6.1627708       5   .99494036   .98988073   .05391537   -.23604919    153.03071    337.5661   -6.7661144 |
             |----------------------------------------------------------------------------------------------------------|
          6. | 51.338599       6   .70620365   .51033942   .06200931   -.06825747    113.98605   387.80812    73.391895 |
          7. | 70.062796       7   .57589074   .36383611   .06547084    .18151132     79.66855   423.96115    88.038848 |
          8. | 70.679176       8   .50027518   .30038525   .05170874    .36373662    116.78996   395.00509    52.994906 |
          9. | 70.447323       9   .50721217   .34294957   .05154518    .44281908    111.48473   437.38193    62.218072 |
         10. | 78.132249      10   .46011588   .30586328   .05693085    .46800615    96.771625   440.43891    107.06109 |
             |----------------------------------------------------------------------------------------------------------|
         11. | 72.460089      10    .5614129    .4361023   .08887698    .88729908   -126.96142    538.4174    22.782611 |
         12. | 64.101652      10   .78380282   .72203219   .08872388    1.2292866   -214.23699   715.00895   -26.908975 |
         13. | 74.498067      10   .71488429   .63342265   .12516278    .42876425   -160.50868   607.38183    -38.48181 |
         14. | 78.124366      10   .45098464   .29412311   .11227122    .29657166   -65.906213   573.06062    -43.86061 |
         15. | 52.773721      10   .52009389   .38297786    .0912647    .16681578     85.23875   593.10518   -38.005202 |
             |----------------------------------------------------------------------------------------------------------|
         16. | 41.145014      10   .72287646    .6436983   .10511862    .17118093    38.538037   621.44936    21.450662 |
         17. | 35.654427      10   .88167177    .8478637   .11069522    .16073767    27.814834   756.92772   -1.0276956 |
         18. | 40.827807      10   .91376003   .88912004   .13377585    .20041587   -91.965695   853.56185     37.63816 |
         19. | 60.185051      10   .95105686    .9370731    .1749349    .25069983   -299.34944   1238.1106    66.289422 |
         20. | 86.590202      10   .94840015   .93365733   .17671561    .34521899   -390.57569   1366.4618    120.23817 |
             +----------------------------------------------------------------------------------------------------------+
        More on asreg here https://fintechprofessor.com/2017/12...ions-in-stata/
        Regards
        --------------------------------------------------
        Attaullah Shah, PhD.
        Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
        FinTechProfessor.com
        https://asdocx.com
        Check out my asdoc program, which sends outputs to MS Word.
        For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

        Comment


        • #5
          Looking at this again, I don't follow what you're trying to do.

          Each data point is the subject of a regression for that data point (bank, year).

          rangestat produces for each observation point coefficient estimates which you are using.

          But you're producing first a kind of composite set of residuals across all the observations for each bank, which is different, as it is based on several overlapping regressions. So, I am now longer clear whether it makes sense to calculate an RMSE at all or what the divisor should be. Note that asreg stores the RMSE from each individual regression, which is not I think what you're trying to do in #1

          Comment


          • #6
            I think it's germane to add a few broad comments on the difference between rangestat and asreg. They have overlapping but on the whole distinct goals and styles.

            rangestat is deliberately a moderately general command that allows programmed additions. It has token functionality on several levels, but doesn't purport to be comprehensive statistically. Together with rangerun, the main feature is that you can write your own Stata or Mata code to do whatever is not supported directly. Also, all code is explicit, so that you have examples to be copied or varied.

            asreg ("as" means Attaullah Shah, the author) is focused on regression and has many more bells and whistles under that heading. But as I understand it it's not programmable otherwise, and its Mata code is compiled, so not visible to the user.

            Comment


            • #7
              asreg is incredibly helpful. I appreciate its simplicity, speed, and the range of options it offers for estimating regressions.

              Comment


              • #8
                Thanks everyone for your responses

                Comment


                • #9
                  Originally posted by Attaullah Shah View Post
                  asreg can help here, see this example
                  Code:
                  ssc install asreg
                  
                  webuse grunfeld, clear
                  
                  * Rolling regression - reporting RMSE and residuals
                  bys company: asreg invest mvalue kstock, w(year 10) rmse fit
                  
                  list _* in 1/20
                  
                  +----------------------------------------------------------------------------------------------------------+
                  | _rmse _Nobs _R2 _adjR2 _b_mvalue _b_kstock _b_cons _fitted _residuals |
                  |----------------------------------------------------------------------------------------------------------|
                  1. | . . . . . . . . . |
                  2. | . . . . . . . . . |
                  3. | . . . . . . . . . |
                  4. | 1.1514366 4 .99991081 .99973242 .05436847 -.21134289 150.31552 257.91024 -.21023193 |
                  5. | 6.1627708 5 .99494036 .98988073 .05391537 -.23604919 153.03071 337.5661 -6.7661144 |
                  |----------------------------------------------------------------------------------------------------------|
                  6. | 51.338599 6 .70620365 .51033942 .06200931 -.06825747 113.98605 387.80812 73.391895 |
                  7. | 70.062796 7 .57589074 .36383611 .06547084 .18151132 79.66855 423.96115 88.038848 |
                  8. | 70.679176 8 .50027518 .30038525 .05170874 .36373662 116.78996 395.00509 52.994906 |
                  9. | 70.447323 9 .50721217 .34294957 .05154518 .44281908 111.48473 437.38193 62.218072 |
                  10. | 78.132249 10 .46011588 .30586328 .05693085 .46800615 96.771625 440.43891 107.06109 |
                  |----------------------------------------------------------------------------------------------------------|
                  11. | 72.460089 10 .5614129 .4361023 .08887698 .88729908 -126.96142 538.4174 22.782611 |
                  12. | 64.101652 10 .78380282 .72203219 .08872388 1.2292866 -214.23699 715.00895 -26.908975 |
                  13. | 74.498067 10 .71488429 .63342265 .12516278 .42876425 -160.50868 607.38183 -38.48181 |
                  14. | 78.124366 10 .45098464 .29412311 .11227122 .29657166 -65.906213 573.06062 -43.86061 |
                  15. | 52.773721 10 .52009389 .38297786 .0912647 .16681578 85.23875 593.10518 -38.005202 |
                  |----------------------------------------------------------------------------------------------------------|
                  16. | 41.145014 10 .72287646 .6436983 .10511862 .17118093 38.538037 621.44936 21.450662 |
                  17. | 35.654427 10 .88167177 .8478637 .11069522 .16073767 27.814834 756.92772 -1.0276956 |
                  18. | 40.827807 10 .91376003 .88912004 .13377585 .20041587 -91.965695 853.56185 37.63816 |
                  19. | 60.185051 10 .95105686 .9370731 .1749349 .25069983 -299.34944 1238.1106 66.289422 |
                  20. | 86.590202 10 .94840015 .93365733 .17671561 .34521899 -390.57569 1366.4618 120.23817 |
                  +----------------------------------------------------------------------------------------------------------+
                  More on asreg here https://fintechprofessor.com/2017/12...ions-in-stata/
                  Thanks so much for your response. I used the following command to estimate RMSE for a rolling regression window of t-5 to t

                  bys FitchID: asreg NetInterestIncome L_NetInterestIncome L_Assets, robust w(year 6) rmse fit

                  For a bank with data from 1994 to 2020, I assumed RMSE values would start from 1999, since the first available window would be 1994–1999. However, the output shows RMSE values starting from 1998 instead.

                  Does asreg assign RMSE to the second-to-last year in the rolling window instead of the last year? If so, is there a way to adjust the RMSE assignment to match the last year of the rolling window?

                  Any clarification would be greatly appreciated!

                  Comment


                  • #10
                    As I understand, you just change your window() call. provided that you have asreg 4.0 or upwards. See the help.

                    Comment


                    • #11
                      For a bank with data from 1994 to 2020, I assumed RMSE values would start from 1999, since the first available window would be 1994–1999. However, the output shows RMSE values starting from 1998 instead.
                      Evidently asreg is willing to show you incomplete windows. You lose 1 observation at the start of each panel because lagged values are not defined earlier than observation 2, but you don't need 6 observations to estimate that model. Turn and turn about, regressions with 2 predictors and 6 observations are a real stretch statistically, and you've a perfect storm in which short-term variability which is of financial interest is compounded with statistical difficulty in capturing it well.

                      Comment

                      Working...
                      X