
  • #16
    No need to adjust the Mata code. rangestat first returns the number of observations (as myreg1), then one variable per independent variable. So if you have 2, rangestat will store their coefficients in myreg2 and myreg3. Finally, since a constant is added in the Mata code, its coefficient is returned last, in myreg4. Here's an updated example:

    Code:
    * test data; use variable names in the thread
    webuse grunfeld, clear
    rename company permno
    rename invest stockexcessret
    rename mvalue mktrf
    gen mktrf_lag1 = L.mktrf
    save "test_data.dta", replace
    
    * ------------ regressions over a window of 6 periods using -rangestat- --------
    * define a linear regression in Mata using quadcross() - help mata cross(), example 2
    mata:
    mata clear
    mata set matastrict on
    real rowvector myreg(real matrix Xall)
    {
        real colvector y, b, Xy
        real matrix X, XX
    
        y = Xall[.,1]                // dependent var is first column of Xall
        X = Xall[.,2::cols(Xall)]    // the remaining cols are the independent variables
        X = X,J(rows(X),1,1)         // add a constant
        
        XX = quadcross(X, X)        // linear regression, see help mata cross(), example 2
        Xy = quadcross(X, y)
        b  = invsym(XX) * Xy
        
        return(rows(X), b')
    }
    end
    
    * regressions with a constant over a rolling window of 6 periods by permno
    rangestat (myreg) stockexcessret mktrf mktrf_lag1, by(permno) interval(time -5 0) casewise
    
    * the Mata function returns first the number of observations and then as many
    * variables as there are independent variables (plus the constant) for the betas
    rename (myreg1 myreg2 myreg3 myreg4) (nobs rs_mktrf rs_mktrfl1 rs_cons)
    
    * reject results if the window is less than 6 or if the number of obs < 4
    isid permno year
    by permno: replace rs_mktrf = . if _n < 6 | nobs < 4
    by permno: replace rs_cons = . if _n < 6 | nobs < 4
    by permno: replace rs_mktrfl1 = . if _n < 6 | nobs < 4
    save "rangestat_results.dta", replace
    
    * ----------------- replicate using -rolling- ----------------------------------
    use "test_data.dta", clear
    levelsof permno, local(permno)
    foreach s of local permno {
        rolling _b, window(6) saving(betas_`s', replace) reject(e(N) < 4):  ///
           regress stockexcessret mktrf mktrf_lag1 if permno == `s'                
    }
    
    clear
    save "betas.dta", replace emptyok
    foreach s of local permno {
        append using "betas_`s'.dta"
    }
    rename end year
    merge 1:1 permno year using "rangestat_results.dta"
    isid permno year, sort
    
    gen diff_mktrf =  abs(_b_mktrf - float(rs_mktrf))
    gen diff_mktrfl1 =  abs(_b_mktrf_lag1 - float(rs_mktrfl1))
    gen diff_cons =  abs(_b_cons - float(rs_cons))
    summ diff*



    • #17
      Thank you very much indeed, Robert!

      When tailoring the code to my dataset, however, I get somewhat odd results. My task is very similar to the one mentioned in #1. More specifically, I'm trying to extract monthly betas from rolling regressions using daily data.

      I have a panel dataset of daily stock and market excess returns where permno is the panel variable (identifying the different stocks) and D is the time variable (indexing the trading dates). I created D from the original time variable date to account for the fact that not all calendar days are trading days.
      Code:
      egen D = group(date)
      tsset permno D
      Now I would like to run rolling regressions of excess stock returns (ret_rf) on excess market returns (CRSP_MKT_RF) and five lagged excess market returns (CRSP_MKT_RF_1 through CRSP_MKT_RF_5) to collect betas for each stock (permno).
      Specifically, I want to use overlapping rolling windows of one year, i.e. 250 trading days. Furthermore, if there are fewer than 200 daily observations in a window, the betas should be set to missing.

      Code:
      * define a linear regression in Mata using quadcross()
      mata:
      mata clear
      mata set matastrict on
      real rowvector myreg(real matrix Xall)
      {
          real colvector y, b, Xy
          real matrix X, XX
      
          y = Xall[.,1]              
          X = Xall[.,2::cols(Xall)]  
          X = X,J(rows(X),1,1)        
          
          XX = quadcross(X, X)      
          Xy = quadcross(X, y)
          b  = invsym(XX) * Xy
          
          return(rows(X), b')
      }
      end
      
      * regressions with a constant over a rolling window of 12 months, i.e. 250 trading days by permno
      rangestat (myreg) ret_rf CRSP_MKT_RF CRSP_MKT_RF_1 CRSP_MKT_RF_2 CRSP_MKT_RF_3 CRSP_MKT_RF_4 CRSP_MKT_RF_5, by(permno) interval(D -249 0)
      
      rename (myreg1 myreg2 myreg3 myreg4 myreg5 myreg6 myreg7 myreg8) (nobs b_CRSP_MKT_RF b_CRSP_MKT_RF_1 b_CRSP_MKT_RF_2 b_CRSP_MKT_RF_3 b_CRSP_MKT_RF_4 b_CRSP_MKT_RF_5 b_cons)
      
      // Reject results if the number of obs < 200 days
      isid permno date
      by permno: replace b_CRSP_MKT_RF = .     if nobs < 200
      by permno: replace b_CRSP_MKT_RF_1 = .     if nobs < 200
      by permno: replace b_CRSP_MKT_RF_2 = .     if nobs < 200
      by permno: replace b_CRSP_MKT_RF_3 = .     if nobs < 200
      by permno: replace b_CRSP_MKT_RF_4 = .     if nobs < 200
      by permno: replace b_CRSP_MKT_RF_5 = .    if nobs < 200
      by permno: replace b_cons = .             if nobs < 200
      
      // Drop missing beta values
      drop if b_CRSP_MKT_RF == . | b_CRSP_MKT_RF_1 == . | b_CRSP_MKT_RF_2 == . | b_CRSP_MKT_RF_3 == . |  b_CRSP_MKT_RF_4 == . |  b_CRSP_MKT_RF_5 == .
      The actual beta variable of interest is then calculated as the sum of the regression coefficients. Finally, daily betas (beta) are averaged monthly to obtain monthly betas (mbeta).
      Code:
      // Compute ex-ante betas as the sum of slopes
      gen beta =     b_CRSP_MKT_RF + b_CRSP_MKT_RF_1 + b_CRSP_MKT_RF_2 + b_CRSP_MKT_RF_3 + b_CRSP_MKT_RF_4 + b_CRSP_MKT_RF_5
      
      // Create new monthly date variable
      gen int mdate = mofd(date)
      format mdate %tm
      
      // Use daily betas to compute monthly time-t betas for each stock
      quietly bysort permno mdate: egen mbeta = mean(beta)
      duplicates drop permno mdate mbeta, force // I only need one monthly beta per stock (permno)
      tsset permno mdate
      The obtained monthly beta estimates (mbeta) vary strongly from month to month for each stock. This, however, is somewhat implausible given that I use overlapping rolling windows. Betas should vary rather smoothly from month to month.

      Does anybody spot an obvious mistake in the above-mentioned code?
      Thank you very much in advance.

      Best,
      Christopher



      • #18
        The neat thing with rangestat is that results are computed for each observation. You can manually check the results for any specific observation by simply specifying the desired calculation using the corresponding Stata command. For example, the following shows the results for observations 30 and 31, as computed by rangestat, and then how to replicate the results:

        Code:
        * test data; use variable names in the thread
        webuse grunfeld, clear
        rename company permno
        rename invest stockexcessret
        rename mvalue mktrf
        gen mktrf_lag1 = L.mktrf
        
        * ------------ regressions over a window of 6 periods using -rangestat- --------
        * define a linear regression in Mata using quadcross() - help mata cross(), example 2
        mata:
        mata clear
        mata set matastrict on
        real rowvector myreg(real matrix Xall)
        {
            real colvector y, b, Xy
            real matrix X, XX
        
            y = Xall[.,1]                // dependent var is first column of Xall
            X = Xall[.,2::cols(Xall)]    // the remaining cols are the independent variables
            X = X,J(rows(X),1,1)         // add a constant
            
            XX = quadcross(X, X)        // linear regression, see help mata cross(), example 2
            Xy = quadcross(X, y)
            b  = invsym(XX) * Xy
            
            return(rows(X), b')
        }
        end
        
        * regressions with a constant over a rolling window of 6 periods by permno
        rangestat (myreg) stockexcessret mktrf mktrf_lag1, by(permno) interval(time -5 0) casewise
        
        * replicate results for observations 30 and 31
        list in 30/31
        regress stockexcessret mktrf mktrf_lag1 if permno == permno[30] & inrange(time, time[30]-5, time[30])
        regress stockexcessret mktrf mktrf_lag1 if permno == permno[31] & inrange(time, time[31]-5, time[31])



        • #19
          Thanks again, Robert! The mistake was in the input data, not in the code.

          I'm wondering, however, whether I can tell -rangestat- to run regressions using heteroskedasticity-and-autocorrelation-consistent standard errors such as Newey and West standard errors.



          • #20
            Naturally. You just need to write your own function and call it, thinking about where you put those standard errors.
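
            A minimal sketch of what such a function could look like, reusing the myreg() layout from #16. The function name myreg_nw and the fixed truncation lag of 2 are assumptions chosen for illustration, not something from this thread. The sketch also assumes that the rows of each window reach Mata in time order and without gaps, so that adjacent rows are adjacent periods; that is worth checking on your own data. The standard errors are simply appended after the coefficients:

            Code:
            mata:
            mata set matastrict on
            real rowvector myreg_nw(real matrix Xall)
            {
                real colvector y, b, e
                real matrix X, XXinv, S, G, V
                real scalar n, k, L, l, w

                L = 2                                        // assumed truncation lag (illustrative)
                y = Xall[.,1]                                // dependent var is first column of Xall
                X = Xall[.,2::cols(Xall)], J(rows(Xall),1,1) // regressors plus a constant
                n = rows(X)
                k = cols(X)

                // too few rows for the lags and coefficients: return all missing
                if (n <= k + L) return(J(1, 1 + 2*k, .))

                XXinv = invsym(quadcross(X, X))
                b = XXinv * quadcross(X, y)
                e = y - X*b                                  // residuals

                S = quadcross(X, e:^2, X)                    // lag 0: sum of e_t^2 * x_t'x_t
                for (l=1; l<=L; l++) {
                    w = 1 - l/(L+1)                          // Bartlett kernel weight
                    G = quadcross(X[(l+1)::n,.], e[(l+1)::n] :* e[1::(n-l)], X[1::(n-l),.])
                    S = S + w*(G + G')
                }
                V = (n/(n-k)) * XXinv * S * XXinv            // small-sample factor as used by -newey-

                return(n, b', sqrt(diagonal(V))')
            }
            end

            * myreg_nw1 = nobs, myreg_nw2-myreg_nw4 = betas (constant last), myreg_nw5-myreg_nw7 = their standard errors
            rangestat (myreg_nw) stockexcessret mktrf mktrf_lag1, by(permno) interval(time -5 0) casewise

            * spot-check one window against -newey-, in the spirit of #18
            list in 30
            newey stockexcessret mktrf mktrf_lag1 if permno == permno[30] & inrange(time, time[30]-5, time[30]), lag(2)
            The variance formula is meant to follow the one documented for -newey-, so the standard errors for that window should agree up to rounding; if they do not, treat the sketch as suspect rather than the official command.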



            • #21
              Thank you, Clyde Schechter, you have helped me a lot!



              • #22
                Dear Mr Schechter, I have seen both your scripts in #2

                Originally posted by Clyde Schechter
                I don't use -rolling- myself, but most of your questions can be answered from general Stata principles. I assume each observation has a numeric variable identifying the stock and that you have a date variable (which, I infer from your description is monthly). I think you want something like this:

                Code:
                tsset stock date
                levelsof stock, local(stocks)
                foreach s of local stocks {
                rolling _b, window(60) saving(betas_`s', replace) reject(e(N) < 30): ///
                regress stockexcessreturn marketexcessreturn if stock == `s'
                }
                If you need to then put all of the results for the different stocks together in a single file, you will need to once again loop over local stocks and successively append.
                and #13, and I would like to pick your brain about an alternative.
                I've tried to put them to use in order to annualise my beta. In other words, I am trying to create an annual beta using monthly data (Jan 2002 to Dec 2011), with the output calculated per year and firm.

                My coding skills are quite limited, and when I try something like the following for the regression, it doesn't seem to work.

                Code:
                levelsof permno, local(permno)
                foreach s of local permno {
                    rolling _b, window(XX) saving(betas_`s', replace) reject(e(N) < 12):  ///
                       by year permno, sort: regress stockexcessret mktrf if permno == `s'                
                }
                Any thoughts? (I also cannot decide on the XX figure; I use 120.)
                Last edited by Michalis Samarinas; 14 Oct 2017, 17:08.



                • #23
                  it doesn't seem to work.
                  That's not helpful. I can't read your mind to figure out which of the many things that might have gone wrong actually happened. And it is unlikely anyone can help you without a sample of your data to work with, because much depends on exactly how your date variables are represented.

                  If you post back with more information, it is likely you will get a helpful, timely response. At a minimum show an example of your data using -dataex-, the exact code you ran (unless that's it in #22) and what you got back from Stata. (And if it's not obvious, explain how what you got back differs from what you wanted.)

                  Also, as this is not a finance forum, and many of us, including me, are unfamiliar with financial jargon, explain what you mean by an "annualized" beta.



                  • #24
                    You are absolutely right. My apologies for that Professor.
                    Let me try to explain my problem a little bit better.
                    I have a dataset that looks like this:
                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input long gvkey float(date fyear return rf rmrf)
                    1045 14610 2000   -.21918684 .41  -4.74
                    1045 14641 2000  -.017575145 .43   2.45
                    1045 14670 2000   -.50610864 .47    5.2
                    1045 14701 2000    .06637507 .46   -6.4
                    1045 14731 2000     -.178293  .5  -4.42
                    1045 14762 2000   -.07512063  .4   4.64
                    1045 14792 2000    .22361626 .48  -2.51
                    1045 14823 2000  -.007590169  .5   7.03
                    1045 14854 2000 -.0038167986 .51  -5.45
                    1045 14884 2000  .0019102202 .56  -2.76
                    1045 14915 2000   .020775063 .51 -10.72
                    1045 14945 2000     .1586798  .5   1.19
                    1045 14976 2001  -.002491135 .54   3.13
                    1045 15007 2001    -.1618119 .38 -10.05
                    1045 15035 2001    .05471597 .42  -7.26
                    1045 15066 2001      .081706 .39   7.94
                    1045 15096 2001   .022828516 .32    .72
                    1045 15127 2001   -.07618167 .28  -1.94
                    1045 15157 2001  -.027498914  .3  -2.13
                    1045 15188 2001   -.09420131 .31  -6.46
                    1045 15219 2001   -.51364297 .28  -9.25
                    1045 15249 2001   -.05035872 .22   2.46
                    1045 15280 2001     .1600984 .17   7.54
                    1045 15310 2001     .0430666 .15   1.61
                    1050 14610 2000   -.10536052 .41  -4.74
                    1050 14641 2000     .3285041 .43   2.45
                    1050 14670 2000   -.22314355 .47    5.2
                    1050 14701 2000    .04879016 .46   -6.4
                    1050 14731 2000   -.18232156  .5  -4.42
                    1050 14762 2000   .014184635  .4   4.64
                    1050 14792 2000  -.014184635 .48  -2.51
                    1050 14823 2000   .028170876  .5   7.03
                    1050 14854 2000   -.05715841 .51  -5.45
                    1050 14884 2000     -.194156 .56  -2.76
                    1050 14915 2000  -.074107975 .51 -10.72
                    1050 14945 2000    -.1670541  .5   1.19
                    1050 14976 2001     .3429447 .54   3.13
                    1050 15007 2001  -.032789823 .38 -10.05
                    1050 15035 2001   -.06899287 .42  -7.26
                    1050 15066 2001    .13353139 .39   7.94
                    1050 15096 2001    .04879012 .32    .72
                    1050 15127 2001   -.04879012 .28  -1.94
                    1050 15157 2001            0  .3  -2.13
                    1050 15188 2001    .04879012 .31  -6.46
                    1050 15219 2001    .05556991 .28  -9.25
                    1050 15249 2001     .6375773 .22   2.46
                    1050 15280 2001    -.1766235 .17   7.54
                    1050 15310 2001   -.06453853 .15   1.61
                    end
                    format %d date
                    format %ty fyear
                    My dataset (actually a small sample of it), as you can see, is a panel dataset where gvkey is the firm identifier and date is the time variable, with a monthly structure. The dataset also includes the variable fyear, which gives the fiscal year.
                    The initial query in #1 was to calculate the betas for each month using monthly observations. I also have monthly observations, but I want to calculate the betas per firm and year. So my goal is to run a series of regressions per firm and year and get one beta for each firm per year, not per month.

                    I tried to adjust your code a little and used the following, rather unsuccessfully (I didn't even get to the merging part of the code).
                    Code:
                    xtset gvkey date, monthly
                    levelsof gvkey, local(gvkey)
                    foreach s of local gvkey {
                        rolling _b, window(12) saving(betas_`s', replace) reject(e(N) < 10):  ///
                           by year gvkey, sort : regress return rf rmrf if gvkey == `s'                
                    }
                    
                    preserve // KEEP YOUR ORIGINAL DATA SET ON STANDBY
                    
                    // APPEND ALL THE BETAS FILES INTO ONE
                    levelsof gvkey, local(gvkey)
                    clear
                    tempfile building
                    save `building', emptyok
                    
                    foreach p of local gvkey {
                        append using betas_`p'
                        save `"`building'"', replace
                    }
                    // IF YOU WANT, YOU CAN ALSO SAVE THIS AS A PERMANENT FILE
                    
                    // NOW BRING BACK THE ORIGINAL DATA
                    restore
                    
                    // AND MERGE IN THE BETAS
                    gen end = month
                    merge 1:1 gvkey end using `building'
                    The result was the following message
                    Code:
                     foreach s of local gvkey {
                      2.     rolling _b, window(12) saving(betas_`s', replace) reject(e(N) < 10):  ///
                    >        by year gvkey, sort : regress return rf rmrf if gvkey == `s'                
                      3. }
                    by may not be used in this context
                    r(199);
                    I believe that use of xtset is problematic along with the use of the window and the by option. However I cannot wrap my head around this and figure out an alternative.
                    I would really appreciate any help on this.

                    Thank you in advance and my apologies for any mistakes as well as the size of the post.



                    • #25
                      The main obstacle you are facing is that although the natural unit of time in your data is the month, your dates are Stata daily dates. That makes it impossible to designate 12 months for the rolling interval because 12 months does not consistently correspond to any specific number of days. So the first step you must take to resolve this problem is to extract monthly dates.

                      Your attempt to revise the syntax failed because you cannot use the -by:- prefix with the -rolling:- prefix in the same command. It is, in any case, unnecessary. If you want to use -rolling- to solve this, the -(gvkey year)- goes as an option between -rolling- and the colon ( : ).
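
                      A sketch of the -rolling- route without the -by:- prefix, assuming the data are first declared as a panel on a monthly date. The variable name mdate and the file name betas_all are placeholders of mine. As far as I know, once the data are -xtset-, -rolling- moves the window within each panel on its own, so neither a loop over firms nor a by-clause is needed:

                      Code:
                      * monthly date built from the daily date, then declare the panel
                      gen mdate = mofd(date)
                      format mdate %tm
                      xtset gvkey mdate

                      * 12-month windows within each gvkey; windows with fewer than 10 obs are rejected
                      rolling _b, window(12) saving(betas_all, replace) reject(e(N) < 10): ///
                          regress return rf rmrf
                      The saved file should then hold one row per firm and window, with start and end marking the window limits.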

                      That said, a long time has passed since the start of this thread, and -rolling- has been effectively superseded by Robert Picard, Nick Cox, & Roberto Ferrer's -rangestat- command for this. (You must run -ssc install rangestat- to get this command, if you haven't already.)

                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input long gvkey float(date fyear return rf rmrf)
                      1045 14610 2000   -.21918684 .41  -4.74
                      1045 14641 2000  -.017575145 .43   2.45
                      1045 14670 2000   -.50610864 .47    5.2
                      1045 14701 2000    .06637507 .46   -6.4
                      1045 14731 2000     -.178293  .5  -4.42
                      1045 14762 2000   -.07512063  .4   4.64
                      1045 14792 2000    .22361626 .48  -2.51
                      1045 14823 2000  -.007590169  .5   7.03
                      1045 14854 2000 -.0038167986 .51  -5.45
                      1045 14884 2000  .0019102202 .56  -2.76
                      1045 14915 2000   .020775063 .51 -10.72
                      1045 14945 2000     .1586798  .5   1.19
                      1045 14976 2001  -.002491135 .54   3.13
                      1045 15007 2001    -.1618119 .38 -10.05
                      1045 15035 2001    .05471597 .42  -7.26
                      1045 15066 2001      .081706 .39   7.94
                      1045 15096 2001   .022828516 .32    .72
                      1045 15127 2001   -.07618167 .28  -1.94
                      1045 15157 2001  -.027498914  .3  -2.13
                      1045 15188 2001   -.09420131 .31  -6.46
                      1045 15219 2001   -.51364297 .28  -9.25
                      1045 15249 2001   -.05035872 .22   2.46
                      1045 15280 2001     .1600984 .17   7.54
                      1045 15310 2001     .0430666 .15   1.61
                      1050 14610 2000   -.10536052 .41  -4.74
                      1050 14641 2000     .3285041 .43   2.45
                      1050 14670 2000   -.22314355 .47    5.2
                      1050 14701 2000    .04879016 .46   -6.4
                      1050 14731 2000   -.18232156  .5  -4.42
                      1050 14762 2000   .014184635  .4   4.64
                      1050 14792 2000  -.014184635 .48  -2.51
                      1050 14823 2000   .028170876  .5   7.03
                      1050 14854 2000   -.05715841 .51  -5.45
                      1050 14884 2000     -.194156 .56  -2.76
                      1050 14915 2000  -.074107975 .51 -10.72
                      1050 14945 2000    -.1670541  .5   1.19
                      1050 14976 2001     .3429447 .54   3.13
                      1050 15007 2001  -.032789823 .38 -10.05
                      1050 15035 2001   -.06899287 .42  -7.26
                      1050 15066 2001    .13353139 .39   7.94
                      1050 15096 2001    .04879012 .32    .72
                      1050 15127 2001   -.04879012 .28  -1.94
                      1050 15157 2001            0  .3  -2.13
                      1050 15188 2001    .04879012 .31  -6.46
                      1050 15219 2001    .05556991 .28  -9.25
                      1050 15249 2001     .6375773 .22   2.46
                      1050 15280 2001    -.1766235 .17   7.54
                      1050 15310 2001   -.06453853 .15   1.61
                      end
                      format %d date
                      format %ty fyear
                      
                      gen monthly_date = mofd(date)
                      format monthly_date %tm
                      
                      rangestat (reg) return rf rmrf, by(gvkey) interval(monthly_date -12 -1)
                      Note: I have interpreted the 12 month window to mean from 12 months preceding the current month through the month immediately preceding. If your intention is rather to go from 11 months preceding the current month through (including) the current month, change -12 -1 to -11 0.

                      This will give you rolling 12-month window regression for every observation. You also seem to want to somehow do this by year, but I can't make any sense out of that. Since a year only has 12 months, there is no way to roll a 12 month window through it. You can do a separate regression for each fiscal year if you like. That would be done like this:

                      Code:
                      capture program drop yearly_beta
                      program define yearly_beta
                          regress return rf rmrf
                          foreach v of varlist rf rmrf {
                              gen fy_b_`v' = _b[`v']
                              gen fy_se_`v' = _se[`v']
                          }
                          gen fy_reg_nobs = e(N)
                          gen fy_reg_r2 = e(r2)
                          gen fy_reg_adj_r2 = e(r2_a)
                          exit
                      end
                      
                      runby yearly_beta, by(gvkey fyear) verbose
                      -runby- is written by Robert Picard and me, and is also available from SSC. Note that if you want to use -runby-, it requires that you already have -rangestat- installed, so get both. Note also that you may want to omit the -verbose- option when you run this on the full data set, to avoid generating reams of unwanted regression output.

                      Note that I have used somewhat cumbersome names for the result variables in this code. That is because these names are under your control, whereas those created by -rangestat- are not, and in case you want to do both, it is necessary to avoid a name clash between their results. If you are only doing the gvkey fyear regressions, you can simplify all of those names by eliminating the fy_ prefixes.

                      I think these two approaches will resolve your problems. Do read -help datetime- and the linked section of the PDF Documentation for Stata. If you are going to work with panel data, it is essential to be conversant with the various internal date representations and be able to convert from one to another, and recognize which one is the appropriate one for your particular problem. It's a long read, and you won't remember it all, but the ones that you use most often will stick in your mind, and the others that you use less frequently you will at least know exist. Then you can refer back to the help file for the details of those.
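
                      To give a flavor of those conversions, here is a small sketch using the daily variable date from the example above. The new variable names are hypothetical and chosen only for illustration:

                      Code:
                      * daily date -> coarser units
                      gen mdate_demo = mofd(date)       // monthly date
                      format mdate_demo %tm
                      gen qdate_demo = qofd(date)       // quarterly date
                      format qdate_demo %tq
                      gen year_demo = yofd(date)        // calendar year as a plain number

                      * monthly date -> daily date of the first day of that month
                      gen first_day_demo = dofm(mdate_demo)
                      format first_day_demo %td
                      Each representation counts its own units from 1960, which is why a window of 12 means 12 days on a daily date but 12 months on a monthly one.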
                      Last edited by Clyde Schechter; 15 Oct 2017, 16:58.



                      • #26
                        Small correction to #25: you can use -runby- without having -rangestat- installed. I was thinking of -rangerun-, a different program that does require it.



                        • #27
                          Thank you, Professor, for your valuable input.
                          The second approach describes my goal best, and it works.
                          When I run the following:
                          Code:
                          xtset gvkey monyear2, monthly
                          capture program drop yearly_beta
                          program define yearly_beta
                              regress exret rmrf
                              foreach v of varlist rf rmrf {
                                  gen fy_b_`v' = _b[`v']
                                  gen fy_se_`v' = _se[`v']
                              }
                              gen fy_reg_nobs = e(N)
                              gen fy_reg_r2 = e(r2)
                              gen fy_reg_adj_r2 = e(r2_a)
                              exit
                          end
                          
                          runby yearly_beta, by(gvkey fyear) verbose
                          I can see the results on my screen, table after table, so I am certain it works. However, I encounter two problems.
                          First, after the code completes, no observations are left: no betas are saved or appended, and the Data Editor is now completely empty.
                          Second, I cannot figure out how to adjust the code to run (or keep) only the regressions that have 12 observations.

                          Thank you very much in advance.



                          • #28
                            Well, first of all, you have introduced some new variable monyear2 that was not in your original data and tried to -xtset- the data with it. Let me point out that to do this with -runby-, there is no need to -xtset- the data at all, least of all with a non-existent variable.

                            Removing that, there is another problem, and it is more severe: you are also regressing a non-existent variable exret inside program yearly_beta. So none of the regressions actually run, and, so, there is nothing left in your data at the end:

                            Code:
                            . * Example generated by -dataex-. To install: ssc install dataex
                            . clear
                            
                            . input long gvkey float(date fyear return rf rmrf)
                            
                                        gvkey       date      fyear     return         rf       rmrf
                              1. 1045 14610 2000   -.21918684 .41  -4.74
                              2. 1045 14641 2000  -.017575145 .43   2.45
                              3. 1045 14670 2000   -.50610864 .47    5.2
                              4. 1045 14701 2000    .06637507 .46   -6.4
                              5. 1045 14731 2000     -.178293  .5  -4.42
                              6. 1045 14762 2000   -.07512063  .4   4.64
                              7. 1045 14792 2000    .22361626 .48  -2.51
                              8. 1045 14823 2000  -.007590169  .5   7.03
                              9. 1045 14854 2000 -.0038167986 .51  -5.45
                             10. 1045 14884 2000  .0019102202 .56  -2.76
                             11. 1045 14915 2000   .020775063 .51 -10.72
                             12. 1045 14945 2000     .1586798  .5   1.19
                             13. 1045 14976 2001  -.002491135 .54   3.13
                             14. 1045 15007 2001    -.1618119 .38 -10.05
                             15. 1045 15035 2001    .05471597 .42  -7.26
                             16. 1045 15066 2001      .081706 .39   7.94
                             17. 1045 15096 2001   .022828516 .32    .72
                             18. 1045 15127 2001   -.07618167 .28  -1.94
                             19. 1045 15157 2001  -.027498914  .3  -2.13
                             20. 1045 15188 2001   -.09420131 .31  -6.46
                             21. 1045 15219 2001   -.51364297 .28  -9.25
                             22. 1045 15249 2001   -.05035872 .22   2.46
                             23. 1045 15280 2001     .1600984 .17   7.54
                             24. 1045 15310 2001     .0430666 .15   1.61
                             25. 1050 14610 2000   -.10536052 .41  -4.74
                             26. 1050 14641 2000     .3285041 .43   2.45
                             27. 1050 14670 2000   -.22314355 .47    5.2
                             28. 1050 14701 2000    .04879016 .46   -6.4
                             29. 1050 14731 2000   -.18232156  .5  -4.42
                             30. 1050 14762 2000   .014184635  .4   4.64
                             31. 1050 14792 2000  -.014184635 .48  -2.51
                             32. 1050 14823 2000   .028170876  .5   7.03
                             33. 1050 14854 2000   -.05715841 .51  -5.45
                             34. 1050 14884 2000     -.194156 .56  -2.76
                             35. 1050 14915 2000  -.074107975 .51 -10.72
                             36. 1050 14945 2000    -.1670541  .5   1.19
                             37. 1050 14976 2001     .3429447 .54   3.13
                             38. 1050 15007 2001  -.032789823 .38 -10.05
                             39. 1050 15035 2001   -.06899287 .42  -7.26
                             40. 1050 15066 2001    .13353139 .39   7.94
                             41. 1050 15096 2001    .04879012 .32    .72
                             42. 1050 15127 2001   -.04879012 .28  -1.94
                             43. 1050 15157 2001            0  .3  -2.13
                             44. 1050 15188 2001    .04879012 .31  -6.46
                             45. 1050 15219 2001    .05556991 .28  -9.25
                             46. 1050 15249 2001     .6375773 .22   2.46
                             47. 1050 15280 2001    -.1766235 .17   7.54
                             48. 1050 15310 2001   -.06453853 .15   1.61
                             49. end
                            
                            . format %d date
                            
                            . format %ty fyear
                            
                            . 
                            . capture program drop yearly_beta
                            
                            . program define yearly_beta
                              1.     regress exret rmrf
                              2.     foreach v of varlist rf rmrf {
                              3.         gen fy_b_`v' = _b[`v']
                              4.         gen fy_se_`v' = _se[`v']
                              5.     }
                              6.     gen fy_reg_nobs = e(N)
                              7.     gen fy_reg_r2 = e(r2)
                              8.     gen fy_reg_adj_r2 = e(r2_a)
                              9.     exit
                             10. end
                            
                            . 
                            . runby yearly_beta, by(gvkey fyear) verbose
                            variable exret not found
                            variable exret not found
                            variable exret not found
                            variable exret not found
                            
                            --------------------------------------
                            Number of by-groups    =             4
                            by-groups with errors  =             4
                            by-groups with no data =             0
                            Observations processed =            48
                            Observations saved     =             0
                            --------------------------------------
                            The variables you use inside the program yearly_beta have to be actual variables in your data set; there is no variable named exret in this example data, which is why every by-group fails. If we change exret back to rf, the way it was in #25, there is still a problem, and this one is my error: it isn't possible to calculate _b[rf] and _se[rf], because rf is the dependent variable. So the code should be:

                            Code:
                            . capture program drop yearly_beta
                            
                            . program define yearly_beta
                              1.     regress rf rmrf
                              2.     gen fy_b_rmrf = _b[rmrf]
                              3.     gen fy_se_rmrf = _se[rmrf]
                              4.     gen fy_reg_nobs = e(N)
                              5.     gen fy_reg_r2 = e(r2)
                              6.     gen fy_reg_adj_r2 = e(r2_a)
                              7.     exit
                              8. end
                            
                            . 
                            . runby yearly_beta, by(gvkey fyear) verbose
                            
                                  Source |       SS           df       MS      Number of obs   =        12
                            -------------+----------------------------------   F(1, 10)        =      0.81
                                   Model |  .001767413         1  .001767413   Prob > F        =    0.3897
                                Residual |  .021857585        10  .002185758   R-squared       =    0.0748
                            -------------+----------------------------------   Adj R-squared   =   -0.0177
                                   Total |  .023624998        11  .002147727   Root MSE        =    .04675
                            
                            ------------------------------------------------------------------------------
                                      rf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                    rmrf |  -.0023347   .0025963    -0.90   0.390    -.0081197    .0034503
                                   _cons |   .4742918   .0139598    33.98   0.000     .4431874    .5053961
                            ------------------------------------------------------------------------------
                            
                                  Source |       SS           df       MS      Number of obs   =        12
                            -------------+----------------------------------   F(1, 10)        =      0.31
                                   Model |  .003920557         1  .003920557   Prob > F        =    0.5915
                                Residual |   .12754611        10  .012754611   R-squared       =    0.0298
                            -------------+----------------------------------   Adj R-squared   =   -0.0672
                                   Total |  .131466668        11  .011951515   Root MSE        =    .11294
                            
                            ------------------------------------------------------------------------------
                                      rf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                    rmrf |  -.0030828   .0055603    -0.55   0.591    -.0154719    .0093064
                                   _cons |   .3098164   .0332133     9.33   0.000     .2358126    .3838202
                            ------------------------------------------------------------------------------
                            
                                  Source |       SS           df       MS      Number of obs   =        12
                            -------------+----------------------------------   F(1, 10)        =      0.81
                                   Model |  .001767413         1  .001767413   Prob > F        =    0.3897
                                Residual |  .021857585        10  .002185758   R-squared       =    0.0748
                            -------------+----------------------------------   Adj R-squared   =   -0.0177
                                   Total |  .023624998        11  .002147727   Root MSE        =    .04675
                            
                            ------------------------------------------------------------------------------
                                      rf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                    rmrf |  -.0023347   .0025963    -0.90   0.390    -.0081197    .0034503
                                   _cons |   .4742918   .0139598    33.98   0.000     .4431874    .5053961
                            ------------------------------------------------------------------------------
                            
                                  Source |       SS           df       MS      Number of obs   =        12
                            -------------+----------------------------------   F(1, 10)        =      0.31
                                   Model |  .003920557         1  .003920557   Prob > F        =    0.5915
                                Residual |   .12754611        10  .012754611   R-squared       =    0.0298
                            -------------+----------------------------------   Adj R-squared   =   -0.0672
                                   Total |  .131466668        11  .011951515   Root MSE        =    .11294
                            
                            ------------------------------------------------------------------------------
                                      rf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                    rmrf |  -.0030828   .0055603    -0.55   0.591    -.0154719    .0093064
                                   _cons |   .3098164   .0332133     9.33   0.000     .2358126    .3838202
                            ------------------------------------------------------------------------------
                            
                            --------------------------------------
                            Number of by-groups    =             4
                            by-groups with errors  =             0
                            by-groups with no data =             0
                            Observations processed =            48
                            Observations saved     =            48
                            --------------------------------------
                            . list in 1/5, noobs clean
                            
                                gvkey        date   fyear      return    rf    rmrf   fy_b_rmrf   fy_se_~f   fy_reg~s   fy_~g_r2   fy_r~j_r2  
                                 1045   01jan2000    2000   -.2191868   .41   -4.74   -.0023347   .0025963         12   .0748111   -.0177078  
                                 1045   01feb2000    2000   -.0175751   .43    2.45   -.0023347   .0025963         12   .0748111   -.0177078  
                                 1045   01mar2000    2000   -.5061086   .47     5.2   -.0023347   .0025963         12   .0748111   -.0177078  
                                 1045   01apr2000    2000    .0663751   .46    -6.4   -.0023347   .0025963         12   .0748111   -.0177078  
                                 1045   01may2000    2000    -.178293    .5   -4.42   -.0023347   .0025963         12   .0748111   -.0177078  
                            
                            . list in -5/l, noobs clean
                            
                                gvkey        date   fyear      return    rf    rmrf   fy_b_rmrf   fy_se_~f   fy_reg~s   fy_~g_r2   fy_r~j_r2  
                                 1050   01aug2001    2001    .0487901   .31   -6.46   -.0030828   .0055603         12   .0298217   -.0671961  
                                 1050   01sep2001    2001    .0555699   .28   -9.25   -.0030828   .0055603         12   .0298217   -.0671961  
                                 1050   01oct2001    2001    .6375773   .22    2.46   -.0030828   .0055603         12   .0298217   -.0671961  
                                 1050   01nov2001    2001   -.1766235   .17    7.54   -.0030828   .0055603         12   .0298217   -.0671961  
                                 1050   01dec2001    2001   -.0645385   .15    1.61   -.0030828   .0055603         12   .0298217   -.0671961
                            I take it you are playing around with different data sets. When you change the variable names in the data, you have to change them correspondingly in yearly_beta as well.

                            By the way, if you have a lot of different gvkey/fyear combinations, you probably don't want all the regressions flying across your screen. So remove the -verbose- option from the -runby- command and you will just get the progress report and results summary, with the results you want left in the data set.

                            To restrict your results to those gvkey/fyear combinations where there are at least 10 observations:

                            Code:
                            foreach v of varlist fy_b_rmrf fy_se_rmrf fy_*r2 {
                                replace `v' = . if fy_reg_nobs < 10
                            }

                            This last is just basic data management in Stata. If you are not familiar with the use of -foreach-, -replace-, and -if-, then you need to step back from your project and take the time to learn the fundamentals of using Stata. Click on PDF Documentation in Stata's Help menu. Then follow the links for the Getting Started [GS] and User's Guide [U] volumes. Read them in their entirety. It's a lot of material, and you won't retain everything. But you will be exposed to the overall approach that Stata uses for data management and analysis, and the routine commands that are involved in using Stata. From there you will generally be able to figure out which commands are likely to play a role in solving your problems, and you can refer back to the help files for details that you don't remember.


                            • #29
                              Thank you for all your advice, Professor. I will go through those volumes from the start.
                              Regarding the code, I used verbose to check whether it works or not. Your newer version is correct, but it still saves (at least in my original dataset) only the first year for the first company.
                              Probably there is something that I am not doing right.

                              Because of that, I went in a different direction to get results. Given that I use exret as my dependent variable,
                              I generated a variable that counts the observations in each group, so that I can restrict the results to regressions with at least 11 obs.
                              Code:
                              *create a variable measuring obs
                              egen n_obs = count(exret), by (gvkey fyear)
                              Then I used the statsby command to obtain the betas. I believe that it probably works:
                              Code:
                              statsby _b, basepop(n_obs>10) by(gvkey fyear) ///
                              saving("C:\beta results.dta", replace) : reg exret rmrf
                              Thank you very much for all your valuable input and advice.
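
A note on the -statsby- call above: as far as I know, basepop() only sets the sample for the initial run that -statsby- uses to work out which results to collect; it does not by itself exclude small groups from the regressions. Below is a minimal sketch of one alternative, which puts the restriction in an -if- qualifier on -regress- and then merges the betas back onto the observations. The toy data, the tempfile betas, and the names beta and alpha are assumptions for illustration only; gvkey, fyear, exret, rmrf and n_obs follow the thread.

```stata
* toy panel (hypothetical): 3 firms x 2 fiscal years, 12 monthly obs each
clear
set seed 2024
set obs 3
gen gvkey = 1000 + _n
expand 2
bysort gvkey: gen fyear = 1999 + _n
expand 12
gen rmrf  = rnormal()
gen exret = 0.8 * rmrf + rnormal()

* thin out one group so that it falls below the threshold (8 obs remain)
bysort gvkey fyear: drop if gvkey == 1001 & fyear == 2000 & _n > 8

* number of non-missing exret values per gvkey/fyear group
egen n_obs = count(exret), by(gvkey fyear)

* one regression per group; the -if- qualifier keeps small groups out
tempfile betas
statsby _b, by(gvkey fyear) saving(`betas', replace): ///
    regress exret rmrf if n_obs > 10

* bring the coefficients back to the observation-level data
merge m:1 gvkey fyear using `betas', nogenerate
rename (_b_rmrf _b_cons) (beta alpha)
```

With this layout the betas sit next to the original observations, and groups that do not meet the threshold are expected to show a missing beta rather than an estimate.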


                              • #30
                                Hi there,

                                I was wondering whether it would be possible to specify a 'stepsize' in the Rangestat command, similar to that in Rolling.
                                My method is identical to that in this post (#1): I have a panel of monthly returns for a large number of stocks, and need to estimate the market beta of each stock (slope of regression of stock's excess return on the market excess return).
                                Currently, I've adopted the Rangestat code in #11 and it works well.

                                However, the code takes very long to run due to the large panel as well as the fact that I'm running the code several times for robustness analyses (e.g., using different window sizes/intervals).
                                The thing is that I only need Rangestat to calculate the beta for each stock in June (month==6).
                                With a stepsize=12 option, the market betas would be calculated for June in year1 and then move on to June in year2, etc.

                                Is there a way to do so? Any help is appreciated!
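
One possible approach, offered as a sketch rather than a tested answer for this data: -rangestat- has no stepsize option that I know of, but interval() also accepts variables that hold the lower and upper bounds for each observation. If the bounds describe an empty interval (low > high) for every month except June, my understanding is that no results are computed for those observations. The toy panel, the 60-month window, and the names mdate, month, low and high are assumptions; the built-in (reg) statistic stands in for the Mata function used earlier in the thread.

```stata
* requires -rangestat- from SSC (ssc install rangestat)
* toy monthly panel (hypothetical): 5 stocks, 120 months each
clear
set seed 12345
set obs 5
gen permno = _n
expand 120
bysort permno: gen mdate = ym(2000, 1) + _n - 1
format %tm mdate
gen month = month(dofm(mdate))

* toy returns; in real data mktrf would be common to all stocks in a month
gen mktrf = rnormal()
gen stockexcessret = 0.5 * mktrf + rnormal()

* absolute interval bounds: a 60-month window ending in the current month
* for June observations, and an empty interval (low > high) otherwise
gen low  = cond(month == 6, mdate - 59, 1)
gen high = cond(month == 6, mdate, 0)

* rolling regression by stock; results are expected only where month == 6
rangestat (reg) stockexcessret mktrf, by(permno) interval(mdate low high)
```

The slope should then be in b_mktrf and the window size in reg_nobs for the June observations. The minimum-observations rule from earlier in the thread would still need to be applied, since the first few Junes of each stock have short windows.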
