
  • Calculate market betas for stocks

    Hi all,
    I just started using Stata, so I don't know much about it.
    My problem is the following:
    I have monthly return data for all NYSE stocks for 40 years and have to calculate an individual beta for each stock on a rolling basis.
    So for a particular stock and a particular month, I want to regress the returns from the last 5 years on the market returns.
    For example, for Apple: the beta for January 1980 would come from a regression of Apple's returns from January 1975 to December 1979 on the market returns during this period.
    The beta for February 1980 would then use data from February 1975 to January 1980.
    I know that I have to work with the following command: rolling _b, window(60) saving(betas_`i', replace): regress stockexcessreturn marketexcessreturn
    However, I don't know how to apply this command to my problem so that Stata calculates the beta for each month (is that already included in the rolling command?).
    And how can I tell Stata to use only the returns of a particular stock to calculate its beta, and not those of other stocks? I think you can't combine the command rolling with by companyID, can you?
    There is another restriction that I am unsure how to code: Stata should calculate the beta for a particular stock only if at least 30 return observations were available during the last 60 months (the 5-year period).
    I know that these are a lot of questions! However, it would make me really happy if someone can help me out...
    Thank you very, very much in advance!

  • #2
    I don't use -rolling- myself, but most of your questions can be answered from general Stata principles. I assume each observation has a numeric variable identifying the stock and that you have a date variable (which, I infer from your description, is monthly). I think you want something like this:

    Code:
    tsset stock date
    levelsof stock, local(stocks)
    foreach s of local stocks {
        rolling _b, window(60) saving(betas_`s', replace) reject(e(N) < 30):  ///
           regress stockexcessreturn marketexcessreturn if stock == `s'
    }
    If you need to then put all of the results for the different stocks together in a single file, you will need to once again loop over local stocks and successively append.



    • #3
      Thank you very much Mr. Schechter, I really appreciate your help!
      The code seems to be correct; I have already applied it.
      For the first 13 stocks, the regressions run properly.
      However, when it gets to the 14th stock, the loop stops with the following error:
      rolling rejected results from regress while using the entire dataset
      r(9);
      I clicked on r(9) to read more about the error, but I didn't understand what the mistake is.
      Do you have any idea how I can fix this error so that the loop continues and that the rolling regressions are run for all stocks?
      Thanks again for your help!!!



      • #4
        r(9) is one of those non-specific error messages that programmers can respond to but non-programmer users get little out of. It just means that -rolling- was trying to verify an assumption that needs to be true of the data for the program to work properly, and the assumption turned out to be false. (It almost always means that some -assert- statement failed. But what that means from the user's perspective depends on which assert statement it was, and that isn't shown in the message. Actually, even if the failed assert statement were echoed to the output, it probably wouldn't mean much to the user.)

        I'm just guessing here, but I'm guessing that at some point with stock 14, -regress- came up with no observations in some window or something like that. Try running -regress stockexcessreturn marketexcessreturn if stock == 14- outside of -rolling- and see what happens. Follow that up with -tab date if e(sample)- to see if there are just not enough months of available data on stock 14. (Remember if either of the regression variables has a missing value, it can't be included in the regression.)


        There are ways around this, but they would basically cover up the problem that rolling has revealed. So I suggest you first try to figure out if the data on stock 14 is deficient in some way. If it is, you may want to proactively look for other stocks whose data is deficient.
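
        One way to look for deficient stocks proactively is sketched below. It assumes the variable names used in #2 (stock, stockexcessreturn, marketexcessreturn); usable, n_usable and tagged are hypothetical helper names introduced only for this illustration. One plausible reading of the error message is that -rolling- first ran -regress- on all of stock 14's data and the reject() condition was already true there, which would happen if that stock has fewer than 30 complete observations in total.

        ```stata
        * Sketch only: variable names follow #2; usable, n_usable and tagged are hypothetical helpers
        * flag observations where both regression variables are non-missing
        gen byte usable = !missing(stockexcessreturn, marketexcessreturn)
        * total number of usable months per stock
        bysort stock: egen n_usable = total(usable)
        * mark one observation per stock so each stock is listed once
        egen byte tagged = tag(stock)
        * stocks with fewer than 30 usable months overall can never fill a valid window
        list stock n_usable if tagged & n_usable < 30
        ```

        Stocks that show up in this list could then be inspected by hand before deciding whether to drop them or repair their data.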



        • #5
          Thanks again Mr. Schechter!!!
          I am currently on vacation with limited WiFi access. That's why I am responding so late.
          I will check my model once I am back home in about two weeks.
          I will keep you updated then about my model.
          Would really appreciate it if you are still available for help in two weeks!



          • #6
            John, thanks for your vote of confidence in me. But remember, too, that this is a public forum and there are many members who are frequent responders. While there does seem to be a general tendency for one responder to pick up a question and stick it out through the follow-ups, it also happens that sometimes the initial responder stops and somebody else jumps in. So, even if I am, for some reason, unavailable when you get back, you should think of yourself as dealing with the Forum as a whole, not just with me.



            • #7
              Originally posted by Clyde Schechter
              Code:
              tsset stock date
              levelsof stock, local(stocks)
              foreach s of local stocks {
                  rolling _b, window(60) saving(betas_`s', replace) reject(e(N) < 30): ///
                     regress stockexcessreturn marketexcessreturn if stock == `s'
              }
              If you need to then put all of the results for the different stocks together in a single file, you will need to once again loop over local stocks and successively append.
              Thank you very much for the code above and for your awesome work here in general; it helped me a lot!

              Could you or someone else please explain how to store the betas for the different stocks in a single file or as another column in my current Stata file? In the first case, the new file would also need the date and company identifier so that I can merge it back into my main Stata file. Eventually, this would enable me to calculate each company's cost of equity for each month by applying the CAPM.

              Edit:
              Forgot to mention that I have panel data, so I use "xtset permno dm" to declare the company identifier (permno) and the monthly date variable (dm). My data is organized with the date in the first column (starting with Jan '96), the company identifier in the second column, and the variables used for my analysis (e.g. stock and market excess returns) in additional columns. All observations for one company over its time frame come first, and the data for the next company follows directly below, starting again with, for example, January 1996.
              Last edited by Julian Simpson; 01 Jul 2016, 12:00.



              • #8
                So, it appears that the following correspondence obtains between the variable names in your data set and the variable names in my code:

                Code:
                stock <-> permno
                date <-> dm
                Assuming that's correct, to get all your data together, starting with your own data in memory:

                Code:
                levelsof permno, local(permnos)
                foreach p of local permno {
                    merge 1:1 permno dm using betas`p', nogenerate
                }
                The betas`p' files are the ones generated by the code in #2 (copied in #7), modified to reflect the variable names actually in your data.



                • #9
                  Dear Clyde,

                  Thanks again for your help, and sorry for the late reply. I tried to work on the code and solve the problems myself. However, it did not work and I am unable to spot the mistake. This is my code right now:

                  Code:
                  *Regression of stock returns (HPR) on CRSP value-weighted return (incl. dividends)
                  levelsof permno, local(permno)
                  foreach s of local permno {
                      rolling _b, window(60) saving(betas_`s', replace) reject(e(N) < 24):  ///
                         regress stockexcessret mktrf if permno == `s'                
                  }
                  levelsof permno, local(permnos)
                  foreach p of local permno {
                      merge 1:1 permno dm using betas`p', nogenerate
                  }

                  The regression part produces hundreds of individual files labelled "betas_[PERMNO]", where PERMNO is replaced by the respective stock's identifier. Those files are saved in the same folder as my main Stata file. Below is a screenshot of the data editor for one of those files.

                  [Screenshot: Betas.JPG]


                  If I run the code all at once, it takes quite long to calculate all the betas and afterwards I get the following text:
                  Code:
                  . foreach p of local permno {
                    2.     merge 1:1 permno dm using betas`p', nogenerate
                    3. }
                  file betas10002.dta not found
                  r(601);
                  
                  end of do-file
                  If I run only the second (merge) part again, basically nothing happens:
                  Code:
                  . foreach p of local permno {
                    2.     merge 1:1 permno end using betas`p', nogenerate
                    3. }
                  
                  .
                  end of do-file
                  I tried changing "betas`p'" to "betas_`p'" to better reflect the individual files' names but this did not help either.

                  It'd be perfect if I had the values of the column _b_mktrf, which is now in hundreds of individual files, in my main stata file in one column merged by permno and end since end describes the month for which the beta is estimated.



                  • #10
                    OK, there are a few typos in the code that are tripping you up (my typos originally). And running the -foreach- loop without the immediately preceding -levelsof- command will, as you noticed, do nothing because the local macro is undefined. But, in any case, now that I look at the data, -merge-ing these files together is not what you need to do. You need to append them all.

                    Code:
                    preserve // KEEP YOUR ORIGINAL DATA SET ON STANDBY
                    
                    // APPEND ALL THE BETAS FILES INTO ONE
                    clear
                    tempfile building
                    save `building', emptyok
                    
                    levelsof permno, local(permnos)
                    foreach p of local permnos {
                        append using betas_`p'
                        save `"`building'"', replace
                    }
                    // IF YOU WANT, YOU CAN ALSO SAVE THIS AS A PERMANENT FILE
                    
                    // NOW BRING BACK THE ORIGINAL DATA
                    restore
                    
                    // AND MERGE IN THE BETAS
                    gen end = month
                    merge 1:1 permno end using `building'
                    Note: I assume your original data set has variables permno and month that uniquely identify all its observations. To link the data for that month with the end month in the betas files, you have to either rename month to end or, as done here, make a copy by the name of end.

                    I think this does what you want. I can't test this because I don't have a suitable data set to try it out on, so there may still be some typos lurking, but I tried to be careful.



                    • #11
                      Let me point out that rangestat (from SSC) can perform the same task and generate identical results. It is several orders of magnitude faster than the alternative and the betas are saved in the original data file. To install rangestat, type in Stata's Command window:

                      Code:
                      ssc install rangestat
                      The example below uses a commonly used Stata dataset with the names of variables adjusted to match the example in #7. Because there are fewer observations in the test data, the window has been reduced to 6 periods. Results are rejected if there are fewer than 4 observations. There's code at the end to verify that the rolling results are identical to the ones generated by rangestat. And since rangestat returns the betas in the original dataset, there's no need for additional data management gymnastics.

                      Code:
                      * test data; use variable names in the thread
                      webuse grunfeld, clear
                      rename company permno
                      rename invest stockexcessret
                      rename mvalue mktrf
                      save "test_data.dta", replace
                      
                      * ------------ regressions over a window of 6 periods using -rangestat- --------
                      * define a linear regression in Mata using quadcross() - help mata cross(), example 2
                      mata:
                      mata clear
                      mata set matastrict on
                      real rowvector myreg(real matrix Xall)
                      {
                          real colvector y, b, Xy
                          real matrix X, XX
                      
                          y = Xall[.,1]                // dependent var is first column of Xall
                          X = Xall[.,2::cols(Xall)]    // the remaining cols are the independent variables
                          X = X,J(rows(X),1,1)         // add a constant
                          
                          XX = quadcross(X, X)        // linear regression, see help mata cross(), example 2
                          Xy = quadcross(X, y)
                          b  = invsym(XX) * Xy
                          
                          return(rows(X), b')
                      }
                      end
                      
                      * regressions with a constant over a rolling window of 6 periods by permno
                      rangestat (myreg) stockexcessret mktrf, by(permno) interval(time -5 0) casewise
                      
                      * the Mata function returns first the number of observations and then as many
                      * variables as there are independent variables (plus the constant) for the betas
                      rename (myreg1 myreg2 myreg3) (nobs rs_mktrf rs_cons)
                      
                      * reject results if the window is less than 6 or if the number of obs < 4
                      isid permno year
                      by permno: replace rs_mktrf = . if _n < 6 | nobs < 4
                      by permno: replace rs_cons = . if _n < 6 | nobs < 4
                      save "rangestat_results.dta", replace
                      
                      * ----------------- replicate using -rolling- ----------------------------------
                      use "test_data.dta", clear
                      levelsof permno, local(permno)
                      foreach s of local permno {
                          rolling _b, window(6) saving(betas_`s', replace) reject(e(N) < 4):  ///
                             regress stockexcessret mktrf if permno == `s'                
                      }
                      
                      clear
                      save "betas.dta", replace emptyok
                      foreach s of local permno {
                          append using "betas_`s'.dta"
                      }
                      rename end year
                      merge 1:1 permno year using "rangestat_results.dta"
                      isid permno year, sort
                      
                      gen diff_mktrf =  abs(_b_mktrf - float(rs_mktrf))
                      gen diff_cons =  abs(_b_cons - float(rs_cons))
                      summ diff*



                      • #12
                        Unfortunately, I get the following error:
                        [Screenshot: error 111.JPG]

                        Originally posted by Clyde Schechter
                        Note: I assume your original data set has variables permno and month that uniquely identify all its observations. To link the data for that month with the end month in the betas files, you have to either rename month to end or, as done here, make a copy by the name of end.
                        At least I got that and even tried it on your last code. I am starting to learn, albeit too slowly.


                        Edit: Crosspost with Robert



                        • #13
                          Sorry, my mistake. That -levelsof- command needs to be earlier, before the -clear- command.
                          Code:
                          preserve // KEEP YOUR ORIGINAL DATA SET ON STANDBY
                          
                          // APPEND ALL THE BETAS FILES INTO ONE
                          levelsof permno, local(permnos)
                          clear
                          tempfile building
                          save `building', emptyok
                          
                          foreach p of local permnos {
                              append using betas_`p'
                              save `"`building'"', replace
                          }
                          // IF YOU WANT, YOU CAN ALSO SAVE THIS AS A PERMANENT FILE
                          
                          // NOW BRING BACK THE ORIGINAL DATA
                          restore
                          
                          // AND MERGE IN THE BETAS
                          gen end = month
                          merge 1:1 permno end using `building'
                          Remember that all of this code needs to be run as a block: you will encounter errors if you run it a few lines at a time. There are commands using local macros scattered throughout, so these different commands cannot be run separately as the macros will disappear.
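
                          A minimal illustration of why the block has to be run in one go, assuming the code is executed from the Do-file Editor. The identifier values are placeholders chosen only for this sketch.

                          ```stata
                          * Sketch only: the two identifiers below are placeholder values
                          * Select and run both lines together: the macro is defined and displayed
                          local permnos 10001 10002
                          display "`permnos'"
                          * Select and run only the -display- line afterwards: it prints an empty line,
                          * because the local macro ceased to exist when the earlier run finished
                          ```

                          The same thing happens to `building' and `permnos' in the block above if it is run a few lines at a time.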

                          If this version does not work, I think that before I try again, you should post a small representative sample of your data. It is complicated to try to visualize everything going on when I have nothing to work with and look at. If I have a representative data sample, I can find these errors before I post back to you and send you something that is tested and known to work (at least with your example data). Do use the -dataex- command to post the data. See FAQ #12 for details about that.



                          • #14
                            This has generally worked, thank you very much. However, there is a problem for 30 companies for which the regressions did not run cleanly, as could be seen from the red x's during the calculation. Those companies are now at the end of the file, with the first columns all empty and only the betas in the last columns. I checked this for one observation and it was probably caused by a discontinuity in the data, i.e. a jump of a few years in the date variable. I need to check those observations now.
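
                            A possible way to locate such discontinuities is sketched here. It assumes permno and dm as described in #7 and that dm is a Stata monthly date, so consecutive months differ by exactly 1; gap is a hypothetical helper variable.

                            ```stata
                            * Sketch only: permno and dm follow #7; gap is a hypothetical helper variable
                            * months elapsed since the previous observation of the same company
                            bysort permno (dm): gen gap = dm - dm[_n-1] if _n > 1
                            * a gap above 1 means at least one calendar month is absent for that company
                            list permno dm gap if gap > 1 & !missing(gap)
                            ```

                            Each listed row is the first observation after a jump, which should make the affected companies easy to identify.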

                            But again, thank you very much for your awesome help, and thanks also to Robert, whose idea I have not tested yet. This work is immensely important to me and time is running low, so I am really grateful for your support.



                            • #15
                              Hey guys,

                              I was wondering how to modify the above-mentioned -rangestat- code to run a rolling regression on two independent variables, i.e. to have something like:
                              Code:
                              reg stockexcessret mktrf mktrf_lag1
                              In particular:
                              Q1: How do I adjust the Mata quadcross() code to tell Mata that now I want to have two independent variables (instead of just one)?

                              Q2: Then I would later use something like:
                              Code:
                              rangestat (myreg) stockexcessret mktrf mktrf_lag1, by(permno) interval(time -5 0) casewise
                              Correct?

                              Help is highly appreciated.

                              Best,
                              Christopher
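
                              A hedged sketch of how this might look, based on the code in #11. There, the Mata function takes every column after the first as an independent variable and returns the number of observations followed by one coefficient per independent variable plus the constant, so the function itself should not need changes for a second regressor. The sketch reuses the grunfeld test data from #11; using L.mktrf to build mktrf_lag1 and the names rs_mktrf_lag1 and rs_cons are assumptions made for illustration, and year is used as the interval variable here.

                              ```stata
                              * Sketch only: test data as in #11; mktrf_lag1 and the rs_* names are assumptions
                              * requires -rangestat- from SSC (ssc install rangestat)
                              webuse grunfeld, clear
                              rename (company invest mvalue) (permno stockexcessret mktrf)
                              xtset permno year
                              gen mktrf_lag1 = L.mktrf

                              mata:
                              real rowvector myreg(real matrix Xall)
                              {
                                  real colvector y, b
                                  real matrix X

                                  y = Xall[., 1]                                    // dependent variable
                                  X = Xall[., 2::cols(Xall)], J(rows(Xall), 1, 1)   // all regressors plus a constant
                                  b = invsym(quadcross(X, X)) * quadcross(X, y)     // OLS coefficients
                                  return(rows(X), b')
                              }
                              end

                              * same call as in #11, with the second regressor added to the variable list
                              rangestat (myreg) stockexcessret mktrf mktrf_lag1, by(permno) interval(year -5 0) casewise

                              * one more result variable than before: nobs, two slopes, then the constant
                              rename (myreg1 myreg2 myreg3 myreg4) (nobs rs_mktrf rs_mktrf_lag1 rs_cons)
                              ```

                              With casewise, windows that include the first year of each company lose that observation because the lag is missing there, so the minimum-observation rule from #11 would still need to be applied afterwards.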

