Calculate market betas for stocks

Robert Picard

Join Date: Mar 2014
Posts: 1536

#31

24 Nov 2017, 09:52

rangestat has been upgraded since #11 was posted and now includes a (reg) statistic. If I understand what you are asking and shooting blind since I cannot see your data or the code you tried, the following will calculate a regression using a 12 month window only for observations that are in June.

Code:

* create panel of monthly stocks returns over 30 years
clear
set seed 312312
set obs 1000
gen long stockid = _n
expand 360
bysort stockid: gen mdate = ym(1986,12) + _n
format %tm mdate
gen ret = runiform()
gen mret = runiform()

* define a valid upper interval bound only when the month is june
gen high = cond(month(dofm(mdate)) == 6, mdate, -999)

* regress over a 12 month window
rangestat (reg) ret mret, by(stockid) interval(mdate -11 high)

* spot check results
list if stockid == 2 & mdate == ym(2001,6)
reg ret mret if stockid == 2 & inrange(mdate, ym(2000,7), ym(2001,6))

and the results for the spot check:

Code:

. list if stockid == 2 & mdate == ym(2001,6)

        +---------------------------------------------------------------------------------+
   534. | stockid |  mdate |      ret |     mret | high | reg_nobs |   reg_r2 | reg_adj~2 |
        |       2 | 2001m6 | .7165378 | .6462917 |  497 |       12 | .1522346 | .06745806 |
        |---------------------------------------------------------------------------------|
        |        b_mret     |        b_cons     |       se_mret      |       se_cons      |
        |     .31540325     |     .37679786     |     .23536795      |     .16262095      |
        +---------------------------------------------------------------------------------+

. reg ret mret if stockid == 2 & inrange(mdate, ym(2000,7), ym(2001,6))

      Source |       SS           df       MS      Number of obs   =        12
-------------+----------------------------------   F(1, 10)        =      1.80
       Model |  .097820681         1  .097820681   Prob > F        =    0.2099
    Residual |  .544744687        10  .054474469   R-squared       =    0.1522
-------------+----------------------------------   Adj R-squared   =    0.0675
       Total |  .642565368        11  .058415033   Root MSE        =     .2334

------------------------------------------------------------------------------
         ret |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        mret |   .3154033    .235368     1.34   0.210    -.2090292    .8398357
       _cons |   .3767979   .1626209     2.32   0.043     .0144558    .7391399
------------------------------------------------------------------------------

Comment

Johanna Moser

Join Date: Dec 2017

Posts: 4
#32

13 Dec 2017, 10:47

Hi,
I have a similar Problem to the one discussed in this thread, yet I am not supposed to use the rolling command.
My data is long, with 2137 companys and the respective Company_id from 1 to 2137 over a time period from 1995 to 2015 (weekyl data).
I do have to calculate a Regression which Looks like this:

Weekly-Excess-Return= b1*Marketfactor+b2*Hml+b3*Smb+b4*Exchange_Rate

(newey ex_return mkt_rf hml smb twexb)

Now I fail to compute a Loop, which calculates one yearly Beta per Company.
I also do have to exclude those companys, which do have more than 12 missing values per year

I would be more than happy if anyone of you could help me!
1 like
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29963
#33

13 Dec 2017, 10:59

I don't understand what you are trying to do. I assume you have multiple observations per company per year, and several years of data for many companies. Is that right? Is your data daily, monthly, or what? These questions could be answered by posting a short representative example of your data using the -dataex- command. If you are using Stata version 15.1, -dataex- is part of your official installation. If you have an earlier version, run -ssc install dataex- to get the command. Then run -help dataex- to read the simple instructions for using it.

Most unclear is whether you want to do a rolling regression, in which each observation receives results of a regression for the observations obtained in the preceding years, or whether you simply want to do a separate single regression for each company's data for each calendar year.
2 likes
Comment

Johanna Moser

Join Date: Dec 2017
Posts: 4

#34

14 Dec 2017, 01:40

Hi Clyde,
first of all, thank you for your answer!!!!

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int company_id float(week ex_return) double(mkt_rf smb hml twexb_change_percentage)
6 1820  -1.922182   .17 -1.21  1.85                  .
6 1821  4.0626664  1.07   -.3   .13 -.0319025921651096
6 1822   3.007111  -.22   .47  -.96 -.5542856415845588
6 1823 -1.3971034   .79 -1.21   .09  .2684526551012281
6 1824 -3.1557686  1.76 -1.23   .11  .0166934967665756
6 1825   1.702802   .83  1.09  -.77  -.398238226404705
6 1826      -.099  -.03  -.37   .66  .2480528809463563
6 1827      -.099   .95  -.66   .87 -.4569744501027921
6 1828  1.6549115  -.33   .75 -1.05 -.2111390276694827
6 1829  1.6241304   .47  -.84  -.74 -.4712981869436382
6 1830      -.115   .95  -.44  -.56 -.1046784368881651
6 1831  1.5944017    .9  -.54  -.23  .2399777486939216
6 1832  2.4060085   .09   .54  1.36 -.5425828016846495
6 1833  3.1676884   .71 -1.18  1.41  -.874923088734785
6 1834  2.2699523   .73   .69  -.24 -1.114025020836144
6 1835  -5.537357  -.49  -.38  1.53 -.9998190772054948
6 1836   2.757852  1.15   .25  -.53 -.1408511913849036
6 1837 -2.1260319   .68 -1.17   .18 -.0387217579454937
6 1838  2.3050244  1.28   -.1   .65 -.3285368222462772
6 1839      -.134 -1.06  1.35   .26  1.493984971669278
6 1840  -4.895905   .68  -.58    .4  .4127188016588153
6 1841     -3.868  1.36 -1.24   .72 -1.093935111206834
6 1842 -1.8496017  -.58  2.14 -1.42   .458236478468209
6 1843  2.0846431  2.16  -.13 -1.32 -.0730010972286157
6 1844   6.778552  1.78  -.75   -.3 -.0450502304535142
6 1845  -7.376065  -.83  1.13    .2 -.0694329860945689
6 1846   2.495696  2.21  -.67   .37 -.1279917376069615
6 1847  -2.655373   .84  1.45 -1.13   .259751211377123
6 1848  -.9825652 -1.41   .28   .02  .0174858453188828
6 1849  -.9901929  2.14   .64  -.96  .0220194612202905
6 1850  -2.770867   -.8    .5  1.36  .1129493497945198
6 1851  -5.570546  -.32   .99  -.39  .6373720117661943
6 1852  -4.442923  1.16   .68   -.5  1.553472254247672
6 1853      -.116     0     0   .68  1.090733736844261
6 1854   5.922151   .64  -.44  2.01 -.0910189631752859
6 1855  2.2616682  1.91   .61 -1.06 -.0330792971773431
6 1856  -7.052444  1.31 -1.33   .14  .5664963964832207
6 1857  2.8770745  -.35   -.1 -1.71  .8852092428921295
6 1858 -3.4896426   .15 -1.08  1.45 -1.404561685347355
6 1859     -1.118  -.98  -2.2  1.41 -.0493520007365284
6 1860   6.952707   .42  -.76   .63  .1843313036539588
6 1861   -24.6463   .34    .1 -1.41  .0902320588945803
6 1862     -6.368 -1.66  -1.2  -.52 -.0712446993730011
6 1863   6.561666  2.23   .17 -1.08  .6997006016826614
6 1864      4.895   .25   .35  -.16  .6612963318752312
6 1865   5.252143   .71 -1.03   .45  .1477350975779138
6 1866  -5.189746   -.3   -.8  1.58 -.1752095354420207
6 1867  -7.264857  1.46   .67  -.58  .1490418287527331
6 1868   6.288256  1.14  -.78  -.38  .3260769967213287
6 1869 -1.3268193   -.7   .16   .58  .4610708980489516
6 1870  2.3170245  -.34   .75   .53  .0933166276018613
6 1871   7.020857   .78   .29  -.18 -.2994205773346552
6 1872  2.1152222  -.48  -.19   .59  .2020980500156001
6 1873 -3.3678696 -2.32   .12   .79  .3911540455130725
6 1874  1.5783932  1.28 -1.63  -.12   .153957456110667
6 1875   .4454862  1.67   .16 -1.02  .4569010137881136
6 1876  -.3727253  2.27  -.41  -.82  .4881379781279293
6 1877   5.755884   2.7 -1.78  -.71  -.385689339900815
6 1878  -2.244938  -.82   1.1   .59  .0982943704509014
6 1879 -2.2256596  1.61  -.41 -1.33 -.2600022923156712
6 1880 -1.1849566 -2.02  1.61  1.58 -.3573739433812891
6 1881 -2.2958024 -1.55   .11  -.26  .3787079875696428
6 1882  -7.963169  1.42   .62 -1.04  .0257729824619537
6 1883   5.999561  1.14  -.01   .67 -.1300734599436812
6 1884  -2.971563  -.52  1.17    .9  .1297248010610148
6 1885   5.211444   1.4  -.08  -.33 -.0456346003402419
6 1886 -3.4847865  -2.7  2.03  -.53  .1548764870368509
6 1887  -6.509349  1.49  1.18 -1.41  .2939756694820437
6 1888  10.445005  1.79  1.32 -1.52 -.0497799068507639
6 1889      -.106 -1.38  2.01  -.04  .0653751129112623
6 1890 -3.1958876  1.37  -.29  -.22 -.0057706853306983
6 1891   8.879507  2.62   .23  -.95  .1038783304443812
6 1892  10.000383  1.12   .41   -.1   .337460133789865
6 1893      -.106 -1.27   .86   .07  .3939867768188034
6 1894   .3830918   .17 -1.35  -.16 -.2448665906312937
6 1895 -1.0615385 -1.05   .16   .96  .3271194636593105
6 1896  -3.012621  -.86 -2.46   .57 -.1892189129051771
6 1897       -2.1   .58   .03   .21  .1150970498324158
6 1898  3.9696326 -1.98   .47    .8 -.0010219087006377
6 1899  -4.033569 -2.62 -2.91  2.93  .3228242574991119
6 1900  -6.234449 -1.07   .41  -.19 -.2889855456295752
6 1901   .4314783  -.87 -1.91   .86 -.5526763941798342
6 1902  -4.967865  3.88 -1.28  -.91 -.0426313069936908
6 1903 -1.2393637  -.06   .77  -.01 -.1184936406014964
6 1904  -5.850127   .61  -.33   .78  .0928083795060269
6 1905  1.1165122    .3   .42   .19  .0793587973647026
6 1906   4.716277 -1.54  2.47  -.14  -.157461874813189
6 1907 -2.4078505   .28     0   .17  .1498915678020208
6 1908 -1.2854706   3.4  -1.5 -1.43  .3435052948506073
6 1909 -4.2756667   .86   -.2 -1.11  .2215313302014492
6 1910  12.934478   .19   .79  -.91 -.0955055204233619
6 1911   8.685209  1.88 -1.42   .77  .2315794639625562
6 1912 -13.237313  -.33  -.02    .6 -.0087725472314976
6 1913  2.2195814   .94 -1.31   .77  .2505496074960811
6 1914  -5.219636 -1.66  -.28  2.11  .1090870977031732
6 1915  1.0956048   .04 -1.66  1.08 -.0147391725530304
6 1916  -1.877148  3.32 -2.04  -.89 -.1623581272594788
6 1917   15.56065   .78  -.48   .48 -.5026317906461209
6 1918    -1.6645  1.12 -1.01  1.47 -.0740971962808138
6 1919   .4271005  1.06   .01  -.02  .2193838202896143
end
format %tw week

I hope this gives a clearer idea of what my data looks like. In total, i have 2137 companys with weekly obeservations for the time period 1995-2015.
What I have to do is a Newey-Regression which creates yearly Beta per Company. It should NOT be rolling, hence i want to do a separate single regression for each company's data for each calender year.

Additionally companies need to have at least 40 non-missing-values in the respective year, otherwise no regression should be run in the respective year.
Thank you so much!

Last edited by Johanna Moser; 14 Dec 2017, 01:44.

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 29963
#35

14 Dec 2017, 08:13

Here's the gist of it. Bear in mind that I don't know anything about -newey-. I discovered that you have to specify a lag() option to make it run. So I arbitrarily picked -lag(1)- to illustrate the approach. Change that to whatever you need. I also don't know which output from this regression you would consider "the beta." So I generated the regression coefficients and standard errors for all variables. Again, tailor this code to your actual needs.

Code:

capture program drop my_newey program define my_newey tsset week newey ex_return mkt_rf hml smb twexb, lag(1) if e(N) >= 40 { gen n_obs = e(N) foreach v of varlist mkt_rf hml smb twexb { gen b_`v' = _b[`v'] gen se_`v' = _se[`v'] } } exit end gen year = yofd(dofw(week)) runby my_newey, by(company_id year) status

The key points are:

1. -runby-, is written by Robert Picard and me, and you can install it with -ssc install runby-. It's purpose is precisely for projects like this: run the same commands on chunks of the data set and capture results as you go along. You could also do this with official -statsby-, but -runby- is much faster and, in my opinion, simpler to use.

2. The simplest way to determine whether 40 observations are available is to actually run -newey- and see how many it says it had. So that's the approach I used. -newey- seems to run very quickly on a set of up to 52 observations, so I don't think this will create much inefficiency compared to explicitly counting observations with missing values.

3. The data that -newey- is applied to in this approach is not the original data set, but is a chunk of a clone of the original data set. That chunk does not carry with it the information about how the data are -tsset- or -xtset-, so the -tsset- needs to be done inside the program my_newey. (I guess this means that it isn't really a full "clone" of the original.)
3 likes
Comment

Johanna Moser

Join Date: Dec 2017
Posts: 4

#36

14 Dec 2017, 09:50

Thank you so so much Clyde!!!!

I adjusted the code:

Code:

capture program drop my_newey
program define my_newey
    tsset week
 quietly summarize ex_return
 scalar lags = floor(4*(r(N)/100)^(2/9))
    newey ex_return mkt_rf hml smb twexb_change_percentage,lag(`=lags')
    if e(N) >= 40 {
        gen n_obs = e(N)
        foreach v of varlist mkt_rf hml smb twexb {
            gen b_`v' = _b[`v']
            gen se_`v' = _se[`v']
   gen t_`v' = _b[`v'] / _se[`v']
   gen p_`v' = 2*ttail(e(df_r), abs(_b[`v'] / _se[`v']))
        }
    }
    exit
end

gen year = yofd(dofw(week))
runby my_newey, by(company_id year) status

Only the number of errors that it displays worries me a little. Do you think there is something wrong?

Click image for larger version

Name: Stata.PNG
Views: 1
Size: 5.2 KB
ID: 1422421

Again, thank you so much for your help!

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 29963
#37

14 Dec 2017, 10:58

Yes. This suggests that there are some groups where the -newey- command cannot be executed as requested. In order to see what is going on, you can add the -verbose- option to the -runby- command. This will generate a lot of output, but you can break it off after you get some error messages coming out and you'll have a sense of what's going on. It's likely a problem with your data, perhaps groups that do not have enough non-missing observations to run with the number of lags you have calculated. (You have calculated the number of lags on the basis of the number of non-missing values of ex_return, but the estimation sample might actually be smaller than that due to missing values in the other variables.)
2 likes
Comment
Johanna Moser

Join Date: Dec 2017

Posts: 4
#38

16 Dec 2017, 07:51

Thank you so much Clyde! You are brilliant!
Comment
Arjan Trinks

Join Date: Sep 2016

Posts: 11
#39

20 Dec 2017, 05:19

@#31

Hi Robert,

Many thanks for your quick solution. This seems to work perfectly for the problem I posted. (#30)
Will save me hours of waiting (and coffee consumption).
Again, thanks so much!
Comment
Jan Scheur

Join Date: Oct 2018

Posts: 1
#40

15 Oct 2018, 07:53

Goodafternoon,

I want to calculate the betas over all the stocks in my sample, there are 600 stocks in the sample. I have stock return data for 18 years for every stock. I already calculated the beta of one stock, however, now I want to do this for the stocks in my sample. I used the following to calculate the beta:

rolling _b, window(365) stepsize(365) saving(beta_practice, replace): reg NESTLER EUROSTOXX, robust

I know I have to perform some sort of loop to repeat the command for all stocks. I have daily stock returns, the idea is that I get a beta for every stock for every year. A simplified sample looks like this:
Date NESTLER (...) EUROSTOXX

03jan2000 x x x

04jan2000 x x x

(...) (...) (...) (...)

31dec2017 x x x

Between NESTLER and EUROSTOXX are 599 other stock returns. EUROSTOXX is the index.
Thanks in advance.

Regards,
Jan Scheur
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29963
#41

15 Oct 2018, 10:25

I know I have to perform some sort of loop to repeat the command for all stocks.

No, you don't. What you have to do is get your data into a more useful layout, and then the -rolling- command (or several other approaches that you can find in many threads here--this is a very frequently asked question) will handle it.

Code:

rename (NESTLER-whatever) ret= reshape long ret, i(Date) j(stock) string xtset stock Date

In the above code, replace whatever by the name of the last stock before EUROSTOXX.

From there, the -rolling- command, modified as follows:

Code:

rolling _b, window(365) stepsize(365) saving(beta_practice, replace): reg ret EUROSTOXX, robust

will do it.

-rolling- is by now pretty old, and there are other community-contributed commands that can be used to do this more efficiently. You can find models of these earlier in this thread. What all of them have in common, and also with -rolling-, is that the require the data to be in long layout. Your wide layout is easy on human eyes, but it doesn't play well with most Stata commands. When dealing with panel data like this, you should form the habit of working with it in long layout--that will save you much time and frustration. Wide layout is best for a handful of Stata commands, but only a handful.

In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.
Comment
Guest
#42

09 Apr 2020, 03:00

Hi there,

I'm not very familiar with stata yet and would like some help!
My problem is similar to that of the original post (#1), but I have some more problems.

I have three different data sets, these are the following:
The file sp500inclusions contains information about when the stock of a company was included in the S&P 500 index. Permno is a stock security identifier. Dateann refers to the date when S&P announced the inclusion of a new stock and Dateeff to the date when the stock was actually included (usually a few days later).

The file sp500stockret contains daily stock returns for each stock in sp500inclusions. For each stock it covers the period starting 1 year prior to it's Dateann until 1 year after it's Dateeff.

The file sp500marketret contains the index return of the S&P 500 (sprtrn), which serves as a proxy for the market return, as well as the yield on short-term US government debt (rf), which serves as a proxy for the risk-free rate.

For each stock I want to estimate pre- and post-inclusion betas. For pre-inclusion betas I will have to use 1 year of stock returns until 10 days prior to the inclusion announcement. For post-inclusion betas I will have to use 1 year of stock returns starting 10 days after effective inclusion.

What's the best way to handle this? I find it especially tricky how to combine the data from the different sets and how to make the rolling window work in such a way that it takes exactly the right period for the pre- and post-inclusions.

Thank you very much in advance for your help!

Best,
Guest

Last edited by sladmin; 13 Mar 2024, 08:10. Reason: anonymize original poster
Comment
Attaullah Shah

Join Date: Aug 2014

Posts: 1667
#43

09 Apr 2020, 06:22

For rolling betas, you can also explore asreg, see this video

Regards
--------------------------------------------------
Attaullah Shah, PhD.
Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
FinTechProfessor.com
https://asdocx.com
Check out my asdoc program, which sends outputs to MS Word.
For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.
Comment

Date	NESTLER	(...)	EUROSTOXX
03jan2000	x	x	x
04jan2000	x	x	x
(...)	(...)	(...)	(...)
31dec2017	x	x	x

Announcement

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment