Regression: Variance of the error term

Bram Vorstenbosch

Join Date: Nov 2015

Posts: 2
#1

05 Nov 2015, 17:32

Dear all,

I have a dataset containing roughly 200 companies with daily stock data for 10 years.
The variables are: date, companyid, Ri_Rft, B_Ret, SMB, HML
I need to run reg Ri_Rft B_Ret SMB HML for every company in the sample monthly.
After this I need to save the Variance of the Error Term as a new variable.

I have the following code set up:

Code:

gen resid=.
levelsof id, local(groups)
foreach a of local groups {
    quietly reg Ri_Rft B_Ret SMB HML if id==`a'
    tempvar d
    predict `d', stdp
    replace resid=`d' if id==`a'
}

However, I have two problems with this setup.
First, I am not sure if the "predict, stdp" command achieves my goal of saving the variance of the error term.
Will the new variable 'resid' contain the variance of the error term?
Second, this code only works if I reduce my sample to roughly half the companies, or else it gives an error: no room to add more variables.
Is this solved simply by using set maxvar and how does this work? Where should I place it in my code?

Kind Regards,
Bram van Vorstenbosch

Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#2

05 Nov 2015, 18:23

-predict, stdp- gives you the standard error of the prediction for each observation, not the variance of the error term. To get that, predict the residuals and then use -summarize- to get their variance. As for running out of room to add more variables, the problem is that you create a new tempvar on every pass through the loop, and they just pile up and pile up.

So try something like this:

Code:

gen error_variance = .
levelsof id, local(groups)
foreach a of local groups {
    quietly reg Ri_Rft B_Ret SMB HML if id == `a'
    predict resid if id == `a', resid
    quietly summ resid, detail
    replace error_variance = r(Var) if id == `a'
    drop resid
}
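
A side note on which variance this produces: -summarize- divides the sum of squares by N-1, whereas the regression's own estimate of the error variance, e(rmse)^2, divides by the residual degrees of freedom. The sketch below compares the two on the auto data that ships with Stata; that dataset and the variable name res are stand-ins chosen for illustration, not the poster's data.

```stata
// illustration on shipped example data, not the poster's dataset
sysuse auto, clear
quietly regress price mpg weight

// residuals stored in double precision (res is an illustrative name)
predict double res, residuals
quietly summarize res

// -summarize- uses N-1; the regression uses N-k (stored in e(df_r))
display "variance from -summarize-:  " r(Var)
display "e(rmse)^2:                  " e(rmse)^2
display "-summarize- rescaled to N-k: " r(Var)*(r(N)-1)/e(df_r)
```

With a constant in the model the residuals have mean zero, so the rescaled -summarize- figure should agree with e(rmse)^2. With roughly 20 daily observations per company-month and four estimated coefficients, the gap between the two denominators may not be negligible.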
Doug Hemken

Join Date: Jul 2014

Posts: 219
#3

05 Nov 2015, 21:14

If all you need is var(error), rather than all the individual errors, it would make sense to just post the RMSE to a new file. That wouldn't eat up all your memory.
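
A minimal sketch of that idea using -postfile-, run on the nlsw88 example data that ships with Stata rather than on the original dataset. The file name rmse_by_group and the variable names in the results file are made up for illustration.

```stata
// illustration on shipped example data, not the poster's dataset
sysuse nlsw88, clear

// results file with one row per group (file and variable names are illustrative)
tempname results
postfile `results' occupation rmse error_variance using rmse_by_group, replace

levelsof occupation, local(groups)
foreach g of local groups {
    // skip groups where the regression cannot be estimated
    capture quietly regress wage ttl_exp grade i.race if occupation == `g'
    if _rc continue
    // no new variables are added to the data in memory
    post `results' (`g') (e(rmse)) (e(rmse)^2)
}
postclose `results'

// inspect the posted results
use rmse_by_group, clear
list
```

Because each pass through the loop only writes one row to the results file, the number of variables in memory stays constant however many groups there are.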

Doug Hemken
SSCC, Univ. of Wisc.-Madison
Maarten Buis

Join Date: Mar 2014

Posts: 3426
#4

06 Nov 2015, 01:01

No loops necessary:

Code:

// open example data
sysuse nlsw88, clear

// compute the standard deviation of the error for each occupation
statsby rmse=e(rmse) , by(occupation) clear : reg wage ttl_exp grade i.race

//admire the result
list

(For more on examples I sent to the Statalist see: http://www.maartenbuis.nl/example_faq )
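
Applied to the original question (one regression per company per month), the same approach might look like the sketch below. It runs on simulated data so that it is self-contained; the simulated values, the month variable, and the assumption that date is a Stata daily date and id a numeric company identifier are all illustrative and not taken from the original dataset.

```stata
// simulate a small panel with the variable names from the question
clear
set seed 12345
set obs 3
gen id = _n
expand 120
bysort id: gen date = mdy(1, 1, 2015) + _n - 1
format date %td
gen B_Ret  = rnormal()
gen SMB    = rnormal()
gen HML    = rnormal()
gen Ri_Rft = 0.5*B_Ret + 0.2*SMB - 0.1*HML + rnormal()

// monthly identifier derived from the daily date
gen month = mofd(date)
format month %tm

// one regression per company-month, keeping only the RMSE
statsby rmse = e(rmse), by(id month) clear : regress Ri_Rft B_Ret SMB HML

// the estimated variance of the error term is the squared RMSE
gen error_variance = rmse^2
list
```

The result is a dataset with one row per company-month, which can be merged back onto the daily data by id and month if the variance is needed alongside the original observations.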

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Bram Vorstenbosch

Join Date: Nov 2015

Posts: 2
#5

07 Nov 2015, 12:42

Thank you for the responses.

Code:

levelsof id, local(groups)

Using this gave me a macro length exceeded error. But the second approach did work.
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#6

08 Nov 2015, 13:49

"Using this gave me a macro length exceeded error. But the second approach did work."

Really? Even if you are using small Stata, a macro can hold 51,800 characters (-help limits-). If you are using IC, it's 264,392, and for SE/MP it's over 4,000,000. You said in your original post you had about 200 companies. Even allowing for some of the characters in the macro to be taken up by spaces or quotation marks, that should still leave well over 200 characters per firm name on average in small Stata, and you can't even begin to approach the limits of IC or SE/MP.
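
One way to check this directly is to compare the length of the macro that -levelsof- builds with the limit of the running Stata. The sketch below does so on the shipped nlsw88 data, with occupation standing in for the company identifier; the local macro names are illustrative.

```stata
// illustration on shipped example data; occupation stands in for the company id
sysuse nlsw88, clear
levelsof occupation, local(groups)

// number of distinct values and characters actually stored in the macro
local n    : word count `groups'
local used : length local groups

display "distinct values:               `n'"
display "characters used by the macro:  `used'"
display "macro length limit (this Stata): " c(macrolen)
```

If the count of distinct values turns out to be far larger than the expected number of companies, the variable passed to -levelsof- is probably not the one intended.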

I'm glad you were able to solve your problem without this anyway, but I'm mystified that you encountered this error message and wondering how it is possible.