how to calculate output of r(alpha) (short example included)?

new user rosie

Join Date: Apr 2023

Posts: 1
#1


26 Apr 2023, 19:47

My deepest apologies if this is difficult to work with. I'm not a Stata user; I'm just trying to decipher existing code.

Input
year  etype  category  value       avg_freq
1996  1      1         25526966.0  8
1997  1      1         3266403.0   8
1998  1      1         1870628.0   8

Code:
import excel "\\inputdata.xlsx", firstrow case(lower)

levelsof category, l(smoothingcategory)

foreach i of local smoothingcategory {
    keep if etype==1 & value != . & category==`i'

    tsset year
    tssmooth exponential smoothed=value, samp0(2) forecast(1)
    gen parameter=r(alpha)

    if (parameter > .5 & avg_freq > 50) {
        drop smoothed parameter
        tssmooth exponential smoothed=value, parms(.5) samp0(2) forecast(1)
        gen parameter=r(alpha)
    }
    else if (avg_freq < 50 & parameter > .1) {
        drop smoothed parameter
        tssmooth exponential smoothed=value, parms(.1) samp0(2) forecast(1)
        gen parameter=r(alpha)
    }

    replace category=`i' if missing(category)
}

Output
Can someone explain how "smoothed" and "parameter" are calculated? Does anyone have a formula they can share?
year  etype  category  value     avg_freq  smoothed  parameter
1996  1      1         25526966  8         14396684  0.0001305
1997  1      1         3266403   8         14398137  0.0001305
1998  1      1         1870628   8         14396684  0.0001305
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#2

26 Apr 2023, 20:06

Alpha is the smoothing parameter from tssmooth exponential. I don't know the details since I've never used it; I would just read the help file for tssmooth exponential.
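For the three rows posted in #1, the smoothed column is consistent with a simple recursion: the first smoothed value is the mean of the first two observations (which is what samp0(2) asks for), and each later value is alpha times the previous observed value plus (1 - alpha) times the previous smoothed value. When parms() is omitted, the help file says alpha is chosen to minimize the in-sample sum of squared forecast errors, so r(alpha) is an estimate rather than something with a closed-form formula. The following is only a sketch under those assumptions: the variable name manual is hypothetical, and alpha is hard-coded to the rounded value shown in the posted output.

```stata
clear
input year value
1996 25526966
1997 3266403
1998 1870628
end

* Assumption: alpha is the (rounded) value reported in the posted output
scalar alpha = 0.0001305

* Initial value: mean of the first two observations, as samp0(2) requests
generate double manual = (value[1] + value[2]) / 2 in 1

* Each later value blends the previous observation with the previous smoothed value.
* -replace- works through the observations in order, so manual[_n-1] is already filled in.
replace manual = alpha * value[_n-1] + (1 - alpha) * manual[_n-1] in 2/l

list year value manual
```

Worked by hand, the starting value is (25526966 + 3266403)/2 = 14396684.5, and the next two values come out close to 14398137 and 14396684, which agrees with the posted smoothed column up to rounding.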
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#3

26 Apr 2023, 22:11

Though not an answer to the question you posed, I would like to point out that I'm nearly certain that

Code:

if (parameter > .5 & avg_freq > 50) {
    drop smoothed parameter
    tssmooth exponential smoothed=value, parms(.5) samp0(2) forecast(1)
    gen parameter=r(alpha)
}
else if (avg_freq < 50 & parameter > .1) {
    drop smoothed parameter
    tssmooth exponential smoothed=value, parms(.1) samp0(2) forecast(1)
    gen parameter=r(alpha)
}

is not going to do what you think it will do. My overall impression of this code is that your intent is to apply an exponential smoother to three different subsets of your data: the subset with parameter > .5 & avg_freq > 50, which appears in the -if- command, the subset with avg_freq < 50 & parameter > .1, which appears in the -else if- command, and the rest of the data set, which is dealt with in the block of code that precedes the -if- command.

That is a reasonable thing to want to do, but it isn't even close to what the code actually does. The -if- command does not apply the commands within the associated curly braces to a subset of the data. The -if- command makes a single decision as to whether the code in the curly braces is executed at all. If it is executed, it applies to everything in the data set (unless the commands in the braces themselves contain appropriate restrictions).

The criterion that the -if- command uses to make this decision usually does not involve the mention of variables. But when variables are mentioned, since different observations will have different values, the rule is that the first observation of the data set is used. So what your code actually does is look at the first observation in the data set, and only that first one. If its parameter value is greater than .5 and its avg_freq value is greater than 50, then the -tssmooth- with parms(.5) is applied to the entire data set. If its avg_freq < 50 and its parameter > .1, the -tssmooth- with parms(.1) is applied to the entire data set, and, in any other configuration of the first observation in the data set, the -tssmooth- with no parms() specified is applied to the entire data set.

Now, it is entirely possible that this is what you want to do--I really have no idea and no understanding of why you would want to do either thing. So if this is what you want to do, that's fine. But it is uncommon to see this kind of code being conditioned on values of the first observation in the data set, so I'm guessing you aren't getting what you thought you would. If I am right, and what you really want is to do three different things in different subsets of the data, read -help if- to see the way to do that. Stata has two different kinds of -if-, an -if- qualifier and an -if- command, and I believe you have used the wrong one for your purpose.
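The difference between the two kinds of -if- can be seen on a toy data set. This is only an illustrative sketch: the three observations and the variable names id and high are made up, and only avg_freq echoes the original post.

```stata
clear
input id avg_freq
1 8
2 80
3 60
end

* -if- command: one decision for the whole block. A variable reference
* is evaluated in the first observation only (avg_freq is 8 there).
if (avg_freq > 50) {
    display "block executed"
}
else {
    display "block skipped: only observation 1 was checked"
}

* -if- qualifier: the condition is checked separately in every observation
generate byte high = 1 if avg_freq > 50
list
```

Here the -if- command takes the -else- branch even though two of the three observations have avg_freq above 50, while the -if- qualifier sets high to 1 in observations 2 and 3 and leaves it missing in observation 1.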

Last edited by Clyde Schechter; 26 Apr 2023, 22:15.
Nick Cox

Join Date: Mar 2014

Posts: 35433
#4

27 Apr 2023, 03:45

Cross-posted at https://stackoverflow.com/questions/...ample-included

Please find at https://www.statalist.org/forums/help#crossposting our request that you tell us about cross-posting and at https://www.statalist.org/forums/help#realnames our request that you use a full real name.
