Hi everyone,
I'm trying to generate rolling-window forecasts with the random forest command "rforest". The code I am using is at the bottom of this message. The problem is the line "rforest `depvar' `lagged_vars' `if', type(reg)" (marked "// PROBLEM LINE:" in the code). When I run the code with this line as shown, I get an error message at every iteration (see the attached screenshot S1) and an empty output file (see screenshot S2).
Interestingly, the code works when I replace this line with any of the following:
- the OLS regression command "reg": reg `depvar' `lagged_vars' `if'
- the Ridge command: elasticnet linear `depvar' `lagged_vars' `if', alpha(0) selection(cv)
- and the LASSO command: elasticnet linear `depvar' `lagged_vars' `if', alpha(1) selection(cv)
In addition, calling rforest directly under rolling, outside my program, works without any problems (see screenshot S3): rolling, window (500): rforest [example dependent variable] [example independent variables] `if', type(reg)
For reference, I've also attached the dataset I am using. I am using Stata/MP 17.0.
Could anyone help me with getting the code to work? Many thanks in advance, much appreciated!
Best
Kilian
*****
clear
cd [my directory]
use [my datafile]
tsset month
local window_size 100
local horizons = "1 3 6 12 24"
local depvars "cpi_MoMgrwth IP_MoMgrwth UR_MoMgrwth"
local indepvars1 "cpi_MoMgrwth IP_MoMgrwth UR_MoMgrwth"
local indepvars2 "cpi_MoMgrwth IP_MoMgrwth UR_MoMgrwth r_ind_nodur_VW r_ind_durbl_VW r_ind_manuf_VW r_ind_enrgy_VW r_ind_hitec_VW r_ind_telcm_VW r_ind_shops_VW r_ind_hlth_VW r_ind_utils_VW r_ind_other_VW"
...
local indepvars6 [set 6 of independent variables]
// GENERATE FORECASTS
* DEFINE PROGRAM
* Drop the program if it already exists (if running algorithm multiple times)
capture program drop myforecast
program myforecast, rclass
    syntax [if], depvar(string) indepvars(string) horizon(integer)
    // Generate list of lagged independent variables
    local lagged_vars = ""
    foreach var of varlist `indepvars' {
        local lagged_vars = "`lagged_vars' L`horizon'`var'"
    }
    // PROBLEM LINE:
    rforest `depvar' `lagged_vars' `if', type(reg)
    // Find last time period of estimation sample and make forecast for period just after that
    summ month if e(sample)
    local last = r(max)
    predict pred_value if inrange(month, `last' + 1, .)
    // Evaluate the forecast for the specific period
    scalar fcast_result = pred_value[`last']
    return scalar forecast = fcast_result
    // Next period's actual return (will return missing value for final period)
    return scalar actual = `depvar'[`last'-`horizon']
end
* EXECUTE PROGRAM: Generate Forecasts
foreach depvar of local depvars {
    forvalues i = 1/6 {
        local indepvars = "`indepvars`i''"
        foreach horizon of local horizons {
            di "`depvar' `horizon'M forecast with indepvars`i'"
            rolling actual=r(actual) forecast=r(forecast), window(`window_size') ///
                saving("FORECASTS A1 `depvar' indepvars`i' RForest `horizon'M.dta", replace): ///
                myforecast , depvar(`depvar') indepvars(`indepvars') horizon(`horizon')
        }
    }
}