I just upgraded to Stata 18 and was adding splines to a model using the new makespline command. I've come across two issues that make it less user-friendly than the older mkspline. I'm writing this post as a general heads-up to users of Stata 18 (and as a request to Stata to consider adding these features back into makespline). It took me a very long (and painful) afternoon to figure out why I was getting different results from the two commands with the same number of knots, so I'm posting this in case anyone else runs into a similar issue.
1) In makespline, the variables created by basis(varname) do not include the first spline term, whereas mkspline creates it by default. (In mkspline's restricted cubic spline, that first variable is simply a copy of the original variable, so what goes missing is the linear term rather than a knot.) Because the first term isn't generated automatically (e.g. varname_1, ..., varname_k), it's easy to enter varname* in a regression, leave the linear term out, and misspecify the spline, which changes the overall predictions. If you specify varname*, the model only estimates coefficients for varname_2, ..., varname_k.
2) In mkspline, there is an option (displayknots) to display the points at which the knots of the spline lie. This is helpful for adding to a table footnote and for understanding how the splines are being specified, including what effect the "harrell" option has on knot placement. I can't find this feature in the new command.
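Until something like displayknots is available, one workaround I can think of is to run mkspline on a throwaway stub purely to read the knot locations back. This is only a sketch: the stub name knotchk is made up for illustration, and it assumes that makespline with the harrell option places its knots at the same percentiles as mkspline, which is consistent with the matching fitted values in my example but which I haven't confirmed from the documentation.

```stata
sysuse nlsw88, clear

* Throwaway restricted cubic spline, used only to report knot locations
* (knotchk is a hypothetical stub name)
mkspline knotchk = tenure, cubic nknots(5) displayknots

* mkspline leaves the knot locations behind in r(knots)
matrix list r(knots)

* Remove the throwaway variables so they don't end up in a model by accident
drop knotchk*
```

The values printed by displayknots are what I would put in a table footnote.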
Here's an example of the issue (base code modified from a post by Maarten Buis, linked here: https://www.stata.com/statalist/arch.../msg00311.html)
__________________________________________________ _______
log using statalist_ex_splines
sysuse nlsw88, clear
gen ln_w = ln(wage)
*Old version mkspline (restricted cubic spline with 5 knots; the display shows that the knots are correctly placed at the Harrell 2001 percentiles)
mkspline ten_mk = tenure, cubic nknots(5) displayknots
reg ln_w ten_mk*
predict yhat_mk
label var yhat_mk "Older - Mkspline"
*Newer makespline, specified as above. Restricted cubic spline with 5 knots, placed as per Harrell (2001)
makespline rcs tenure, knots(5) basis(ten_make) order(3) harrell
reg ln_w ten_make* //nb: only 3 betas included vs 4 when using mkspline with the same number of knots
predict yhat_make
label var yhat_make "Stata 18 - Makespline"
* Output plot of predicted values based on splines
sort tenure
twoway line yhat_mk tenure, lcolor(blue) lpattern(dash) sort || ///
line yhat_make tenure, lcolor(orange) sort ///
title(Fitted values for the effect of tenure on ln(wage)) subtitle(mkspline and makespline confusion)
*** Really not clear why restricted cubic splines from mkspline and makespline differ !!
*Updated model, now including original base variable "tenure"
reg ln_w tenure ten_make*
predict yhat_make_basevar
label var yhat_make_basevar "Stata 18 - Makespline with basevar"
twoway line yhat_mk tenure, lcolor(blue) lpattern(dash) sort || ///
line yhat_make tenure, lcolor(orange) sort || ///
line yhat_make_basevar tenure, lcolor(red) lpattern(dash) sort ///
title(Fitted values for the effect of tenure on ln(wage)) subtitle(mkspline vs makespline demo)
*** Now the makespline fit (with the base variable added) is the same as the mkspline fit.
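Since overlapping lines on a plot can hide small differences, a numeric check is a useful complement. The lines below are a minimal sketch that continues the example above (it assumes yhat_mk and yhat_make_basevar already exist); diff_fit is just a name I picked for illustration.

```stata
* Absolute gap between the mkspline fit and the makespline fit with tenure added
gen double diff_fit = abs(yhat_mk - yhat_make_basevar)
summarize diff_fit

* See exactly which basis variables each command created
describe ten_mk* ten_make*
```

If the two specifications really are equivalent, the maximum of diff_fit should be zero up to rounding error, and the describe output makes it easy to see which variables each command generated.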