Dear Statalist members,
This topic extends an earlier one I posted, "How to plot a restricted cubic spline among 2 groups using a logistic regression model fitted on a case control data", with extra information (data set, code, graph). I have closed that thread with a comment.
I am trying to plot a restricted cubic spline graph for a continuous variable, stratified by another binary variable. My broad aim is to visualize the interaction between these two variables, which I am planning to do by observing how the curves change across the categories of the binary variable. The underlying data are case-control. If you have a better suggestion, you are most welcome to advise. Below I give the details of the variables used, the data set generated by dataex, the syntax I have used in Stata 13, the graph generated, and my 3 queries.
Variables: outcome - binary; ENdrink - binary, ever/never variable for alcohol consumption; drinkL - main exposure, alcohol in liters, continuous; drinkL_C - drinkL centered using the mean among alcohol consumers, i.e., ENdrink=1. Other covariates include X - binary; edu - continuous; smoke - continuous.
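To make the centering concrete, drinkL_C could be produced along the lines below. This is only a sketch, assuming drinkL is recorded for every observation; it is not necessarily the exact code used to build the posted data.

```stata
* Sketch only (assumption): centre drinkL on the mean among ever-drinkers
summarize drinkL if ENdrink == 1, meanonly
* r(mean) holds the mean of drinkL among observations with ENdrink == 1
generate double drinkL_C = drinkL - r(mean)
```

Because the mean is subtracted from every observation, anyone who drank less than the average consumer gets a negative drinkL_C.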
My dataset has 7 variables and 819 observations. I was not able to paste the dataex output here because of the character limit, so I am linking an uploaded copy below. I hope you can access it; if not, please suggest a way to upload the data.
Thekke purakkal_dataex code alcohol RCspline .docx
Syntax I have tried out
// FINDING KNOT POSITIONS AT THE 5th, 50th AND 95th PERCENTILES OF drinkL_C
centile drinkL_C if ENdrink==1, centile(5 50 95)
// TRANSFORMING drinkL_C TO SPLINE FUNCTION WITH THE 3 KNOTS AT POSITIONS DERIVED ABOVE
// STORING THE KNOT POSITIONS IN A MATRIX
mkspline2 RCS3 = drinkL_C, cubic knots( -609.0704 -375.2204 1578.77) displayknots
mat knots3 = r(knots)
// FITTING THE LOGISTIC REGRESSION MODEL USING THE SPLINE VARIABLES RCS3*. MODEL SHOULD CONTAIN THE ENdrink variable.
//STORING THE ESTIMATES
logistic outcome RCS3* edu smoke ENdrink
estimates store cknots3
// ESTIMATING POINTWISE ODDS RATIOS TO USE IN THE GRAPH. REFERENCE IS ZERO
xbrcspline RCS3, values(-614.5905(30)8210.63) matknots(knots3) ref(0) gen(drink3 or3 lb3 ub3)
// RUG PLOT TO SHOW OBSERVATION DENSITY - CREATING A VARIABLE WHICH DENOTES THE POSITION OF RUG PLOT IN THE GRAPH
// CREATING A SYMBOL FOR RUG PLOT
gen where = -3
gen pipe = "|"
// PLOTTING THE GRAPH - Linear(blue) and Restricted cubic spline(red) in the same graph. 0.0000879 USED IN THE CODE IS THE POINT ESTIMATE OF drinkL_C OBTAINED FROM LOGISTIC REGRESSION MODEL ASSUMING LINEARITY OF ALCOHOL (logistic outcome drinkL_C edu smoke ENdrink)
twoway (line lb3 ub3 or3 drink3, lp(- - l) lc(black black red)) || function y= (.0000831*x), ra(drink3) clpat(dash) lc(blue) lp(-) || scatter where drinkL_C, ms(none) mlabel(pipe) mlabpos(0)

My queries are:
1) The x-axis shows centered lifetime consumption of alcohol and the y-axis shows odds ratios. Originally, however, no drinker has a value below 0. How can I transform this graph so that the x-axis runs from 0 upwards?
2) Most of the observations, as seen from the rug plot, are concentrated below 5000 L of alcohol. How can I restrict the x-axis and the graph to around 5000 without dropping those observations? Also, how do I add axis titles?
3) The above graph shows the linear and RC spline curves in a single graph for the whole data set. Now I want to see how these curves behave within the 2 categories, A and B, of variable X, shown in the same graph, i.e., this graph stratified by X. What changes should I make to the above code to introduce variable X?
Thank you for your consideration.