How can I know the information about the variable inside the function during G-estimation (stgest function)

Jeaho Jeong

Join Date: Aug 2023

Posts: 3
#1

07 Aug 2023, 20:02

Hi,

I have a question about the "stgest" command (from package "st0014"), which implements G-estimation.
(paper: http://www.stata-journal.com/sjpdf.html?articlenum=st0014)

Following the example in that paper, I ran the following code after preprocessing the data.

Here is my code:
------------------------------------------------------------------------------------------------------------------------------------------------------------
stgest cumul_dose_x_lag10_fac sex byr_fac st_dur_fac employ_status2 attained_age, ///
visit(visit_lag10) firstvis(2) lasttime(death_enddate_lag10) ///
lagconf(cumul_dose_x_lag10_fac employ_status2) baseconf(cumul_dose_x_lag10_fac employ_status2) ///
range(-1 3) round(5) detail
------------------------------------------------------------------------------------------------------------------------------------------------------------

This gives an estimate of psi.

Here is the output for one candidate value of psi during the search (psi = 0.875):
------------------------------------------------------------------------------------------------------------------------------------------------------------
Iterating: psi=.875

note: Bcumul_dose_x_l != 0 predicts success perfectly;
Bcumul_dose_x_l omitted and 232162 obs not used.

note: Lcumul_dose_x_l != 0 predicts success perfectly;
Lcumul_dose_x_l omitted and 145509 obs not used.

Logistic regression                             Number of obs   =     218461
                                                LR chi2(8)      =   15656.93
                                                Prob > chi2     =     0.0000
Log likelihood = -64611.144                     Pseudo R2       =     0.1081

------------------------------------------------------------------------------
cumul_~0_fac | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
    __000002 |   1.047979   .1190886     0.41   0.680     .8387355    1.309423
         sex |   .6544185   .0113893   -24.36   0.000     .6324723    .6771261
     byr_fac |   .8480488    .007727   -18.09   0.000     .8330385    .8633295
  st_dur_fac |    2.00697   .0272058    51.39   0.000      1.95435    2.061007
employ_sta~2 |    .934167   .0709492    -0.90   0.370      .804964    1.084108
attained_age |   .9211914    .001515   -49.91   0.000     .9182268    .9241655
Bemploy_st~2 |   .9885807   .0554435    -0.20   0.838     .8856729    1.103445
Lemploy_st~2 |   1.340548   .1210205     3.25   0.001     1.123152    1.600023
------------------------------------------------------------------------------

psi=.875 Z= 0.41 esthi=1 estlo=.875 upphi=3 upplo=1 lowhi=.75 lowlo=.5 stop=0
estOK=0, lowOK=0, hiOK=0
------------------------------------------------------------------------------------------------------------------------------------------------------------

A logistic regression is fitted for each candidate value of psi, and each fit includes an internally generated variable named "__000002".

I want to know how this variable is generated, but I am stuck because I cannot see what stgest does internally.
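
A note on the name itself: names of the form "__00000#" are what Stata's tempvar command hands out, so "__000002" is most likely a temporary variable that stgest creates and that is dropped when the command exits. A minimal sketch, using the auto dataset shipped with Stata purely for illustration (the local names a, b and c are hypothetical, and the exact numbers assigned depend on the session):

```stata
* Sketch only: shows the naming pattern of temporary variables.
* Run as a do-file so that the local macros stay in scope.
clear
sysuse auto

* Ask Stata for three temporary variable names
tempvar a b c
gen `a' = price
gen `b' = mpg
gen `c' = weight

* Show the names Stata actually assigned to the three temporaries
display "`a' `b' `c'"
describe `a' `b' `c'
```

Because such variables disappear when the program that created them finishes, they cannot be inspected in the dataset after stgest returns.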

For reference:

I suspected that "__000002" corresponds to the "failure time if continuously unexposed" in the paper above, so I generated a "blip" variable myself for one value of psi (psi = 0.875) and compared its result with that of "__000002":
------------------------------------------------------------------------------------------------------------------------------------------------------------
gen pyr0 = exp(0.875 * cumul_dose_x_lag10_fac) * (_t - _t0)
egen blip = sum(pyr0), by(id)
logistic cumul_dose_x_lag10_fac blip ///
sex byr_fac st_dur_fac employ_status2 attained_age Lcumul_dose_x_l Bcumul_dose_x_l Lemploy_status2 Bemploy_status2, noconstant

note: Lcumul_dose_x_l != 0 predicts success perfectly;
Lcumul_dose_x_l omitted and 377671 obs not used.

note: Bcumul_dose_x_l omitted because of collinearity.

Logistic regression                                     Number of obs =    218,461
                                                        Wald chi2(8)  =   80553.44
Log likelihood = -60460.658                             Prob > chi2   =     0.0000

----------------------------------------------------------------------------------------
cumul_dose_x_lag10_fac | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-----------------------+----------------------------------------------------------------
                  blip |   1.081741   .0009855    86.25   0.000     1.079812    1.083675
                   sex |     .49373   .0087529   -39.81   0.000     .4768693    .5111869
               byr_fac |   1.006331   .0054412     1.17   0.243     .9957228    1.017052
            st_dur_fac |   .9048915   .0130845    -6.91   0.000     .8796063    .9309035
        employ_status2 |   1.941149   .1662512     7.74   0.000     1.641183    2.295941
          attained_age |   .9434696    .000603   -91.05   0.000     .9422885    .9446521
       Lcumul_dose_x_l |          1  (omitted)
       Bcumul_dose_x_l |          1  (omitted)
       Lemploy_status2 |   .8224561   .0828631    -1.94   0.052     .6750773     1.00201
       Bemploy_status2 |    1.88445   .1146915    10.41   0.000     1.672549    2.123198
----------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------
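
One visible difference between the two outputs is the header: the fit inside stgest reports "LR chi2(8)", while the manual fit reports "Wald chi2(8)", which is what Stata prints when noconstant is specified. A sketch of a closer comparison, assuming "blip" and the L*/B* variables exist exactly as in the commands above, refits the same model with a constant. Whether this reproduces the "__000002" row is not established here.

```stata
* Sketch only: same covariates as the manual fit, but with a constant.
* Lcumul_dose_x_l and Bcumul_dose_x_l are kept so that the estimation
* sample is restricted in the same way as before.
logistic cumul_dose_x_lag10_fac blip ///
    sex byr_fac st_dur_fac employ_status2 attained_age ///
    Lcumul_dose_x_l Bcumul_dose_x_l Lemploy_status2 Bemploy_status2
```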

Because the result for "blip" differs from the result for "__000002", it seems that "__000002" is not the quantity given by the equation in the paper.

How can I find out how "__000002" is generated inside stgest?

Is there a way to inspect the internals of stgest (other than findit)?
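
Since stgest is distributed as an ado-file, its source can in principle be read directly and its execution traced. A minimal sketch using built-in Stata commands, assuming stgest is installed and found on the adopath (the log file name stgest_trace.log is hypothetical, and the trace depth may need adjusting):

```stata
* Sketch only: locate and read the source of stgest.
which stgest
viewsource stgest.ado

* Trace the commands stgest runs, writing them to a log file.
log using stgest_trace.log, text replace
set more off
set trace on
set tracedepth 2
stgest cumul_dose_x_lag10_fac sex byr_fac st_dur_fac employ_status2 attained_age, ///
    visit(visit_lag10) firstvis(2) lasttime(death_enddate_lag10) ///
    lagconf(cumul_dose_x_lag10_fac employ_status2) baseconf(cumul_dose_x_lag10_fac employ_status2) ///
    range(-1 3) round(5) detail
set trace off
log close
```

Searching the source or the trace log for the generate statements that precede the logistic fit should show how the temporary variable is built. The trace can be very long because the model is refitted for every candidate psi.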
Tags: G-estimation, regression, stgest
Jeaho Jeong

Join Date: Aug 2023

Posts: 3
#2

08 Aug 2023, 19:07

Sorry, I posted this in the wrong forum by mistake. I will repost it in the General forum.