How can I segment a biography into multiple lines of 1 year age group, please?

Myriam Pean

Join Date: Apr 2024

Posts: 17
#1

21 Jun 2024, 15:20

Hello,

I'm seeking help to create a biographical file for survival analysis. My database concerns women, and I'm particularly interested in maternity. My goal is to segment each woman's biography into multiple lines, so that there is a line for each age group (e.g., 15-16 years, 16-17 years, ..., up to 49-50 years) starting from age 15 up to age 50 at the time of the survey. In my database, I have the woman's age at the time of the survey as well as the age at which she gave birth to each child. I aim to create as many lines for each identifier (ID) as there are one-year age groups, starting from age 15 up to 50 years (or the age at the time of the survey). All respondents are at least 15 years old. If the age at the time of the survey is less than 50 years, it will be kept as the upper limit of the last age group; otherwise, it will be kept at 50 years.

Thank you in advance for your assistance.

I tried the code below, but Stata returns the error message "using required":

gen age_mere = .

qui foreach age_mere of numlist 15/50 {
replace AGEDC = cond(AGEDC >= 50, 50, AGEDC)
replace age_mere = `age_mere' if `age_mere' <= AGEDC
keep Matricule age_mere
append
}
save "fichier_long2.dta", replace

* Merge fichier_long2 and fichier_long datasets in Stata
use "fichier_long2.dta", clear
sort Matricule age_mere
use "fichier_long.dta", clear
sort Matricule age_mere
merge 1:1 Matricule age_mere using "fichier_long2.dta", keepusing(age_mere) nogenerate
save "long3.dta", replace
Clyde Schechter

Join Date: Apr 2014

Posts: 29788
#2

21 Jun 2024, 18:07

The -append- command is the one producing that error: it requires a -using- clause naming the file to append to the data in memory. That said, I don't understand your approach in the first place, and I don't think it will produce what you are looking for.
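For reference, a minimal sketch of -append- with its required -using- clause. The two tiny data sets and the tempfile name `part1` are made up for illustration and have nothing to do with the poster's files.

```stata
// Hypothetical first data set: ids 1 to 3, saved to a temporary file
clear
set obs 3
gen id = _n
tempfile part1
save `part1'

// Hypothetical second data set: ids 4 and 5, now in memory
clear
set obs 2
gen id = 3 + _n

// -append- needs -using- to know which file to add to the data in memory
append using `part1'
count
```

Run as a single do-file (the tempfile disappears when the do-file ends), this leaves the observations from both data sets in memory.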

If I understand what you have as starting data and what you want, the following code will get you there:

Code:

// AS NO EXAMPLE DATA SET WAS PROVIDED
// CREATE A DEMONSTRATION DATA SET
clear*
set obs 15
set seed 1234
gen id = _n
gen age_at_survey = runiformint(20, 80)
gen age_at_maternity = runiformint(15, 45)
replace age_at_maternity = . if runiform() < 0.1

// REQUESTED COMPUTATION
gen last_age = min(49, age_at_maternity, age_at_survey)
gen expander = last_age - 14
expand expander
by id, sort: gen age = 14 + _n
forvalues a = 15/50 {
    label define age `a' "`a'-`=`a'+1' years", add
}
label values age age
drop last_age expander
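As a quick check of what the expansion produces, here is a minimal sketch on a made-up three-woman data set (the values are hypothetical, and ages are assumed to be integers as in the demonstration data). It repeats the same computation and then asserts that each woman gets one row per year of age from 15 up to last_age. Note that last_age also stops at age_at_maternity when that is the smallest of the three values.

```stata
// Hypothetical mini data set: three women with integer ages
clear
input id age_at_survey age_at_maternity
1 22 18
2 60 .
3 30 35
end

// Same expansion as above
gen last_age = min(49, age_at_maternity, age_at_survey)
gen expander = last_age - 14
expand expander
by id, sort: gen age = 14 + _n

// Each woman should start at 15, end at last_age, and have no gaps
by id: assert age[1] == 15
by id: assert age[_N] == last_age
by id: assert _N == last_age - 14
isid id age
```

If any of the assertions fails on real data, the most likely culprits are non-integer ages or respondents younger than 15.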

If this code does not meet your requirements, when you post back, be certain to use the -dataex- command to show example data. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
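A typical call might look like the sketch below. It uses the auto data set that ships with Stata rather than the poster's data, and the choice of variables and observations is arbitrary.

```stata
// Load a data set that ships with Stata (stand-in for your own data)
sysuse auto, clear

// Show the first five observations of three variables as a pasteable example
dataex make price mpg in 1/5
```

The output in the Results window can then be copied and pasted into a forum post as is.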
Myriam Pean

Join Date: Apr 2024

Posts: 17
#3

22 Jun 2024, 18:18

Thank you so much, Mr. Clyde! This code works.
Myriam Pean

Join Date: Apr 2024

Posts: 17
#4

22 Jun 2024, 20:33

Mr. Clyde,

I have successfully segmented life histories into intervals (e.g., 15-16 years, 16-17 years, ..., up to 49-50 years) using the provided code. Additionally, I have data indicating the age of each woman at the time of each childbirth. Now, I am seeking guidance on how to update the variable _d, which was created using the stset command, to have the value 1 when a childbirth event occurs within each respective age interval, and 0 otherwise.

Could you, please, advise me on how to accomplish this in Stata?

Thank you very much for your assistance.

Myriam
Clyde Schechter

Join Date: Apr 2014

Posts: 29788
#5

22 Jun 2024, 22:49

At this point, a description of your data in words is insufficient to provide help. Please post back showing example data, and using the -dataex- command to do that. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
Myriam Pean

Join Date: Apr 2024

Posts: 17
#6

22 Jun 2024, 23:44

Here is a sample:

PUMFID AGEDC AGEparite1 AGEparite2 AGEparite3 T Debut Fin Enquete _st _d _t _t0
2 56,3 32,8 36,3 .y 15-16 ans 0 56,3 1 1 1 56,299999 0
2 56,3 32,8 36,3 .y 16-17 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 17-18 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 18-19 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 19-20 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 20-21 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 21-22 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 22-23 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 23-24 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 24-25 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 25-26 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 26-27 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 27-28 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 28-29 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 29-30 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 30-31 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 31-32 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 32-33 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 33-34 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 34-35 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 35-36 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 36-37 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 37-38 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 38-39 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 39-40 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 40-41 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 41-42 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 42-43 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 43-44 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 44-45 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 45-46 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 46-47 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 47-48 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 48-49 ans 0 56,3 1 0 1
2 56,3 32,8 36,3 .y 49-50 ans 0 56,3 1 0 1
5 26,3 22,6 25,2 .y 15-16 ans 0 26,3 1 1 1 26,299999 0
5 26,3 22,6 25,2 .y 16-17 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 17-18 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 18-19 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 19-20 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 20-21 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 21-22 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 22-23 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 23-24 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 24-25 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 25-26 ans 0 26,3 1 0 1
5 26,3 22,6 25,2 .y 26-27 ans 0 26,3 1 0 1
Clyde Schechter

Join Date: Apr 2014

Posts: 29788
#7

23 Jun 2024, 11:40

I specifically asked you to use -dataex- to post the example data and showed you how to go about doing that. But instead you just pasted a listing, which requires "surgery" in order to turn it into a usable example. Please repost using -dataex-. I'm happy to help you solve your problem, but you need to provide me with a usable starting point.

Myriam Pean

Join Date: Apr 2024

Posts: 17
#8

23 Jun 2024, 21:12

Hello Mr. Clyde,

I tried to use the -dataex- command and obtained the sample below, which is close to the one I posted previously. Maybe I didn't use the command correctly. Can you advise, please?

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input long Matricule float(AGEparite1 AGEparite2 AGEparite3 T) byte(_st _d _t0) double _t
2500002 32.8 36.3 .y 15 1 0 0 56.29999923706055
2500002 32.8 36.3 .y 16 0 0 .                 .
2500002 32.8 36.3 .y 17 0 0 .                 .
2500002 32.8 36.3 .y 18 0 0 .                 .
2500002 32.8 36.3 .y 19 0 0 .                 .
2500002 32.8 36.3 .y 20 0 0 .                 .
2500002 32.8 36.3 .y 21 0 0 .                 .
2500002 32.8 36.3 .y 22 0 0 .                 .
2500002 32.8 36.3 .y 23 0 0 .                 .
2500002 32.8 36.3 .y 24 0 0 .                 .
2500002 32.8 36.3 .y 25 0 0 .                 .
2500002 32.8 36.3 .y 26 0 0 .                 .
2500002 32.8 36.3 .y 27 0 0 .                 .
2500002 32.8 36.3 .y 28 0 0 .                 .
2500002 32.8 36.3 .y 29 0 0 .                 .
2500002 32.8 36.3 .y 30 0 0 .                 .
2500002 32.8 36.3 .y 31 0 0 .                 .
2500002 32.8 36.3 .y 32 0 0 .                 .
2500002 32.8 36.3 .y 33 0 0 .                 .
2500002 32.8 36.3 .y 34 0 0 .                 .
2500002 32.8 36.3 .y 35 0 0 .                 .
2500002 32.8 36.3 .y 36 0 0 .                 .
2500002 32.8 36.3 .y 37 0 0 .                 .
2500002 32.8 36.3 .y 38 0 0 .                 .
2500002 32.8 36.3 .y 39 0 0 .                 .
end
label values T T
label def T 15 "15-16 ans", modify
label def T 16 "16-17 ans", modify
label def T 17 "17-18 ans", modify
label def T 18 "18-19 ans", modify
label def T 19 "19-20 ans", modify
label def T 20 "20-21 ans", modify
label def T 21 "21-22 ans", modify
label def T 22 "22-23 ans", modify
label def T 23 "23-24 ans", modify
label def T 24 "24-25 ans", modify
label def T 25 "25-26 ans", modify
label def T 26 "26-27 ans", modify
label def T 27 "27-28 ans", modify
label def T 28 "28-29 ans", modify
label def T 29 "29-30 ans", modify
label def T 30 "30-31 ans", modify
label def T 31 "31-32 ans", modify
label def T 32 "32-33 ans", modify
label def T 33 "33-34 ans", modify
label def T 34 "34-35 ans", modify
label def T 35 "35-36 ans", modify
label def T 36 "36-37 ans", modify
label def T 37 "37-38 ans", modify
label def T 38 "38-39 ans", modify
label def T 39 "39-40 ans", modify


Clyde Schechter

Join Date: Apr 2014

Posts: 29788
#9

23 Jun 2024, 22:26

Well, if it were just a matter of changing the value of _d, this would be very simple. But then you would have an invalid -stset- of your data and when you tried to use it for any of the -st- commands they would either break altogether, or give you erroneous results. What I believe you really want to do is -stset- this data so as to allow for multiple failures (giving birth events) for each subject. That's a little bit more complicated, but not all that hard:

Code:

stset, clear
gen byte year_giving_birth = 0
forvalues i = 1/3 {
    replace year_giving_birth = 1 if floor(AGEparite`i') == T
}
stset T, id(Matricule) failure(year_giving_birth == 1) exit(time .)
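To see how this behaves, here is a minimal sketch on a made-up data set of two women followed from age 15 to 20. The identifiers and birth ages are hypothetical; the variable names follow the example data above.

```stata
// Hypothetical data: woman 1 gives birth at 16.4 and 18.9, woman 2 at 17.2
clear
input long Matricule float(AGEparite1 AGEparite2 AGEparite3)
1 16.4 18.9 .
2 17.2 .    .
end

// Six one-year records per woman: T = 15, 16, ..., 20
expand 6
bysort Matricule: gen T = 14 + _n

// Flag the age interval in which each birth falls
gen byte year_giving_birth = 0
forvalues i = 1/3 {
    replace year_giving_birth = 1 if floor(AGEparite`i') == T
}

// exit(time .) keeps each woman at risk after a birth, allowing repeated events
stset T, id(Matricule) failure(year_giving_birth == 1) exit(time .)
list Matricule T _t0 _t _d, sepby(Matricule)
```

In this sketch the flagged intervals are T == 16 and T == 18 for woman 1 and T == 17 for woman 2. One limitation of flagging by interval: two births falling in the same one-year interval would be recorded as a single event.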
Myriam Pean

Join Date: Apr 2024

Posts: 17
#10

24 Jun 2024, 18:53

Mr. Clyde, thank you very much. Your code accomplishes exactly what I intended to do; I was really struggling with it before.
Myriam Pean

Join Date: Apr 2024

Posts: 17
#11

04 Jul 2024, 21:55

Hello Mr Clyde,

Could you please help me with this issue? I used the following code to generate a graph, but the figure doesn't represent some ages (the first part of the graphic is blind for X axe value 0 to 20), although I specified that it should start from 0. I think it could be an issue with the structure of my data, but I can't see where. Here is the code I used:

sts graph, hazard by(Periode3a_3b) ///
title(" Évolution du risque d'avoir un enfant selon la période ") ///
xtitle("Nombre d'années écoulées depuis le quinzième anniversaire") xscale(range(0 35)) xlabel(0(5)35) ///
legend(col(3) rowgap(*.50) symxsize(*.5) size(*.75)) ///
note("Femmes nées à partir de 1945." ///
"Statistique Canada, Enquête sociale générale sur la famille de 2017.") ///
caption("Lissage par la méthode des fenêtres de Parzen.")
graph save ".gph", replace
graph export ".pdf", replace

In addition, I should mention that I want to perform a risk analysis with a Poisson model. Here is part of the code that I used:

******************** Create the age-class variable named 'T' ********************

gen age_max = min(49, AGEDC)
gen expander = age_max - 14
expand expander
by Matricule, sort: gen T = 14 + _n
forvalues age = 15/50 {
    label define T `age' "`age'-`=`age'+1' ans", add
}
label values T T
drop age_max expander

gen byte agenaissm = 0
forvalues i = 1/3 {
    replace agenaissm = 1 if floor(AGEparite`i') == T
}

******** Prepare the biographical file *******

stset T, id(Matricule) failure(agenaissm == 1) origin(time 14) exit(time 49)

order Matricule _t0 _t _st _d

.
.
.

* Regression with only the difference-in-differences term *
***********************************************************

poisson _d ibn.T i.Periode3a_3b##i.Traitement, exposure(TempsÀRisque) irr vce(cluster Matricule) noconstant

poisson _d c.T c.T#c.T i.Periode3a_3b##i.Traitement, exposure(TempsÀRisque) irr vce(cluster Matricule)

matrix b = get(_b)
matrix V = get(VCE)
matrix V = V*1.58
ereturn post b V
ereturn display, eform(IRR)

Thank you for your help.

Myriam
Clyde Schechter

Join Date: Apr 2014

Posts: 29788
#12

05 Jul 2024, 10:21

I will need more information in order to help you. First, I will need example data (use -dataex-, of course) that is rich enough to run the code for the graph and produces a graph exhibiting the same problem you are encountering. Next, I don't understand "blind for X axe 0 to 20." Please clarify what that means. Perhaps also posting the graph so that the problem can be seen would help.

With regard to the Poisson model, what is your question? You show some code, but you don't say what problem you are having with it.