Hello everyone,

I am working on a synthetic panel constructed from repeated cross-sectional data. The survey is run every two years from 2004 to 2016. My question is whether I need to define the same cohorts for every round.

I restrict the sample to people aged 26 to 50. With this restriction, the oldest cohort in the first round is born in 1954 (2004 - 50 = 1954) and the youngest cohort in the last round is born in 1990 (2016 - 26 = 1990), so the youngest five-year bin is 1986-1990. I built the cohorts as five-year birth-year bins and used the same bins in all rounds so that I can study the age effect. The cohort-bin code is identical for every year and is shown below.

Is this a correct way to do it? I ask because I am getting some weird results. Also, my oldest cohort should end at 1954, but in my code the bins extend back to 1951 because of how the five-year bins line up. Please see the code below and let me know whether this is the right approach.
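To make the arithmetic explicit, here is a minimal sketch that prints the birth-year window implied by the 26 to 50 age restriction in each round. It uses only the survey years and age limits stated above; the local macro names `oldest` and `youngest` are just illustrative.

```stata
* Birth-year window implied by keeping ages 26-50 in each survey round
forvalues y = 2004(2)2016 {
    local oldest   = `y' - 50    // birth year of a 50-year-old in round `y'
    local youngest = `y' - 26    // birth year of a 26-year-old in round `y'
    display "`y': birth years `oldest' to `youngest'"
}
```

For 2004 this gives 1954 to 1978, and for 2016 it gives 1966 to 1990, so each round covers only part of the overall 1954-1990 span.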
This code is for the first round (2004); it is the same for 2006, 2008, 2010, 2012, 2014, and 2016.
Code:
// Age groups: "1" 26-30 years, "2" 31-35 years, "3" 36-40 years, "4" 41-45 years, "5" 46-50 years.
drop if age > 50
drop if age <= 25
gen year = 2004
gen cohort = year - age
summarize cohort, detail
//recode cohort (1974/1978=1) (1969/1973=2) (1964/1968=3) (1959/1963=4) (1954/1958=5), gen(cbin)
recode cohort (1986/1990=1) (1981/1985=2) (1976/1980=3) (1971/1975=4) (1966/1970=5) (1961/1965=6) (1956/1960=7) (1951/1955=8), gen(cbin)
// cbin stands for cohort bin; for c_age we use the median age, 28, from the age range of 26 to 30
gen c_age = 28 if cbin==1
replace c_age = 33 if cbin==2
replace c_age = 38 if cbin==3
replace c_age = 43 if cbin==4
replace c_age = 48 if cbin==5
replace c_age = 53 if cbin==6
replace c_age = 58 if cbin==7
replace c_age = 63 if cbin==8
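As a possible cross-check, and not part of my original code, the sketch below assumes the seven rounds have been appended into one dataset with variables named `year` (survey year) and `age`; those names and the variable `mid_age` are assumptions. It reproduces the same eight birth-year bins with a formula, tabulates which bins are populated in each round, and computes the age of each bin's midpoint birth year in the survey year, which can be compared with the fixed `c_age` values above.

```stata
* Assumption: all rounds are appended, with variables year and age
keep if inrange(age, 26, 50)

gen cohort = year - age

* Same bins as the recode above: 1951-1955 -> 8, ..., 1986-1990 -> 1
gen cbin = 8 - floor((cohort - 1951)/5)

* Age of the bin's midpoint birth year in the given survey year
* (midpoint of bin k is 1951 + 5*(8 - k) + 2, e.g. 1988 for bin 1)
gen mid_age = year - (1951 + 5*(8 - cbin) + 2)

* Which cohort bins are observed in which round
tabulate cbin year
```

With this construction, bin 1 (1986-1990) has a midpoint age of 28 only in 2016, so the midpoint age of a bin changes from round to round.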