Dear Statalist Community,
I am currently undertaking a research project on labor market participation and outcomes in Kosovo, with a specific emphasis on gender disparities. My objective is to measure labor market outcomes separately for men and women, investigate the gender pay gap, and explore whether parenthood is associated with penalties (the motherhood penalty) or premiums (the fatherhood premium). Because Kosovo has implemented EU-SILC only since 2018, I plan to merge the separate data files (H, D, P, and R) into one unified dataset per year (2018, 2019, 2020, 2021) and then run regressions, including probit models, Mincer wage equations, and Oaxaca decompositions.
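A minimal sketch of how the per-year merge could look, on made-up data rather than the real files. It assumes the usual EU-SILC ID names (RB030/PB030 for the person, RX030/DB030/HB030 for the household), which should be checked against the actual files; `hid`, `d_var`, `h_var`, `p_var`, and `year` are hypothetical names used only for illustration.

```stata
* Toy D-file (household register): one row per household
clear
input long DB030 byte d_var
100 1
200 2
end
rename DB030 hid
tempfile dfile
save `dfile'

* Toy H-file (household data): one row per household
clear
input long HB030 double h_var
100 5200
200 7400
end
rename HB030 hid
tempfile hfile
save `hfile'

* Toy P-file (persons aged 16+): one row per interviewed person
clear
input long PB030 byte p_var
10001 1
10002 2
20001 1
end
rename PB030 RB030
tempfile pfile
save `pfile'

* Toy R-file (all household members) is the master file
clear
input long(RB030 RX030)
10001 100
10002 100
10003 100
20001 200
end
rename RX030 hid

* Person-level merge first, then household-level merges
merge 1:1 RB030 using `pfile', keep(master match) generate(_m_p)
merge m:1 hid using `dfile', keep(master match) generate(_m_d)
merge m:1 hid using `hfile', keep(master match) generate(_m_h)
gen int year = 2021
```

Using the R-file as the master keeps household members under 16, who have no P-file record (`_m_p == 1`); they are needed later to identify children. The same steps would be repeated for each year before appending, keeping `year` alongside the IDs in case IDs are reused across waves.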
I am facing challenges with demographic variables, particularly in determining the parental status of individuals. There is no direct variable indicating parental status, so I want to generate one for all individuals, additionally flagging those with children under 25 and under 30. I also need the number of children for each individual.
To provide context, I have a subsample, shown with 'dataex', of RB220 (father ID) and RB230 (mother ID); PB030 is the personal ID.
The EU-SILC 2021 guidelines shed light on these variables; the relevant entries are listed after the data example.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str6(RB220 RB230)
" " " "
"10001" "10002"
" " " "
"170001" "170002"
" " " "
"170003" "170005"
"170003" "170005"
" " " "
"170001" "170002"
" " " "
" " " "
" " " "
" " " "
"200001" "200002"
" " " "
"200003" " "
"200003" " "
"200003" "200004"
" " " "
" " " "
" " " "
" " " "
"30003" "30004"
"60003" "60004"
"60001" " "
" " " "
" " " "
"60001" " "
"200003" "200004"
"200003" ""
"" ""
"" ""
"200001" "200002"
"200003" ""
"" ""
"" ""
"" ""
"" ""
"" ""
"30003" "30004"
"" ""
"" ""
"" ""
"" ""
end
- PB030: PERSONAL ID
  - Topic: Technical items / Identification
  - Variable Type: Annual
  - Unit: All current household members aged 16 and over
  - Reference Period: Constant
  - Mode of Collection: Frame, register, or interviewer
  - In Use Since: First year of EU-SILC data collection
- RB220: FATHER ID (equivalent to PB160)
  - Topic: Person and household characteristics / Demography
  - Variable Type: Annual
  - Unit: All current household members (of any age)
  - Reference Period: Current
  - Mode of Collection: Derived
  - In Use Since: First year of EU-SILC data collection
  - Series’ Differences: From 2021 onwards, foster fathers are excluded
- RB230: MOTHER ID (equivalent to PB170)
  - Topic: Person and household characteristics / Demography
  - Variable Type: Annual
  - Unit: All current household members (of any age)
  - Reference Period: Current
  - Mode of Collection: Derived
  - In Use Since: First year of EU-SILC data collection
  - Series’ Differences: From 2021 onwards, foster mothers are excluded
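Given these definitions, one possible way to build the parenthood variables is to turn each child's RB220/RB230 into child-parent links and count the links per parent. The following is a minimal sketch on made-up data, not a tested solution for the real files. It assumes the R-file person ID is RB030 and that `age` stands in for whatever age variable is available; `father_id`, `mother_id`, `n_children`, `n_children_u25`, `n_children_u30`, and `parent` are hypothetical names. Because RB220 and RB230 point to other household members, this can only count children living in the same household as the parent.

```stata
* Toy data: string parent IDs with blanks, as in the -dataex- example
clear
input long RB030 str6(RB220 RB230) int age
10001 " " " " 45
10002 " " " " 43
10003 "10001" "10002" 12
10004 "10001" "10002" 27
20001 " " " " 58
20002 "20001" " " 31
30001 " " " " 38
end

* Numeric parent IDs; blank or empty strings become missing
gen long father_id = real(strtrim(RB220))
gen long mother_id = real(strtrim(RB230))
tempfile people kids
save `people'

* One row per child-parent link
keep RB030 age father_id mother_id
rename RB030 child_id
rename father_id parent_id1
rename mother_id parent_id2
reshape long parent_id, i(child_id) j(which)
drop if missing(parent_id)

* Count co-resident children per parent, in total and by age band
gen byte kid_u25 = age < 25
gen byte kid_u30 = age < 30
collapse (count) n_children=child_id (sum) n_children_u25=kid_u25 n_children_u30=kid_u30, by(parent_id)
rename parent_id RB030
save `kids'

* Attach the counts to every person; people without links get zero
use `people', clear
merge 1:1 RB030 using `kids', keep(master match) nogenerate
foreach v of varlist n_children n_children_u25 n_children_u30 {
    replace `v' = 0 if missing(`v')
}
gen byte parent = n_children > 0
```

In this toy data, persons 10001 and 10002 each end up with two children (one under 25, two under 30), 20001 has one child aged 31 and so none under 30, and 30001 has none. With several years pooled, the year (and country, if relevant) would need to be added to `i()`, `by()`, and the merge key. From 2021 onwards the counts exclude foster children, per the series break noted in the guidelines.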
Any help with the syntax for creating these two variables (parental status and number of children), or guidance on how to approach this, would be greatly appreciated!
If anyone has the time or needs more info, I am providing the link to the methodological guidelines here: https://circabc.europa.eu/sd/a/f8853...09.12.2020.pdf
Thank you in advance for your expertise and support!