Repeated Cross sectional regression commands

Francis Lawer

Join Date: Dec 2021

Posts: 5
#1

10 Dec 2021, 14:33

Hello guys, I am working with repeated cross-sectional, health-related data sets on women for 36 countries, with survey years ranging from 1990 to 2018. The data are survey rounds spaced 3 to 9 years apart, and each country has between 1 and 6 rounds. I am expected to create mother-cohort fixed effects based on birth year and country of residence. Another suggestion is that I include country and year fixed effects. My difficulty is how to do this in Stata.
Tags: fixed effects, logit, regression, Suggestion, syntax
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#2

10 Dec 2021, 14:41

From your description, it is impossible to discern whether each observation in your data set is the aggregate results of all surveys for a given country in a given year, or whether you have individual respondent-level data with multiple observations (corresponding to multiple respondents) in each country-year combination. For that matter, it is impossible to tell whether your data are in long or wide layout, or some other arrangement. In short, as is nearly always the case with verbal descriptions, there is insufficient information about your data to go beyond vague, general advice that has little chance of being useful. People are requested to read the Forum FAQ before their first post so that they can benefit from the excellent advice there on how to ask questions in ways that enhance their probability of getting a timely and helpful response. In particular, FAQ #12 would have alerted you to the importance of showing example data using the -dataex- command.

If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.
Francis Lawer

Join Date: Dec 2021

Posts: 5
#3

10 Dec 2021, 19:57

Thanks for your response, Clyde. Specifically, I am working with DHS data, which are individual-level data with multiple observations in each country-year combination. My outcome of interest is caesarean section, a dummy variable, and I am trying to look at how it is influenced by education, wealth and health insurance. I wanted to run an OLS regression with the most recent data from each country. However, it has been suggested that using repeated cross sections would be ideal, hence the need to deal with pseudo-panel estimation. So, specifically, I am looking for the syntax that will help me create the cohorts in each country-year dataset before appending them.

I got this from one of the platforms and wanted to find out if it is the right syntax.

clear
webuse nlswork
* Group birth years: 41-43 into 43, and 54 into 53
gen Byear = birth_yr
recode Byear (41/43=43) (54=53)
tab Byear
tab race
tab year
* Cell means by birth-year group, race and survey year
bysort Byear race year: egen newincome = mean(ln_wage)
bysort Byear race year: egen newgrade = mean(grade)
bysort Byear race year: egen newwks = mean(wks_work)
bysort Byear race year: egen newexp = mean(ttl_exp)
sum ln_wage grade wks_work ttl_exp newincome newgrade newwks newexp
egen Cohorts=group(Byear race) xtset Cohorts xtreg newincome newgrade newwks newexp,fe
estimates store FE1
xtset idcode
xtreg ln_wage grade wks_work ttl_exp,fe
estimates store FE2
esttab FE1 FE2

The example data below come from a single country-year.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(v106 v025 v190) int b2_11
0 2 1 .
0 2 1 .
0 2 1 .
1 2 1 .
0 2 1 .
0 2 1 .
0 2 1 .
0 2 1 .
0 2 1 .
0 2 2 .
1 2 2 .
0 2 1 .
0 2 1 .
1 2 1 .
0 2 1 .
0 2 1 .
0 2 1 .
0 2 1 .
1 2 1 .
1 1 2 .
2 1 5 .
0 1 3 .
1 1 3 .
1 1 2 .
2 1 5 .
2 1 3 .
0 1 3 .
1 1 4 .
1 1 3 .
0 1 4 .
1 1 4 .
0 1 3 .
1 1 3 .
1 1 3 .
1 1 4 .
1 1 4 .
0 1 4 .
2 1 4 .
2 1 3 .
2 1 4 .
1 1 4 .
1 1 2 .
0 1 3 .
1 1 3 .
1 1 4 .
2 1 5 .
3 1 5 .
2 1 5 .
3 1 5 .
0 1 3 .
2 1 4 .
1 1 4 .
1 1 3 .
0 1 5 .
2 1 5 .
1 1 4 .
2 1 4 .
1 1 4 .
1 1 4 .
2 1 4 .
2 1 4 .
1 1 4 .
2 1 3 .
0 1 3 .
2 1 4 .
2 1 5 .
3 1 5 .
2 1 5 .
1 1 5 .
1 1 3 .
2 1 3 .
2 1 4 .
1 1 3 .
0 2 1 .
1 2 2 .
1 2 2 .
0 2 1 .
0 2 2 .
0 2 3 .
0 2 2 .
0 2 3 .
0 2 2 .
2 2 2 .
0 2 1 .
0 2 2 .
2 2 3 .
0 2 1 .
1 2 1 .
1 2 3 .
1 2 3 .
0 2 3 .
0 2 2 .
0 2 2 .
1 2 2 .
1 2 2 .
2 2 2 .
0 2 3 .
0 2 3 .
0 2 1 .
1 2 1 .
end
label values v106 V106
label def V106 0 "no education", modify
label def V106 1 "primary", modify
label def V106 2 "secondary", modify
label def V106 3 "higher", modify
label values v025 V025
label def V025 1 "urban", modify
label def V025 2 "rural", modify
label values v190 V190
label def V190 1 "poorest", modify
label def V190 2 "poorer", modify
label def V190 3 "middle", modify
label def V190 4 "richer", modify
label def V190 5 "richest", modify

Last edited by Francis Lawer; 10 Dec 2021, 20:03.
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#4

11 Dec 2021, 11:44

The code you show seems an appropriate model to follow. One small point: the line that begins -egen Cohorts = ...- has three commands on it. That's not legal in Stata. Each command will need to start on a new line.
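
The point about the -egen Cohorts- line can be sketched as follows, using the same nlswork example and cohort definitions as in #3. Each of the three commands now sits on its own line; the setup lines are repeated only so that the block runs on its own. This is an illustration of the fix, not a statement about how the original source of the code was laid out.

```stata
* Setup repeated from #3 so that this block is self-contained
webuse nlswork, clear
gen Byear = birth_yr
recode Byear (41/43=43) (54=53)
bysort Byear race year: egen newincome = mean(ln_wage)
bysort Byear race year: egen newgrade = mean(grade)
bysort Byear race year: egen newwks = mean(wks_work)
bysort Byear race year: egen newexp = mean(ttl_exp)

* The three commands from the -egen Cohorts- line, one per line
egen Cohorts = group(Byear race)
xtset Cohorts
xtreg newincome newgrade newwks newexp, fe
estimates store FE1
```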

Francis Lawer

Join Date: Dec 2021

Posts: 5
#5

01 Mar 2022, 02:12

On this same data (Demographic and Health Survey data covering 36 Sub-Saharan African countries over the study period), I read about someone using the birth histories of women and children to form a panel of mothers. I would like to know whether DHS data comprising 110 country-year rounds can be converted into a panel using the birth-history information, given that it is a repeated cross-sectional dataset. This question is related to my initial one.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(caesarean_delivery part_educat_years covered_health_insurance)
. 11 .
0  0 .
0  6 .
.  . .
.  6 .
. 15 .
0  7 .
0  . .
0  4 .
0  6 .
0  5 .
.  7 .
0  5 .
.  5 .
.  . .
0  0 .
0  0 .
1 12 .
. 11 .
0  . .
. 11 .
0  0 .
.  0 .
0  5 .
. 11 .
.  9 .
0  0 .
.  . .
0  0 .
0  . .
0 11 .
0  0 .
0  . .
. 12 .
.  . .
. 12 .
1  9 .
0  0 .
0  0 .
0 11 .
0 10 .
.  4 .
.  2 0
0  0 .
. 10 .
0  . .
.  . .
.  9 .
.  . .
0 12 .
.  0 .
0  0 .
. 17 .
0  0 .
0  2 0
1  0 .
0  . .
0  0 .
0  0 .
0 12 .
.  . 0
. 15 0
0  . .
0  0 1
0  0 0
0 16 0
.  0 0
0  0 0
.  7 0
0 12 .
.  2 0
. 16 1
.  . 0
. 11 0
0 11 0
.  . 0
0  5 .
0  3 .
.  . 0
0 12 0
0  0 0
0  4 0
0  . 0
.  0 .
0  0 0
0  5 1
0 12 0
. 12 1
.  0 0
.  6 0
0  7 1
.  . 0
0  6 0
0 10 1
0  . .
. 12 0
.  6 0
0 12 0
.  6 .
0 11 0
end
label values caesarean_delivery m17_1
label def m17_1 0 "no", modify
label def m17_1 1 "yes", modify
label values part_educat_years v715
label values covered_health_insurance LABK
label def LABK 0 "no", modify
label def LABK 1 "yes", modify


Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#6

01 Mar 2022, 10:54

I'm not familiar with the DHS, other than noticing that many people who ask questions on Statalist work with it. But if, as you say, it has a cross-sectional design, there is no possibility of extracting a panel from it. By design, different people will be sampled at each wave, and while there will be a few people who, by chance, end up being sampled more than once, that subset of people, even if you could identify them, would be too small to be useful.
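
Although individuals cannot be followed across rounds, the cohort approach discussed in #3 and #4 follows groups rather than people. Below is a minimal sketch of that idea under stated assumptions: the data are simulated stand-ins for appended DHS rounds, and every variable name (country, survey_year, birth_year, educ_years, insured, caesarean) and the five-year cohort width are hypothetical choices made for illustration, not DHS variable names or a recommended specification.

```stata
* Toy stand-in for appended survey rounds; all names and values are hypothetical
clear
set seed 12345
set obs 6000
gen country     = ceil(3 * runiform())
gen survey_year = 2000 + 5 * ceil(3 * runiform())
gen birth_year  = 1960 + floor(30 * runiform())
gen educ_years  = floor(13 * runiform())
gen insured     = runiform() < 0.3
gen caesarean   = runiform() < 0.05 + 0.01 * educ_years

* Cohorts: five-year birth bands within country (band width is an assumption)
gen birth_band = 5 * floor(birth_year / 5)
egen cohort = group(country birth_band)

* One row per cohort and survey year: cell means plus the cell size
gen n_cell = 1
collapse (mean) caesarean educ_years insured (sum) n_cell, ///
    by(cohort country survey_year)

* Cohort fixed effects with survey-year dummies on the cell means
xtset cohort survey_year
xtreg caesarean educ_years insured i.survey_year, fe vce(cluster cohort)
```

Because each cohort is nested within a country, the cohort fixed effects already absorb country fixed effects, so only the survey-year dummies are added explicitly. With real survey data, sampling weights and small cells would also need attention.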