
  • Repeated Cross sectional regression commands

    Hello guys, I am working with repeated cross-sectional data sets on women (health related) for 36 countries, with years ranging from 1990 to 2018. The data are rounds of surveys with intervals ranging from 3 to 9. The number of rounds per country ranges from 1 to 6. I am expected to create mother-cohort fixed effects based on birth year and country of residence. Another suggestion is that I incorporate country and year fixed effects. My difficulty is how to do this in Stata.

  • #2
    From your description, it is impossible to discern whether each observation in your data set is the aggregate results of all surveys for a given country in a given year, or whether you have individual respondent-level data with multiple observations (corresponding to multiple respondents) in each country-year combination. For that matter, it is impossible to tell whether your data are in long or wide layout, or some other arrangement. In short, as is nearly always the case with verbal descriptions, there is insufficient information about your data to go beyond vague, general advice that has little chance of being useful. People are requested to read the Forum FAQ before their first post so that they can benefit from the excellent advice there on how to ask questions in ways that enhance their probability of getting a timely and helpful response. In particular, FAQ #12 would have alerted you to the importance of showing example data using the -dataex- command.

    If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
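As a rough illustration of the workflow described above, a minimal sketch using the `auto` dataset that ships with Stata (the choice of dataset, variables, and observation range is an assumption made purely for demonstration):

```stata
* Load a small built-in dataset purely for demonstration
sysuse auto, clear

* Print a copy-and-paste-ready example of three variables,
* restricted to the first five observations
dataex make price mpg in 1/5
```

The block that -dataex- prints in the Results window can then be pasted into a post between code delimiters.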

    When asking for help with code, always show example data. When showing example data, always use -dataex-.



    • #3
      Thanks for your response, Clyde. Specifically, I am working with DHS data, which are individual-level data with multiple observations in each country-year combination. My variable of focus is caesarean section, a dummy variable, and I am trying to look at how it is influenced by education, wealth and health insurance. I wanted to run an OLS regression with the most recent data from each country. However, it has been suggested that using repeated cross sections would be ideal, hence the need to deal with pseudo-panel estimation. So, specifically, I am looking for the syntax that will help me create the cohort for each of the datasets I have for each country-year before going ahead to append.

      I got this from one of the platforms and wanted to find out if it is the right syntax.

      clear
      webuse nlswork
      gen Byear= birth_yr
      recode Byear (41/43=43) (54=53)
      tab Byear
      tab race
      tab year
      bysort Byear race year: egen newincome= mean(ln_wage)
      bysort Byear race year: egen newgrade= mean( grade )
      bysort Byear race year: egen newwks= mean( wks_work )
      bysort Byear race year: egen newexp= mean(ttl_exp)
      sum ln_wage grade wks_work ttl_exp newincome newgrade newwks newexp
      egen Cohorts=group(Byear race) xtset Cohorts xtreg newincome newgrade newwks newexp,fe
      estimates store FE1
      xtset idcode
      xtreg ln_wage grade wks_work ttl_exp,fe
      estimates store FE2
      esttab FE1 FE2

      What follows relates to a country-year.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte(v106 v025 v190) int b2_11
      0 2 1 .
      0 2 1 .
      0 2 1 .
      1 2 1 .
      0 2 1 .
      0 2 1 .
      0 2 1 .
      0 2 1 .
      0 2 1 .
      0 2 2 .
      1 2 2 .
      0 2 1 .
      0 2 1 .
      1 2 1 .
      0 2 1 .
      0 2 1 .
      0 2 1 .
      0 2 1 .
      1 2 1 .
      1 1 2 .
      2 1 5 .
      0 1 3 .
      1 1 3 .
      1 1 2 .
      2 1 5 .
      2 1 3 .
      0 1 3 .
      1 1 4 .
      1 1 3 .
      0 1 4 .
      1 1 4 .
      0 1 3 .
      1 1 3 .
      1 1 3 .
      1 1 4 .
      1 1 4 .
      0 1 4 .
      2 1 4 .
      2 1 3 .
      2 1 4 .
      1 1 4 .
      1 1 2 .
      0 1 3 .
      1 1 3 .
      1 1 4 .
      2 1 5 .
      3 1 5 .
      2 1 5 .
      3 1 5 .
      0 1 3 .
      2 1 4 .
      1 1 4 .
      1 1 3 .
      0 1 5 .
      2 1 5 .
      1 1 4 .
      2 1 4 .
      1 1 4 .
      1 1 4 .
      2 1 4 .
      2 1 4 .
      1 1 4 .
      2 1 3 .
      0 1 3 .
      2 1 4 .
      2 1 5 .
      3 1 5 .
      2 1 5 .
      1 1 5 .
      1 1 3 .
      2 1 3 .
      2 1 4 .
      1 1 3 .
      0 2 1 .
      1 2 2 .
      1 2 2 .
      0 2 1 .
      0 2 2 .
      0 2 3 .
      0 2 2 .
      0 2 3 .
      0 2 2 .
      2 2 2 .
      0 2 1 .
      0 2 2 .
      2 2 3 .
      0 2 1 .
      1 2 1 .
      1 2 3 .
      1 2 3 .
      0 2 3 .
      0 2 2 .
      0 2 2 .
      1 2 2 .
      1 2 2 .
      2 2 2 .
      0 2 3 .
      0 2 3 .
      0 2 1 .
      1 2 1 .
      end
      label values v106 V106
      label def V106 0 "no education", modify
      label def V106 1 "primary", modify
      label def V106 2 "secondary", modify
      label def V106 3 "higher", modify
      label values v025 V025
      label def V025 1 "urban", modify
      label def V025 2 "rural", modify
      label values v190 V190
      label def V190 1 "poorest", modify
      label def V190 2 "poorer", modify
      label def V190 3 "middle", modify
      label def V190 4 "richer", modify
      label def V190 5 "richest", modify
      Last edited by Francis Lawer; 10 Dec 2021, 20:03.



      • #4
        The code you show seems an appropriate model to follow. One small point: the line that begins -egen Cohorts = ...- has three commands on it. That's not legal in Stata. Each command will need to start on a new line.
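To make the point about one command per line concrete, here is a minimal sketch of the cohort-building part of the code from #3, using the same `nlswork` example data. It is only an illustration of layout, not a recommendation about the model itself, and it needs an internet connection for -webuse-.

```stata
* Load the example data
webuse nlswork, clear

* Birth-year variable, with sparse years folded into their neighbours
gen Byear = birth_yr
recode Byear (41/43=43) (54=53)

* Cohort-by-year means of the outcome and the regressors
bysort Byear race year: egen newincome = mean(ln_wage)
bysort Byear race year: egen newgrade  = mean(grade)
bysort Byear race year: egen newwks    = mean(wks_work)
bysort Byear race year: egen newexp    = mean(ttl_exp)

* Each of these three commands starts on its own line
egen Cohorts = group(Byear race)
xtset Cohorts
xtreg newincome newgrade newwks newexp, fe
estimates store FE1
```

In a do-file, the newline is what ends a command, which is why several commands cannot share a line unless -#delimit ;- is in effect.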



        • #5
          On this same data (Demographic and Health Survey data involving 36 Sub-Saharan African countries for the period), I read a piece about someone using the birth histories of women and children to form a panel of mothers. I want to find out whether DHS data involving 110 country-year rounds can be converted into a panel using the information on birth history, given that it is a repeated cross-sectional dataset. This question is linked to the initial question.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input byte(caesarean_delivery part_educat_years covered_health_insurance)
          . 11 .
          0  0 .
          0  6 .
          .  . .
          .  6 .
          . 15 .
          0  7 .
          0  . .
          0  4 .
          0  6 .
          0  5 .
          .  7 .
          0  5 .
          .  5 .
          .  . .
          0  0 .
          0  0 .
          1 12 .
          . 11 .
          0  . .
          . 11 .
          0  0 .
          .  0 .
          0  5 .
          . 11 .
          .  9 .
          0  0 .
          .  . .
          0  0 .
          0  . .
          0 11 .
          0  0 .
          0  . .
          . 12 .
          .  . .
          . 12 .
          1  9 .
          0  0 .
          0  0 .
          0 11 .
          0 10 .
          .  4 .
          .  2 0
          0  0 .
          . 10 .
          0  . .
          .  . .
          .  9 .
          .  . .
          0 12 .
          .  0 .
          0  0 .
          . 17 .
          0  0 .
          0  2 0
          1  0 .
          0  . .
          0  0 .
          0  0 .
          0 12 .
          .  . 0
          . 15 0
          0  . .
          0  0 1
          0  0 0
          0 16 0
          .  0 0
          0  0 0
          .  7 0
          0 12 .
          .  2 0
          . 16 1
          .  . 0
          . 11 0
          0 11 0
          .  . 0
          0  5 .
          0  3 .
          .  . 0
          0 12 0
          0  0 0
          0  4 0
          0  . 0
          .  0 .
          0  0 0
          0  5 1
          0 12 0
          . 12 1
          .  0 0
          .  6 0
          0  7 1
          .  . 0
          0  6 0
          0 10 1
          0  . .
          . 12 0
          .  6 0
          0 12 0
          .  6 .
          0 11 0
          end
          label values caesarean_delivery m17_1
          label def m17_1 0 "no", modify
          label def m17_1 1 "yes", modify
          label values part_educat_years v715
          label values covered_health_insurance LABK
          label def LABK 0 "no", modify
          label def LABK 1 "yes", modify



          • #6
            I'm not familiar with the DHS, other than noticing that many people who ask questions on Statalist work with it. But if, as you say, it has a cross-sectional design, there is no possibility of extracting a panel from it. By design, different people will be sampled at each wave, and while there will be a few people who, by chance, end up being sampled more than once, that subset of people, even if you could identify them, would be too small to be useful.
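Although individual women cannot be followed across rounds, pooled cross-sections can still carry the mother-cohort fixed effects asked about at the start of the thread. What follows is a minimal sketch on simulated data, not on the DHS: every variable name (`country`, `svy_year`, `birth_yr`, `educ`, `csection`), the 5-year birth bands, and the use of -areg- are assumptions made for illustration, and -runiformint()- requires Stata 14 or later.

```stata
* Hypothetical sketch on simulated data; all variable names are made up
clear
set seed 12345
set obs 3000

gen country  = runiformint(1, 3)              // three pretend countries
gen svy_year = 2000 + 5*runiformint(0, 2)     // survey rounds: 2000, 2005, 2010
gen age      = runiformint(15, 49)            // age at interview
gen birth_yr = svy_year - age                 // mother's birth year
gen educ     = runiformint(0, 3)              // education level, 0 to 3
gen csection = runiform() < 0.05 + 0.03*educ  // simulated 0/1 outcome

* Group birth years into 5-year bands so cohort cells are not too thin
gen birth_band = 5*floor(birth_yr/5)

* Mother cohort = country x birth-year band
egen cohort = group(country birth_band)

* Linear probability model with cohort and survey-year fixed effects
areg csection i.educ i.svy_year, absorb(cohort) vce(cluster cohort)
```

Because each cohort sits inside a single country, the cohort fixed effects already absorb country fixed effects, so a separate `i.country` term would be dropped as collinear.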
