  • How can I proceed with this estimate?

    How can I reduce the computational cost of this estimation?

    I am analyzing infant mortality due to pollution. For this purpose I use birth microdata in which the outcome equals 1 for mortality (death within 24 hours of birth, or late fetal death).

    My instrumental variable is the accumulated or average concentration of a particular pollutant (PM2.5 or PM10). My regression also includes meteorological controls and characteristics of the mother.

    The analysis is at the municipal district level (my sample contains about 8,000 municipalities). Each municipality has its own measurement of accumulated pollution over the weeks of gestation of the fetus or newborn, up to the date of death.

    My empirical strategy is implemented with the following Stata settings and regression command:
    set maxvar 30000
    set min_memory 8g
    set niceness 0
    set matsize 10000
    use "C:\ ...", clear
    set dp comma
    describe

    xi: reg birthclassification accpm10 sexofbirth multiplebirth studylevelsmother yearsmother yearsmother2 foreignnationality monthlyrainfall monthlyt2 monthlyraint2 monthlyrainfall2 monthlyt22 i.monthbirth*i.year i.monthbirth*i.capm

    birthclassification: mortality classification (the dependent variable)
    accpm10: accumulated PM10 pollutant
    sexofbirth: sex of the newborn, dummy equal to 1 if the child is female
    multiplebirth: dummy equal to 1 if the birth is multiple
    studylevelsmother: dummy equal to 1 if the mother has higher education
    yearsmother: age of the mother at birth
    yearsmother2: square of the mother's age at birth
    foreignnationality: dummy equal to 1 if mother is native to the corresponding country
    monthlyrainfall: monthly average of accumulated rainfall
    monthlyt2: monthly average of accumulated temperatures
    monthlyraint2: interaction between average rainfall and temperature
    monthlyrainfall2: square of average rainfall
    monthlyt22: square of average temperatures


    Finally, I have included two sets of interactions in my regression. The first is month of birth by year, written "i.monthbirth*i.year", with which I hope to capture seasonal patterns and unobserved heterogeneity. The second is month of birth by municipality, written "i.monthbirth*i.capm", to capture seasonal patterns that differ across municipalities from month to month.
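
    Written as an equation, the regression command above appears to correspond to a linear probability model of roughly the following form. The indices are assumptions introduced for illustration (birth i, municipality c, month of birth m, year t), and capm is taken to be the municipality identifier.

    ```latex
    \[
    y_{icmt} = \alpha + \beta\,\mathrm{accpm10}_{icmt} + X_{i}'\gamma + W_{cmt}'\delta
             + \lambda_{mt} + \theta_{cm} + \varepsilon_{icmt}
    \]
    ```

    Here y is birthclassification, X collects the newborn and mother characteristics, W the rainfall and temperature terms, \lambda_{mt} the month-by-year effects and \theta_{cm} the municipality-by-month effects. With about 8,000 municipalities and 12 months, \theta_{cm} alone implies on the order of 8,000 x 12 = 96,000 indicator variables.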

    My problem is the computation time of this regression. There are 2,623,692 observations in total (the study period is 2009-2016), and the estimation is enormously costly, taking even days. I would like to ask whether my procedure contains some obvious error, or whether the equation I have specified is defective. I also assume that the interaction of the municipalities with the months is what generates the enormous computational cost. If so, is there any way to reduce it? I look forward to your comments so that I can overcome this obstacle.

    Thank you

    JC
    Last edited by Juan Gutierrez; 23 Oct 2024, 06:05.

  • #2
    Juan:
    welcome to this forum.
    Some comments on your query:
    1) since you have an N>T panel dataset, why not use a panel data command, such as -xtreg-, instead of -regress- to analyze your dataset?
    2) the -xi:- prefix is redundant if you use -fvvarlist- notation;
    3) you mention an instrumental variable in your post: do you mean that you detected endogeneity in your regression? Or do you mean instrumental variable=independent variable?
    Kind regards,
    Carlo
    (StataNow 18.5)
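
    Points 1) and 2) could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested solution: it assumes capm is the municipality identifier, and capm_month and ym are hypothetical helper variables that do not exist in the original dataset. The municipality-by-month cells are declared as the panel unit so that -xtreg, fe- removes them by the within transformation instead of estimating each one as an explicit dummy.

    ```stata
    * Sketch only: run from a do-file (uses /// line continuation).
    * Assumption: capm identifies the municipality.
    * capm_month and ym are hypothetical helper variables.
    egen long capm_month = group(capm monthbirth)   // municipality-by-month cell
    egen long ym         = group(year monthbirth)   // year-by-month cell

    * Declare the cell as the panel unit; -xtreg, fe- needs no time variable.
    xtset capm_month

    * Municipality-by-month effects are absorbed, not estimated as dummies.
    * i.ym is factor-variable notation, so no xi: prefix is needed.
    * Levels of ym that are collinear with the absorbed effects are omitted.
    xtreg birthclassification accpm10 sexofbirth multiplebirth ///
        studylevelsmother yearsmother yearsmother2 foreignnationality ///
        monthlyrainfall monthlyt2 monthlyraint2 monthlyrainfall2 monthlyt22 ///
        i.ym, fe
    ```

    The same specification could also be fitted with -areg- and absorb(capm_month). The community-contributed -reghdfe- command (not part of official Stata) is another option for absorbing several sets of fixed effects at once.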


    • #3
      Good evening Carlo, thank you very much for welcoming me to the forum and for your feedback.

      Yes, I will run the estimation with xtreg. I think that using xi is what is making the estimation take so long.

      You are right: in the rush to write the message I wrote "instrumental", but it is actually an independent variable.

      Kind regards

      Juan Carlos
