  • Using a binary variable and a continuous variable to calculate the effect of COVID on students' achievements

    Hi guys,

    I am using the "mixed" command to estimate the effect of COVID-19 on students' achievements. I am trying to use two variables to represent COVID: one is binary (Post_COVID: 0 for before the epidemic, 1 for during the epidemic), and the other is continuous (SchoolClosureDays: the number of days of COVID-induced school closure).

    Binary variable: represents a dichotomous outcome. It indicates whether students were affected by school closure (at least one day) versus not at all, which simplifies the complexity of the situation.

    Continuous variable: represents a more nuanced view of the relationship. It captures the impact of incremental changes (each additional day of closure) on student achievement, allowing for a detailed understanding of how longer closures progressively affect outcomes.

    Can I include both the continuous variable and the binary variable in the same model? I am afraid their effects may overlap, meaning they might capture similar aspects of the pandemic's impact. Is there a good way to check whether the effects of the two variables are independent? Or any thoughts on whether it is reasonable to include both in the model? Thank you.

    This is the command I use:

    mixed Achievements Time SchoolClosureDays ///
        i.Post_COVID##i.Gender i.Post_COVID##i.IMMIG i.Post_COVID##i.SES ///
        c.SchoolClosureDays#i.Gender c.SchoolClosureDays#i.IMMIG c.SchoolClosureDays#i.SES ///
        i.Gender#c.Time i.IMMIG#c.Time i.SES#c.Time ///
        || CNTRYID: Time, covariance(unstructured) nolog vce(robust)

    Gender (binary, 0 = girls), IMMIG (immigration background: 0 = native, 1 = second-generation, 2 = first-generation), SES (socio-economic background: 0 = low SES, 1 = medium SES, 2 = high SES), Time (continuous, 2003-2023), CNTRYID = country identifier.

    Yin


  • #2
    It is clear that the dichotomous and continuous variables you speak of cannot possibly be independent. If the dichotomous variable is zero then the continuous one must also be. QED. So, yes, the two will share some variance and it will be difficult to interpret the results if you use both.
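
    The overlap can be inspected directly in the data. The following is only a minimal sketch using the variable names from #1; it assumes SchoolClosureDays is recorded as 0 for every pre-pandemic observation, which has not been verified here.

    ```stata
    * Distribution of closure days within each level of the binary indicator.
    * If the Post_COVID == 0 row shows mean 0 and SD 0, the continuous
    * variable carries no information in the pre-pandemic period.
    tabulate Post_COVID, summarize(SchoolClosureDays)

    * Pairwise correlation between the two COVID measures
    correlate Post_COVID SchoolClosureDays

    * Spread of closure days among post-COVID observations only;
    * little spread here means the two variables are close to collinear
    summarize SchoolClosureDays if Post_COVID == 1, detail
    ```

    A high correlation together with little variation in closure days during the pandemic period would suggest that the two coefficients cannot be separated well in a single model.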

    In my mind, neither specification seems plausible. While I won't say that a single day of school closure cannot possibly have an effect on school achievements, I think it is not plausible that such an effect could be detected with the kind of measures that are available in a dataset that is small enough to be feasible to obtain. After all, one-day school closures are not uncommon in any country in any year due to extreme weather events or threats to public safety. And such things continued to happen during the pandemic. On the other hand, the idea that the impact on achievements is linear in the number of days also seems implausible. While there would be some initial effect from closures, at some point I would expect to see diminishing "returns" from longer closures so that the overall relationship between achievement deficits and duration of closures would be non-linear (sub-linear). In other words, I don't think an 8-month closure would be twice as deleterious as a 4-month closure. Now, I'm not an expert in childhood education, so my intuitions about this could be wrong. But if I were working on this project, I would do some graphical exploration of the data to get a sense of the shape of the deficit:duration curve and then choose a specification that is capable of reflecting that.
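
    One way such a graphical exploration might start is sketched below, again with the variable names from #1. The quadratic term at the end is only an illustrative example of a non-linear specification, not a recommendation; the appropriate functional form should come from what the graph shows. The reduced model omits the interaction terms purely to keep the sketch short.

    ```stata
    * Scatter of achievement against closure duration with a lowess
    * smoother, restricted to pandemic-period observations
    twoway (scatter Achievements SchoolClosureDays, msize(tiny)) ///
           (lowess Achievements SchoolClosureDays)               ///
           if Post_COVID == 1

    * Illustrative only: let the effect of closure days bend by adding
    * a squared term through factor-variable notation
    mixed Achievements Time c.SchoolClosureDays##c.SchoolClosureDays ///
        || CNTRYID: Time, covariance(unstructured) nolog vce(robust)
    ```

    With a very large number of observations the lowess smoother can be slow, so it may be more practical to run it on a random subsample or on country-level means.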

    As an aside, you have a really large number of interaction terms in your proposed model. Is your data set large enough to withstand that many model degrees of freedom without massively over-fitting noise in the data?
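
    A rough way to gauge this is to compare the number of parameters with the number of observations, countries, and observations per subgroup cell. The sketch below assumes the variables from #1; country_tag is a hypothetical helper variable introduced only for counting.

    ```stata
    * Total number of observations
    count

    * Number of countries (level-2 clusters): tag one row per country
    egen byte country_tag = tag(CNTRYID)
    count if country_tag

    * Cell sizes for the subgroup combinations the interactions rely on,
    * shown separately for the pre- and post-COVID periods
    bysort Post_COVID: tabulate IMMIG SES
    ```

    Sparse cells, or a small number of countries relative to the number of estimated coefficients, would be a sign that the model is asking more of the data than it can support.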



    • #3
      Thank you for your advice, Clyde. It is very helpful for me👍👍👍.
