
  • School Fixed Effects on Tennessee Student/Teacher Achievement Ratio experiment (Project Star)

    Hello guys,

I have the data from the Tennessee Student/Teacher Achievement Ratio experiment (Project STAR). Over all four years, the sample included 11,600 students from 80 schools, and my current variables are:

gkclasstype: Indicator for the type of class the student attended in kindergarten (you can assume that the student was assigned to this class type in the first year they entered Project STAR), taking the value 1 if the student was in the small class, 2 if they were in the regular class, and 3 if they were in the regular class with a teacher's aide (g1classtype, g2classtype and g3classtype are defined similarly for grades 1, 2 and 3 respectively)

gkclasstype_d1: Dummy variable equal to one if the student was in the small class in kindergarten and zero if they were in any other class type (i.e. the regular or regular aide class) (and similarly for other grades)

gkclasstype_d2: Dummy variable equal to one if the student was in the regular class in kindergarten and zero if they were in any other class type (i.e. the small or regular aide class) (and similarly for other grades)

gkclasstype_d3: Dummy variable equal to one if the student was in the regular aide class in kindergarten and zero if they were in any other class type (i.e. the small or regular class) (and similarly for other grades)

    yearsstar: Number of years the student attended a school participating in Project STAR between kindergarten and grade 3

    yearssmall: Number of years the student attended a small class between kindergarten and grade 3

    gkstdmath: Standardised average maths test scores achieved at the end of kindergarten (derived by subtracting the mean and dividing by the standard deviation of maths test scores amongst all kindergarten students in the sample) (g1stdmath, g2stdmath and g3stdmath are defined similarly for grades 1, 2 and 3)
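For reference, that standardisation can be reproduced in Stata along these lines (gkmath is a hypothetical name for the raw kindergarten maths score; adapt it to the actual variable in your dataset):

```stata
* Standardise raw kindergarten maths scores by subtracting the sample
* mean and dividing by the sample standard deviation
* (gkmath is an assumed variable name for the raw score)
summarize gkmath
generate gkstdmath = (gkmath - r(mean)) / r(sd)
```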

1) I am trying to estimate the effect of being assigned to a small class versus a regular class on standardised kindergarten maths scores. Most research papers state that the random assignment occurred within schools and therefore incorporate school fixed effects. Participating schools had to have at least one class of each type, and I am trying to add school fixed effects to my model. How do I do that? (Should I use LSDV, even though that would mean 79 school dummy variables?)

2) What if randomisation had occurred across schools, and schools had not been required to have at least one class of each size? What would the approach be then?

    Your help is much appreciated. Thank you in advance.

    Regards,
    Jack
    Last edited by Jack Lee; 10 Mar 2016, 16:12.

  • #2
Jack Lee I would go with a mixed effects approach here. There are fairly big differences between (and unfortunately sometimes within) school districts and the schools therein. However, it isn't clear why you would be regressing the baseline on covariates unless you are trying to test whether the randomization was truly random. One of the major difficulties with this study is compliance (e.g., after the first year parents could move their children into other classrooms). If you have data for each year of the study, you may want to include an effect for switching between treatment/control groups, which you could instrument with the original random assignment. Also, you could run into issues with the math scores if you are standardizing them (e.g., an average student who remains average shows 0 change, but the meaning of "average" may not be comparable across grades).
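A minimal sketch of these two ideas in Stata, assuming a numeric school identifier schid (not among the variables listed in the original post):

```stata
* Mixed effects: fixed effect for class-type assignment, random
* intercepts for schools (schid is an assumed school identifier)
mixed gkstdmath gkclasstype_d1 gkclasstype_d3 || schid:

* Compliance: instrument actual years spent in a small class with the
* original random assignment to address post-kindergarten switching
ivregress 2sls g3stdmath (yearssmall = gkclasstype_d1), vce(cluster schid)
```

Both lines are illustrative sketches of the approach described above, not a full specification (covariates and sample restrictions omitted).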



    • #3
      wbuchanan
Thanks for the insight! Yes, I am trying to test whether the randomisation of my data was truly random, as a preliminary analysis. You mentioned a mixed effects approach, but how would I use it to generate a dummy variable for each school to absorb the school effects?

Krueger (1999) captured the clustering of classes within schools by including a dummy variable for each school to absorb the school effect, and I am unsure how he did that.

      Any help would be much appreciated!
      Last edited by Jack Lee; 11 Mar 2016, 05:31.



      • #4
Jack Lee You wouldn't use school fixed effects. Instead you would have student-level fixed effects and school-level random effects. You could still cluster the standard errors by district if you wanted, but this would allow you to estimate and test both the within- and between-school effects. If you do want to go with school fixed effects, an easy way is to use numeric school IDs and include i.school_id in the model; Stata treats each value as a distinct group (i.e., it creates the indicator variables for you automatically at runtime), and this gives you far more flexibility if you want to estimate marginal effects/contrasts after fitting your model. Most of the earlier studies excluded children who changed classrooms over the course of the study (at least one that my wife was forced to read for a course once did), but I've never come across anything that questioned the original assignment mechanism itself.
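The factor-variable route sketched above would look roughly like this (school_id is an assumed numeric school identifier):

```stata
* School fixed effects via factor variables: i.school_id expands into
* the school indicators automatically (school_id is an assumed name)
regress gkstdmath gkclasstype_d1 gkclasstype_d3 i.school_id, vce(cluster school_id)

* Marginal effects / contrasts after fitting
margins, dydx(gkclasstype_d1)
```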



        • #5
It's standard to use school fixed effects in analyzing Project STAR. Student fixed effects won't work because nearly all students were in the same treatment group, by design, for all 4 years of the experiment. You can use xtset to declare school as the panel variable and then run xtreg, fe. You don't need to generate school dummies explicitly.

          If you're using multiple years of data in one model run, then you can cluster the data by student to account for serial correlation. (In theory you could account for serial correlation with the xtregar command, but in practice that command will drop a year of data.)
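In Stata this amounts to something like the following (schid and studentid are assumed identifier names; the multi-year variant assumes the data have been reshaped to long form with hypothetical names stdmath, small and grade, and uses areg rather than xtreg so that clustering at the student level is straightforward):

```stata
* Declare school as the panel dimension, then absorb school fixed
* effects (schid is an assumed school identifier)
xtset schid
xtreg gkstdmath gkclasstype_d1 gkclasstype_d3, fe

* Multiple years stacked in long form: absorb school effects and
* cluster by student to account for serial correlation
* (stdmath, small, grade, studentid are assumed long-form names)
areg stdmath small i.grade, absorb(schid) vce(cluster studentid)
```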



          • #6
paulvonhippel There were a non-trivial number of non-compliers in the experiment, since parents were allowed to have their children transferred into other classrooms after the initial year. Since the original post implied that only a single "cross section" of sorts would be used, student-level fixed effects would be appropriate and fairly standard practice. There would also be no way to account for serial correlation if the OP only looked at what amounts to a single cross section.

If there were going to be any longitudinal analyses, then I agree there would be no student "fixed" effects other than the time indicator; the student-level effects could instead be treated as random effects nested within school- and/or district-level random effects. Student-level autocorrelation could be important to consider in a longitudinal context, but I would still suggest that an autoregressive error term in a mixed effects model would yield better estimates, since it would also adjust for the lack of independence associated with the clustering of students in schools/districts.
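The longitudinal variant described above could be sketched as follows (long-form data assumed; stdmath, small, grade, schid and studentid are all assumed names):

```stata
* Student random effects nested within school random effects, with an
* AR(1) residual structure over grades to handle student-level
* autocorrelation (all variable names are assumed)
mixed stdmath small i.grade || schid: || studentid:, residuals(ar 1, t(grade))
```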



            • #7
              We may be using different terminology. When I say "student fixed effects" I'm referring to the situation where you have a dummy variable for every student.

In a cross-section, you definitely can't use that kind of student fixed effects, because there is only one observation per student.

              You may be using the term "student fixed effects" in a different way.

              (We may not need to pursue this further if the OP has left.)



              • #8
paulvonhippel It seems like it is a terminology difference. I was using "fixed effects" in the context of mixed effects/multilevel/hierarchical models, not as a reference to saturating a single-level model with indicators for the nested entities. In either case, thanks for the clarification.
