  • Fixed effects and conditional outcome

    Hi Statalisters,
    I read a paper examining how assessor fixed effects affect an outcome. The authors conduct an experiment in which 20 assessors are randomly assigned to 2,000 students. Students may refuse to answer. The first outcome is whether the student joins or not; the number of observations for this outcome is 2,000. The second outcome is how many questions each student answers; the number of observations for this outcome is 1,890 (210 students refuse to join).

    Regressing the first outcome on 19 assessor dummies (one is omitted to avoid multicollinearity) is clear to me.
    But the second outcome is conditional (it is observed only when the first outcome is yes), so it may not represent all assessors equally. If most of the estimates of the 19 dummies are significant, is it correct to claim that the assessors have some effect on the second outcome? I worry about potential selection bias here.
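
    In case concrete syntax helps, the two regressions might look like this in Stata, with hypothetical variable names joined (first outcome), n_answered (second outcome), and assessor (assessor id, 1-20):

    Code:
        regress joined i.assessor
        regress n_answered i.assessor if joined == 1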

    Thanks in advance.

  • #2
    But the second outcome is conditional (it is observed only when the first outcome is yes), so it may not represent all assessors equally. If most of the estimates of the 19 dummies are significant, is it correct to claim that the assessors have some effect on the second outcome? I worry about potential selection bias here.
    You are correct to worry about selection bias here. If you want to make a claim that generalizes to the universe of all assessors and students, you can't really do that with this study design because we don't know how to account for the non-participation of just over 10% of the students. Note, by the way, that the selection bias may refer to the students as well as to the assessors, or, even more likely in reality, some interaction of the assessors and students.

    If you are willing to assume that the non-participation of students is independent of the (unobservable) rating they would receive if they had participated, then those data points are missing completely at random and the simple analysis you described is unbiased. But that is unlikely to be true. One step removed from that, if you can believe that conditional on other observed variables, the missingness is independent of the (unobservable) rating they would receive if they had participated, then you have an MAR situation which can be remedied by using multiple imputation. Depending on what is being assessed and what other variables have been observed in the study, this may or may not be a plausible assumption. But if it isn't plausible, you are stuck with a non-response bias that you can neither estimate nor repair in this study design. You would have to re-run the study in a way that precludes student withdrawal.

    As an aside, in no case would I judge the effect of the assessors by examining how many of the 19 assessor indicator variables have statistically significant coefficients. If you are going to test the hypothesis that there is a non-zero assessor effect, that should be done by a joint significance test of all the assessor indicator variables. (Actually, I might use a different analysis altogether, but let's not go down that road at this point.)
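
    In Stata, with the hypothetical variable names from the first post, that joint test is a Wald test on the full set of assessor indicators:

    Code:
        regress n_answered i.assessor if joined == 1
        testparm i.assessor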

    • #3
      Thanks, Clyde.
      If we see a correlation between non-participation and ethnicity (from regressing non-participation on ethnicity and some other covariates), would you suggest conditioning on this covariate so that the second (conditional) outcome can be used?

      Another point related to non-participation concerns checking the p-value of the R-squared. Their procedure involves re-randomizing the assessors and recalculating the R-squared, about 10,000 times for each outcome, and then computing a p-value. I wonder whether their re-randomization of assessors can be purely random. Only the first outcome uses the full sample; the others are subject to non-participation ranging from 10-20%. That means for the other outcomes (such as the second outcome), it is not possible to randomly assign an equal number of students to all 20 assessors.
      My guess is that they carry out the re-randomization in a way that keeps each assessor's number of observations (for each outcome) unchanged. For example, for the second outcome, assessor 1 has 90 observations, assessor 2 has 86 observations, and so on. Is my guess correct? If not, could you offer any thoughts on this?

      • #4
        If we see a correlation between non-participation and ethnicity (from regressing non-participation on ethnicity and some other covariates), would you suggest conditioning on this covariate so that the second (conditional) outcome can be used?
        No, that's not what I meant.

        The important issue is whether the missingness is independent of what the unobserved missing value of the outcome (number of questions answered) would have been if you had been able to observe this. There is no data-based way to know if this is true, because that would require knowing the unobserved values, which is, by definition, impossible. So it is something that you might make assumptions about based on external knowledge.

        For example, suppose that the students who didn't do the assessment didn't do that because they got stuck in some traffic jam or transit system failure and couldn't get to the assessment on time. Traffic jams and transit system failures don't distinguish people who would have answered more questions from those who would have answered fewer. So it would be reasonable to believe that the non-participation in this situation was truly independent of the unobserved missing values of the outcome. And in this case, you could just use the regression based on the available existing data and its results would not be biased.

        The ethnicity being correlated with missingness doesn't really tell us much one way or another. Now, if ethnicity (of the assessor? of the student?) is a really strong predictor of the number of questions that the student would have answered, so strong that there are no other factors other than random noise, including factors you haven't observed and didn't even know existed, that would affect the outcome, then you could use a procedure called multiple imputation to "fill in" what the unobserved outcomes from the non-participating students might have been, and do several regressions using these imputed values, and then combine the results of those regressions to come up with unbiased estimates of the assessor effects.

        But it strikes me as unlikely that ethnicity (of either the assessor or the student) would be a really strong predictor of what the outcomes of the non-participating students would have been. I suppose it's possible, but it's hard for me to imagine how that would actually work in the real world. But to give you a more positive example, let's change the problem a little. Suppose that all of the students each completed 10 assessments, but then 210 of them skipped the 11th assessment. Assuming that the assessments were of a similar degree of difficulty and content, I think it would be reasonable to think that knowing the student's outcome for all of those first 10 assessments you might be able to make good predictions about what the outcome of the 11th assessment might have been if the student had continued participating. So conditional on those first 10 assessment outcomes (and the ethnicity too if you think it is informative here), you could use multiple imputation, as described in the previous paragraph, to get unbiased estimates.

        So my point is that to make use of multiple imputation you want to have variables that you think are pretty good, collectively, at pinning down the values of the outcomes for those who stopped participating, so much so that the difference between those predicted values and the real ones is pretty purely random. Do you have any variables like that?
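
        Purely as a sketch of the mechanics, and assuming hypothetical variables score1-score10 (the first 10 assessment outcomes) and ethnicity as the imputation predictors, the Stata workflow might look like:

        Code:
            mi set wide
            mi register imputed n_answered
            mi impute regress n_answered score1-score10 i.ethnicity, add(20) rseed(12345)
            mi estimate: regress n_answered i.assessor

        mi estimate fits the regression in each of the 20 imputed data sets and combines the results by Rubin's rules.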

        Their procedure involves re-randomizing the assessors and recalculating the R-squared, about 10,000 times for each outcome, and then computing a p-value. Here, I wonder whether their re-randomization of assessors can be purely random.
        This description of what they did is too vague for me to know what they actually did. It sounds like it might be a permutation test, which is a reasonable thing to do. But I can't tell from what you have described whether it was really that, or that done incorrectly, or something else altogether.
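
        For what it's worth, if it was a permutation test of the overall assessor effect, a generic Stata version (placeholder names again) would run along these lines:

        Code:
            permute assessor r2 = e(r2), reps(10000) rseed(12345): ///
                regress n_answered i.assessor

        Each replication re-shuffles the observed assessor labels across the observations, refits the regression, and records the R-squared; the p-value is the share of replications whose R-squared is at least as large as the one actually observed.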

        That means for the other outcomes (such as the second outcome), it is not possible to randomly assign an equal number of students to all 20 assessors.
        My guess is that they carry out the re-randomization in a way that keeps each assessor's number of observations (for each outcome) unchanged. For example, for the second outcome, assessor 1 has 90 observations, assessor 2 has 86 observations, and so on. Is this correct? If not, could you offer any thoughts on this?
        You seem to be very concerned about equal numbers of students per assessor. Why? There are very few statistical analyses that might be done with this kind of data where that would matter at all. And most of those are either old procedures for which newer ones that handle unbalanced data exist, or somewhat obscure. The thing to worry about with missing data due to non-participation is bias, not whether it is equally distributed across assessors. It is true that the effect of an assessor with 90 observations will be ever so slightly more precisely estimated in the regression than that of an assessor with 86 observations, but the difference is tiny, and there are few situations where it would matter. The important issue is whether those estimates are biased as a result of the students' self-selection to continue participating or not. And that has nothing to do with whether non-participation is equally severe across the assessors.



        • #5
          Many thanks, Clyde.

          Yes, they conduct a permutation test.
          Regarding the shuffling of the 20 assessors among the 1,890 students for the second outcome, which of the following two approaches do you think is correct (see the sketch after this list)?
          1. Distribute assessors as evenly as possible, so that some assessors are assigned one more student than others.
          2. Shuffle the assessor assignments so that, in each shuffle, the number of students per assessor matches the original number of students per assessor for the second outcome.
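
          For reference, a plain re-shuffle of the observed labels behaves like option 2, because every draw reuses the same multiset of labels; a minimal Stata sketch (placeholder names again) is:

          Code:
              keep if joined == 1    // restrict to the 1,890 participating students
              * each draw re-shuffles the 1,890 observed labels, so every
              * assessor keeps its original number of students
              permute assessor r2 = e(r2), reps(10000): regress n_answered i.assessor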

          • #6
            Some additional comments. Because you are conditioning on the assessor dummies, the selection may depend on the assessor. In other words, the selection can be missing at random rather than MCAR. Second, I'm not sure imputing missing data on the outcome variable does much, because if the same predictors used in the imputation also appear in the model, then you effectively do nothing. In fact, if you use OLS imputation, you would literally get back the same estimates. Adding noise, as in MI, doesn't do anything useful except add noise.
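
            To see the OLS point concretely: outcomes imputed as fitted values lie exactly on the complete-case regression plane, so refitting the same model reproduces the complete-case coefficients. A toy check using Stata's shipped auto data, with missingness created artificially:

            Code:
                sysuse auto, clear
                replace price = . in 1/10              // fake some missing outcomes
                regress price mpg weight               // complete-case estimates
                predict yhat, xb
                replace price = yhat if missing(price) // OLS-impute the outcome
                regress price mpg weight               // same point estimates

            The point estimates are identical; only the standard errors change, and not in a way you should trust.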

            These selection issues can create conceptual problems. If you think there's a selection issue in this case, then you must be interested in the following thought experiment: how many questions would a student answer, whether or not the student joins (whatever "join" means here)? I suppose that could be interesting, but I'm not sure that thinking of it in two parts is all that bad: conditional on joining, how many questions does a student answer? In some way, if they refuse to join, haven't they basically said they're not answering any questions? Generally, plugging in zero for missing outcomes is a very bad idea. I'd need to know more about what it means to "join" and the nature of "answering questions."

            • #7
              Many thanks, Jeff.

              Not "joining" in this setting covers two cases. In the first, the student refused to answer even though the assessor had met the student. In the second, the assessor was not able to meet the student (probably because the student had moved to a new address).
              There are two types of "answering questions". The first is a simple count of how many questions a student answered, irrespective of accuracy (some students refused to answer some questions). The second is a count of how many questions the student answered correctly.

              I would love to hear more from you on this point.
