  • Pattern in residuals of OLS - how much does this matter?

    Hello, I am trying to analyze data from a survey of 77 patients with vitiligo.

    I have compared baseline characteristics using Fisher's Exact test and the Kruskal-Wallis test and found that dlqi is higher for patients who have not undergone depigmentation therapy (depigmented=0). I wanted to use OLS to see if the relationship between "dlqi" (continuous) and "depigmented" (binary) persisted when controlling for sex, race, and gender. Data are below:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float dlqi byte(depigmented raceconsol gender age percent)
     7 1 2 2 5 3
     5 0 2 2 5 3
     7 0 3 1 5 3
     0 0 1 2 4 3
     1 0 1 2 5 5
     1 0 1 2 6 2
     2 0 1 2 4 2
     0 0 1 1 5 5
     3 0 1 2 6 3
     3 0 1 2 4 4
     0 1 1 2 7 4
     0 1 1 2 6 5
     0 1 1 2 7 4
     0 1 1 2 4 4
     0 1 1 2 4 3
     0 1 1 2 4 4
     0 1 1 2 5 3
     0 1 1 2 7 5
     5 0 1 2 3 3
     5 0 1 2 7 4
     5 0 1 2 7 4
     5 0 1 2 3 1
     5 0 1 2 6 6
     0 1 3 2 6 3
     0 1 3 2 5 4
     0 1 3 2 2 4
     0 1 3 2 4 3
     3 0 1 1 3 1
     2 1 1 2 7 4
     2 1 1 2 4 4
     2 1 1 2 6 4
     0 1 1 1 4 5
     0 1 1 1 4 5
     0 1 1 1 6 4
     3 1 1 2 7 3
     3 1 1 2 4 4
    17 0 3 2 7 5
     2 1 3 2 1 1
     8 0 1 2 6 4
     8 0 1 2 7 4
    17 0 2 2 5 4
     3 1 3 2 1 1
     3 1 3 2 4 3
     5 1 1 2 6 5
    18 0 2 2 5 4
     6 1 1 2 6 4
     3 1 1 1 5 3
     3 1 1 1 6 4
     3 1 1 1 6 3
    20 0 3 2 6 4
     4 1 1 1 5 2
     8 1 1 2 5 4
     8 1 1 2 6 3
    13 0 1 2 6 3
    13 0 1 2 6 3
     9 1 1 2 5 4
    26 1 2 2 6 4
     8 1 1 1 5 2
    12 1 1 2 4 4
    11 1 3 2 4 4
    11 1 3 2 7 2
    17 0 1 2 4 3
    17 0 1 2 6 4
    17 0 1 2 5 4
    13 1 1 2 6 4
    13 1 1 2 5 4
    23 0 3 1 5 3
    29 1 2 2 5 3
    19 0 1 2 3 3
    19 0 1 2 6 3
    30 0 2 2 3 1
    25 0 1 2 6 3
     0 1 1 . 6 4
     1 0 3 . 5 3
    13 0 1 . 7 6
     1 0 1 . 7 4
     1 0 1 . 7 4
    end
    label values depigmented depigmented
    label def depigmented 0 "No", modify
    label def depigmented 1 "Yes", modify
    label values gender gender_
    label def gender_ 1 "Male", modify
    label def gender_ 2 "Female", modify
    label values age age_
    label def age_ 1 "< 20", modify
    label def age_ 2 "21-29", modify
    label def age_ 3 "30-39", modify
    label def age_ 4 "40-49", modify
    label def age_ 5 "50-59", modify
    label def age_ 6 "60-69", modify
    label def age_ 7 "70+", modify
    label values percent percent_
    label def percent_ 1 "< 5%", modify
    label def percent_ 2 "5-10%", modify
    label def percent_ 3 "11-30%", modify
    label def percent_ 4 "51-70%", modify
    label def percent_ 5 "71-90%", modify
    label def percent_ 6 "> 90%", modify
    I have run a few variations of this model using the "regress" command, and it seems like the best fitting model includes an interaction between "raceconsol" and "depigmented", which makes theoretical sense.

    -regress dlqi depigmented##raceconsol gender-

    Problem:

    Regardless of the variables included, the residuals show a clear pattern, and regression diagnostics (estat hettest, linktest, etc.) tell me that I have problematic heteroskedasticity, misspecification, etc. All of the transformations I've tried on dlqi (squaring, square root, log/ln transformations) result in worse model fit. I've also tried including the categorical variables age and percent, but these have levels with very few observations, so I dropped them due to concern about overfitting given the small sample size; they don't improve model fit anyway. I've looked at the outlying observations, and none are obviously flawed, so I can't justify dropping them.

    Should I just use robust standard errors and call it a day? Should I bother modeling this data at all? Would something like SEM (which I don't understand at all, to be honest) be more appropriate?

    Thank you!

    -Ashley

  • #2
    I wouldn't call dlqi continuous as it looks thoroughly discrete to me. I assume zero is the lowest possible value but is there a maximum in principle?

    It seems to me that you need something more like Poisson regression to match a non-negative response and some hints of heteroscedasticity.

    Robust standard errors can't fix an imperfect model specification; they just are often more honest than the default. But the more important question is whether Xb is the right functional form.
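
    As a minimal sketch of the Poisson suggestion above, assuming the variable names from the -dataex- listing in #1 (the covariate set shown here is an assumption, not a recommended specification):

    ```stata
    * Sketch only: Poisson regression with robust (sandwich) standard errors.
    * Variable names follow the -dataex- example in #1; the covariates are an assumption.
    poisson dlqi i.depigmented i.raceconsol i.gender, vce(robust)

    * Replay the same results as incidence-rate ratios (exponentiated coefficients)
    poisson, irr
    ```

    The robust option here only changes the standard errors; it does not address whether the log-linear mean function is right for the data.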



    • #3
      Should I just use robust standard errors and call it a day?
      I would.

      Added: Crossed with #2. Nick makes a good point. -dotplot dlqi, over(depigmented)- does show the distribution of dlqi to be pretty skew, and using Poisson looks like it will be a better model for this data.
      Last edited by Clyde Schechter; 09 Jan 2022, 12:57.



      • #4
        Thank you both for your responses! DLQI is a 10-question questionnaire scored 0-30; it measures the degree to which a dermatologic disease affects quality of life, so not exactly a count variable.

        Based on your advice, I tried modeling the data with a poisson regression using both -poisson- and -glm- using the -family(poisson)- option.

        -poisson dlqi depigmented##raceconsol gender-
        Chi2 = 230.45
        Pseudo-r2 of 0.28
        -estat gof- was significant (Chi2 = 374), with no change in results using -robust- or -difficult-. (-nbreg- gives me a pseudo r^2 of 0.05, so this doesn't seem to be an improvement.)

        -glm dlqi depigmented##raceconsol gender, family(poisson)-
        Log likelihood = -291.2349076
        AIC 8.284303
        BIC 96.18354

        Then I ran:
        -linktest, family(poisson)- which gave me a hatsq of 0.668.

        This is the only promising postestimation result I've gotten so far... Having some trouble figuring out what postestimation commands to run next.

        Before I get too far into the weeds, does it sound like I'm on the right track? I do feel like -regress- with -robust-, if not too problematic, is much easier for me/others to interpret, but if it's totally inappropriate for my data then I won't use it.

        Thanks so much!

        -Ashley



        • #5
          The information that there is an upper bound changes my recommendation. The upper bound is attained, which is important. I would start with

          Code:
          . gen dlqi_scaled = dlqi/30

          . glm dlqi_scaled depigmented##raceconsol gender, link(logit) f(binomial) vce(robust)
          There is your personal context of what you're expected to know about, but I guess most researchers in this territory would regard OLS regress as a crude model contradicted by what you know about the outcome variable.
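
          One way to read a fractional logit like this on the original 0-30 scale is through -margins-. The following is a sketch under assumptions: the variable names come from #1, and dlqi_frac is a hypothetical name chosen so that it does not clash with dlqi_scaled.

          ```stata
          * Sketch only: fractional logit, then predictions on an interpretable scale.
          * dlqi_frac is a hypothetical variable name; the outcome is the score as a fraction of 30.
          generate double dlqi_frac = dlqi/30
          glm dlqi_frac i.depigmented##i.raceconsol i.gender, link(logit) family(binomial) vce(robust)

          * Average predicted fraction of the maximum score, by depigmentation status
          margins depigmented

          * The same predictions rescaled to the 0-30 DLQI range
          margins depigmented, expression(30*predict(mu))
          ```

          The rescaled margins can be set beside the group means from a plain regression, which may make the logit model easier to explain to readers used to OLS.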



          • #6
            Originally posted by Ashley Riddle
            I wanted to use OLS to see if the relationship between "dlqi" (continuous) and "depigmented" (binary) persisted when controlling for sex, race, and gender. . . . I have run a few variations of this model using the "regress" command, and it seems like the best fitting model includes an interaction between "raceconsol" and "depigmented", which makes theoretical sense.
            To control for a covariate doesn't imply that an interaction term is required. And with only 72 complete cases, it's probably questionable to (1) include so many predictors in the model (from the expansion of the categorical-by-categorical interaction terms) and (2) go fishing for best fit. I recommend just sticking with your original research question, that is, does the association between dlqi and depigmented persist when adjusting for sex, race and age group (I think you meant this latter and not gender).

            Regardless of the variables included, the residuals show a clear pattern, and regression diagnostics (estat hettest, linktest, etc) tell me that I have problematic heteroskedasticity, misspecification, etc. . . .Should I just use robust standard errors and call it a day?
            Looking at the fitted linear regression model (see below) with the predictor of main interest (prior depigmentation treatment) and adjustment for the three other predictors as covariates, it seems that you're okay with keeping things simple. You mention that the outcome variable is the score on a ten-item quality-of-life (QOL) questionnaire. These QOL sumscores are often suited to straightforward linear modeling. You have a lot of zeroes, but looking at the residuals (see the residuals-versus-fitted, Q-N and P-N plots below), there's nothing outrageous, perhaps one moderate outlier in the lower right-hand section of the residuals-versus-fitted plot that might warrant looking further into, but that's about it. The test-based diagnostics that you've done might be more alarming than constructively informative, but if you're still worried, then you can try fitting the model to the ranks (-help egen-) or using a permutation approach (-help permute-). You can go to a generalized linear model like Nick and Clyde suggest above, but I would be surprised if it leads to a radically different answer to your research question.
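
            The two fallbacks mentioned above (ranks and permutation) could look roughly like the following. This is a sketch only: it uses the original variable names from #1 rather than the renamed ones in the log below, and dlqi_rank, the seed, and the number of replications are arbitrary choices for illustration.

            ```stata
            * Sketch only: two robustness checks for the linear model.
            * dlqi_rank, reps(1000) and the seed are illustrative assumptions.

            * 1. Regress the ranks of the outcome (ties get average ranks) on the same predictors
            egen double dlqi_rank = rank(dlqi)
            regress dlqi_rank i.depigmented i.raceconsol i.gender i.age

            * 2. Permutation test: shuffle depigmented and re-estimate its coefficient each time
            *    (the /// line continuation works in a do-file, not at the interactive prompt)
            permute depigmented _b[1.depigmented], reps(1000) seed(20220109) nodots: ///
                regress dlqi i.depigmented i.raceconsol i.gender i.age
            ```

            Note that this sketch drops the observations with missing gender, unlike the log below, which keeps them in a third category.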

            And I'd say that your research question is answered in the affirmative. (Begin at the "Begin here" comment in the output below. Above that in the output, I've renamed the variables for brevity and included a third category for sex in order to avoid omitting missing-value observations: in light of the relatively small sample size, I felt this latter tack is warranted, but if you or your audience cannot countenance it, then just omit those observations.)

            .
            . version 17.0

            .
            . clear *

            .
            . seedem
            set seed 1376942426

            .
            . quietly input float dlqi byte(depigmented raceconsol gender age percent)

            .
            . compress
              variable dlqi was float now byte
              (231 bytes saved)

            .
            . quietly {

            .
            . rename dlqi sco

            . label variable sco " Score on Dermatology Life Quality Index questionnaire"

            .
            . rename raceconsol rac

            . label variable rac "Consolidated Race Categories"

            .
            . generate byte dpg = depigmented == "Yes":depigmented if !missing(depigmented)

            . label variable dpg "Prior Depigmentation Treatment"

            . label define NY 0 N 1 Y

            . label values dpg NY

            .
            . generate byte sex = gender == "Male":gender_ if !missing(gender)
            (5 missing values generated)

            . summarize sex, meanonly

            . replace sex = r(max) + 1 if missing(sex)
            (5 real changes made)

            . label variable sex Sex

            . summarize sex, meanonly

            . label define Sexes 0 M 1 F `r(max)' U

            . label values sex Sexes

            .
            . generate byte agp = age if !missing(age)

            . label variable agp "Age Group"

            . label copy age_ AgeGroups

            . label values agp AgeGroups

            .
            . drop depigmented gender age percent

            . label drop depigmented gender_ age_ percent_

            .
            . *
            . * Begin here
            . *
            . preserve

            . contract sco rac dpg sex agp, freq(count)

            . list if count > 1, noobs separator(0)

              +---------------------------------------+
              | sco   rac   dpg   sex     agp   count |
              |---------------------------------------|
              |   0     1     Y     M   40-49       3 |
              |   0     1     Y     M     70+       3 |
              |   0     1     Y     F   40-49       2 |
              |   1     1     N     U     70+       2 |
              |   3     1     Y     F   60-69       2 |
              |   5     1     N     M   30-39       2 |
              |   5     1     N     M     70+       2 |
              |  13     1     N     M   60-69       2 |
              +---------------------------------------+

            . restore

            .
            . regress sco i.(dpg rac sex agp)

                  Source |       SS           df       MS      Number of obs   =        77
            -------------+----------------------------------   F(11, 65)       =      4.09
                   Model |  1906.50491        11  173.318628   Prob > F        =    0.0001
                Residual |  2755.85873        65  42.3978265   R-squared       =    0.4089
            -------------+----------------------------------   Adj R-squared   =    0.3089
                   Total |  4662.36364        76    61.34689   Root MSE        =    6.5114

            ------------------------------------------------------------------------------
                     sco | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     dpg |
                      Y  |  -4.228997   1.650077    -2.56   0.013    -7.524429   -.9335647
                         |
                     rac |
                      2  |   11.84553   2.951428     4.01   0.000     5.951117    17.73994
                      3  |   3.604327   2.194306     1.64   0.105    -.7780055    7.986659
                         |
                     sex |
                      F  |  -1.896902    2.25576    -0.84   0.403    -6.401965    2.608161
                      U  |  -4.995689    3.24005    -1.54   0.128    -11.46652    1.475136
                         |
                     agp |
                  21-29  |       -2.5   7.974756    -0.31   0.755    -18.42669    13.42669
                  30-39  |   7.285605   6.145394     1.19   0.240    -4.987595     19.5588
                  40-49  |   3.041982   5.236513     0.58   0.563    -7.416055    13.50002
                  50-59  |   5.034389   5.307474     0.95   0.346    -5.565368    15.63415
                  60-69  |   7.438065    5.27274     1.41   0.163    -3.092324    17.96845
                    70+  |   4.502437   5.400207     0.83   0.407    -6.282522     15.2874
                         |
                   _cons |    3.12467   5.374544     0.58   0.563    -7.609036    13.85838
            ------------------------------------------------------------------------------

            .
            . local js = floor(10000 * runiform())

            . rvfplot , mcolor(black) msize(vsmall) jitter(2) jitterseed(`js') ///
            >         ylabel( , angle(horizontal) nogrid) yline(0, lpattern(dash) lcolor(black))

            . quietly graph export rvf.png, replace

            .
            .
            . predict double res, residuals

            .
            . local js = floor(10000 * runiform())

            . pnorm res, mcolor(black) msize(vsmall) jitter(2) jitterseed(`js') ///
            >         rlopts(lcolor(black)) ylabel( , angle(horizontal) nogrid)

            . quietly graph export pnp.png, replace

            .
            . local js = floor(10000 * runiform())

            . qnorm res, mcolor(black) msize(vsmall) jitter(2) jitterseed(`js') ///
            >         rlopts(lcolor(black)) ylabel( , angle(horizontal) nogrid)

            . quietly graph export qnp.png, replace

            .
            . // The outlier is the most-negative residual
            . predict double sco_hat, xb

            . format sco_hat res %03.1f

            . sort res

            . list dpg rac sex agp sco sco_hat res in 1, noobs

              +-------------------------------------------------+
              | dpg   rac   sex     agp   sco   sco_hat     res |
              |-------------------------------------------------|
              |   N     2     M   50-59     5      20.0   -15.0 |
              +-------------------------------------------------+

            .
            . exit

            end of do-file


            .


            [Figure: rvf.png, residual-versus-fitted plot]

            [Figure: pnp.png, P-N (standardized normal probability) plot of residuals]

            [Figure: qnp.png, Q-N (normal quantile) plot of residuals]



            • #7
              The very different-seeming advice may be less different than it appears. In general, different models may appeal for slightly different reasons. One specific appeal of a logit model is that predictions within bounds are assured. But if the mean function keeps well inside the allowed range, the curvature of the logit over the observed range and the straightness of the plain regression can seem close, and regression predictions will behave as a matter of fact rather than of principle. Joseph Coveney explains this nicely from his point of view, which doesn't, I think, strongly contradict any other views expressed.

              I add two further comments. One is that quite different models can be compared in terms of their predicted values as well as the detail of coefficients. That boils down to simple scatter plots, say predicted from plain regression versus predicted from a generalized linear model.
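
              A sketch of that comparison, assuming the variable names from #1 and a main-effects specification chosen only for illustration (fit_ols, fit_glm and dlqi_prop are hypothetical names):

              ```stata
              * Sketch only: compare fitted values from OLS and a fractional logit.
              * The covariate set and the new variable names are assumptions.
              regress dlqi i.depigmented i.raceconsol i.gender
              predict double fit_ols, xb

              generate double dlqi_prop = dlqi/30
              glm dlqi_prop i.depigmented i.raceconsol i.gender, link(logit) family(binomial) vce(robust)
              predict double fit_glm, mu
              replace fit_glm = 30*fit_glm    // back to the 0-30 DLQI scale

              * Points close to the line y = x mean the two models predict much the same
              twoway (scatter fit_glm fit_ols) (function y = x, range(fit_ols)), ///
                  xtitle("Fitted DLQI, OLS") ytitle("Fitted DLQI, fractional logit") legend(off)
              ```

              With only categorical predictors the plot will show few distinct points, one per observed combination of predictor values.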

              Another is to counsel caution: If you kept to the predictors you were focusing on, only 9 distinct combinations are evident in the data. Using distinct from the Stata Journal, I get

              Code:
              . distinct depigmented  raceconsol gender, joint
              
              ----------------------------------
                         |     total   distinct
              -----------+----------------------
               (jointly) |        72          9
              ----------------------------------
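
              For readers without -distinct- installed, a similar count can be had with official commands. This is a sketch only; combo and n are hypothetical names, and -egen, group()- leaves out observations with a missing value on any of the listed variables unless told otherwise.

              ```stata
              * Sketch only: count joint distinct combinations with official commands.
              * combo and n are hypothetical names introduced for illustration.
              egen long combo = group(depigmented raceconsol gender)
              summarize combo, meanonly
              display "distinct combinations: " r(max)

              * Show how many observations fall in each combination
              preserve
              contract depigmented raceconsol gender if !missing(combo), freq(n)
              list, noobs separator(0)
              restore
              ```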



              • #8
                Thank you both! All things taken into consideration, I'll use -regress- with the -robust- option for simplicity since it seems to be an acceptable way of modeling this type of data. I have added age back (as an ordinal variable, instead of categorical); this increases the number of distinct combinations to 27. The plots look a little more wonky than Joseph's, but I think close enough?

                Thanks again!

                [Figure: Screen Shot 2022-01-10 at 8.30.20 PM.png]

                [Figure: rvf_Depigmentation.png]

                [Figure: pnorm_Depigmentation.png]

                [Figure: qnorm_Depigmentation.png]



                • #9
                  Consider the plot of residual versus fitted. The definition of residual is naturally

                  observed MINUS fitted

                  so any particular observed value defines a line on that plot. For example, all observed values of 0 lie on the line

                  residual = 0 MINUS fitted

                  = MINUS fitted

                  which you should trace mentally on the graph. The constraint that the observed value lies between 0 and 30 limits the allowed region on that plot.

                  Depending on what this is, ranging from an assignment to something you intend to publish if you can, you'll probably be well advised to discuss that.

                  I usually draw the line at thinking regression acceptable if any fitted value is impossible (e.g. negative), but your model passes that criterion.
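
                  The bounding lines and the check on fitted values can be drawn directly. This is a sketch under assumptions: variable names are from #1, the specification is illustrative only, and fit and res are hypothetical names.

                  ```stata
                  * Sketch only: show the limits on residuals implied by 0 <= dlqi <= 30.
                  * The model specification and the names fit and res are assumptions.
                  regress dlqi i.depigmented i.raceconsol i.gender, vce(robust)
                  predict double fit, xb
                  predict double res, residuals

                  * Any fitted value below 0 or above 30 would be impossible for this score
                  summarize fit

                  * residual = observed MINUS fitted: observed 0 gives the lower line,
                  * observed 30 gives the upper line
                  twoway (scatter res fit)                   ///
                         (function y = -x, range(fit))       ///
                         (function y = 30 - x, range(fit)),  ///
                         xtitle("Fitted values") ytitle("Residuals") legend(off)
                  ```

                  Every point must fall between the two lines, which is why the residual plot for a bounded score cannot look like a structureless cloud.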

                  See again #2 on robust standard errors if you need.

                  Summary: Your model seems defensible to me. I would prefer a logit model, but it's your project.

                  Detail. I picked up from John Tukey a habit of sometimes writing PLUS, MINUS, etc., in word equations. It has a small advantage in particular over small minus signs that might not be noticed so easily. Sorry if it seems like SHOUTING, but the emphasis does not, I trust, do much harm.



                  • #10
                    Thank you! Really helpful, as always. Planning on submitting this for publication, so I'll be sure to discuss that as you suggested. As a side note, sorry my last post had huge duplicates of some of the images - not intentional, my mistake.

                    Anyway, your help is much appreciated!
