  • Kruskal–Wallis one-way analysis of variance

    Hello everyone,
    I have a question about group differences in my data set. It contains variables (ordinal, dummy, and interval) from 10 different communities. I want to run multiple regressions on the overall sample, but I also want to check whether some of the central constructs of the analysis vary between the communities. Since comparing the 10 communities with each other descriptively is a lot of work, I'd like to run a test that indicates whether the variance of a construct can be partially explained by group differences (i.e. belonging to the different communities). From the literature I gather that this is usually done via one-way ANOVA. As most of my data are non-normally distributed, I was wondering whether the Kruskal–Wallis one-way analysis of variance would be the right test for me.
    I used the following command:
    kwallis var, by(communities)

    Can anyone tell me whether I am on the right track?

    Many thanks in advance!!

  • #2
    Andreas:
    you may want to consider using an interaction between categorical and continuous predictors in multiple regression (please see -fvvarlist-).
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Non-normal distributions of predictors are no barrier to multiple regression. If they were, indicator variables could hardly be used legitimately! So, using Kruskal-Wallis here is just a diversion that may tell you something about your data but otherwise is of very limited relevance to a modelling goal.

      • #4
        Thank you Carlo and Nick for your replies.

        Indeed, I just want to see, for instance, whether the variable income varies significantly between the 10 communities before I enter income into the multiple regression models. Is the Kruskal-Wallis test as indicated above an appropriate test for this?

        • #5
          Andreas:
          ANOVA is (quite) robust to non-normality but, all in all, it is often just another name for linear regression, without the adjustment for other predictors that may well affect (condition) the difference in mean income you're interested in.
          As you are surely aware, you can measure what you're after with a simple linear regression instead of -anova-:
          Code:
          regress income i.country
          Kind regards,
          Carlo
          (Stata 19.0)
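
          A minimal sketch of the overall test that such a regression provides, using Stata's shipped nlsw88 data as a stand-in (wage and industry are placeholders for income and the community identifier, not the poster's variables). With the group indicators as the only predictors, -testparm- after -regress- tests them jointly and reproduces the one-way ANOVA F test; -kwallis- is the rank-based counterpart:
          Code:
          sysuse nlsw88, clear
          * group indicators only: the overall F test equals the one-way ANOVA F
          regress wage i.industry
          * joint Wald test that all industry coefficients are zero
          testparm i.industry
          * the same F statistic from the one-way ANOVA command
          oneway wage industry
          * rank-based alternative that does not assume normality
          kwallis wage, by(industry)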

          • #6
            Thanks Carlo, you have a good point. However, I am concerned about the fact that most of my variables are non-normally distributed and would fail the KS-Test (please note that the sample size is just N=220). Therefore I am looking for non-parametric test solutions.

            • #7
              Andreas:
              the problem with KW is that you cannot (easily) perform multiple comparisons.
              However, if you're interested in the overall comparison among countries only, it looks fine.
              Kind regards,
              Carlo
              (Stata 19.0)
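
              A hedged sketch of one way to follow up a Kruskal-Wallis test with pairwise comparisons: Dunn's test, via the community-contributed -dunntest- command from SSC. It is not part of official Stata, and the option names below are assumed from its help file, so please check -help dunntest- after installing. The nlsw88 data and the variables wage and race are stand-ins, not the poster's data:
              Code:
              sysuse nlsw88, clear
              * install once; community-contributed, not official Stata
              ssc install dunntest
              * overall rank-based test across groups
              kwallis wage, by(race)
              * Dunn's pairwise comparisons with a Bonferroni adjustment (assumed syntax)
              dunntest wage, by(race) ma(bonferroni)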

              • #8
                Thanks, that helped. I tried your suggestion from above. Instead of countries, I used villages.
                So here I compare how much variance of the DV (environmental concern on a 5-point Likert scale) is explained by the group differences. But how do I interpret the coefficients? "Village 2" loads significantly on the DV. Would it be correct to interpret this as people from village 2 having significantly higher environmental concern than people from the other villages in the sample?


                [Attached image: statalist reg villages.PNG]


                • #9
                  Andreas:
                  assuming that in your research field treating a Likert scale as an interval variable is OK (as many consider acceptable), I would say that inhabitants of village_2 show a statistically significant difference in concern about environmental issues vs village_1. Whether their concern is higher or lower than that of village_1 (the reference category, which is embedded in the constant) depends on the sign of the coefficient and on the way the Likert scale is oriented (1 = lowest concern, or the other way round).
                  Kind regards,
                  Carlo
                  (Stata 19.0)
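
                  A minimal sketch of how the reference category drives the interpretation, again with the shipped nlsw88 data as a stand-in (wage and race are placeholders for the DV and the village identifier). Each coefficient compares one group with the base level only, not with all other groups; -ib#.- changes the base, and -pwcompare- lists every pairwise difference with an adjustment for multiple comparisons:
                  Code:
                  sysuse nlsw88, clear
                  * default: each coefficient is a difference from the lowest level (the base)
                  regress wage i.race
                  * make level 2 the base, so coefficients compare the other groups with it
                  regress wage ib2.race
                  * all pairwise group differences, Bonferroni-adjusted
                  pwcompare race, effects mcompare(bonferroni)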

                  • #10
                    That is great, Carlo, thank you for your advice. I normally avoid treating Likert items as interval-scaled, but I wanted to understand how your suggestion works.

                    Am I right assuming that the same procedure can be applied within an ordered logit model?

                    • #11
                      Andreas:
                      I would take a look at -help ologit-.
                      Kind regards,
                      Carlo
                      (Stata 19.0)
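
                      A minimal sketch of the same factor-variable approach in an ordered logit, using the shipped auto data as a stand-in (rep78, a 1-5 repair record, plays the role of the Likert item and foreign the role of the village identifier; neither is the poster's variable). Coefficients are differences in log odds relative to the base group:
                      Code:
                      sysuse auto, clear
                      * ordinal outcome with a factor-variable group indicator
                      ologit rep78 i.foreign
                      * joint test of the group indicators (one here; nine with ten villages)
                      testparm i.foreign
                      * replay the results as odds ratios instead of log-odds coefficients
                      ologit, or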

                      • #12
                        Nick and Carlo have already given full advice and underlined the main issues. That said, given that you have 220 observations divided among 10 different communities, perhaps neither a one-way ANOVA nor the Kruskal-Wallis test would give you what you wish, i.e., "to check whether some of the central constructs of the analysis vary between the communities". What is more, I fear that, should you get a "significant" p-value, it would only indicate that at least one community differs from the others. That is not much information. Performing post hoc comparisons would eventually take its toll in familywise error, let alone the issues of relying on an unadjusted analysis (I mean, without the covariates) and of treating a Likert scale as an interval variable. That being the case, I wonder whether structural equation modeling (-sem-) would meet your needs...

                        Best,
                        Marcos

                        • #13
                          Thanks for your advice, Marcos. I am not very familiar with the application of SEM, but I will look into it.
                          Also thank you Carlo for all the time and thoughts that you invested here! Your help is very much appreciated.

                          I applied ologit with factor variables and the results are almost the same as for the linear model.
