Kruskal–Wallis one-way analysis of variance

Andreas Head

Join Date: Jun 2014

Posts: 60
#1

15 Oct 2015, 05:04

Hello everyone,
I have a question about group differences in my dataset. It contains ordinal, dummy, and interval variables from 10 different communities. I want to run multiple regressions on the overall sample. However, I also want to check whether some of the central constructs of the analysis vary between the communities. Since comparing the 10 communities with each other descriptively is a lot of work, I'd like to run a test that indicates whether the variance of a construct can be partially explained by group differences (i.e. belonging to the different communities). I gathered from the literature that this is usually done via one-way ANOVA. As most of my data are non-normally distributed, I was wondering whether the Kruskal–Wallis one-way analysis of variance would be the right test for me?
I used the following command:
kwallis var, by(communities)

Can anyone tell me whether I am on the right track?

Many thanks in advance!!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

15 Oct 2015, 05:15

Andreas:
you may want to consider using interaction between categorical and continuous predictors in multiple regression (please, see -fvvarlist-).
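
As a rough sketch of what such an interaction looks like in factor-variable notation (not Carlo's own code): the example below uses the nlsw88 dataset that ships with Stata, with wage, tenure and race standing in for the thread's outcome, a continuous predictor and the community variable. Those variable choices are assumptions for illustration only.

```stata
* Stand-in data: wage, tenure and race are placeholders for the
* thread's outcome, continuous predictor and community variable.
sysuse nlsw88, clear

* ## expands to both main effects plus their interaction
regress wage c.tenure##i.race

* Joint Wald test: does the tenure slope differ across the groups?
testparm i.race#c.tenure
```

The `c.` and `i.` prefixes tell Stata which variable is continuous and which is categorical; see -help fvvarlist- for the full notation.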

Kind regards,
Carlo
(Stata 19.0)
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35435
#3

15 Oct 2015, 05:57

Non-normal distribution of predictors is no barrier to multiple regression. If it were, indicator variables could hardly be used legitimately! So, using Kruskal-Wallis here is just a diversion that may tell you something about your data but otherwise is of very limited relevance to a modelling goal.
Comment
Andreas Head

Join Date: Jun 2014

Posts: 60
#4

15 Oct 2015, 07:12

Thank you Carlo and Nick for your replies.

Indeed, I just want to see for instance whether the variable income significantly varies between the 10 communities before I enter income into the multiple regression models. Is the Kruskal Wallis test as indicated above an appropriate test to do so?
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#5

15 Oct 2015, 07:30

Andreas:
anova is (quite) robust to non-normality but, all in all, it is often just another way of spelling linear regression, without the adjustment for other predictors that may well affect (condition) the difference in mean income you're interested in.
As you are surely aware, you can measure what you're after with a simple linear regression instead of -anova-:

Code:

regress income i.country
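
To illustrate the equivalence mentioned above, here is a minimal sketch on stand-in data (nlsw88, shipped with Stata), where wage plays the role of income and industry the role of the grouping variable. Both names are assumptions, not the poster's variables. When the group indicators are the only predictors, the joint test of the indicators should match the one-way ANOVA F test.

```stata
* Stand-in data: wage ~ income, industry ~ community/country
sysuse nlsw88, clear

* Regression on group indicators (lowest code is the base category)
regress wage i.industry

* Joint test that all group coefficients are zero
testparm i.industry

* Classical one-way ANOVA on the same variables, for comparison
oneway wage industry
```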

Kind regards,
Carlo
(Stata 19.0)
Comment
Andreas Head

Join Date: Jun 2014

Posts: 60
#6

15 Oct 2015, 07:39

Thanks Carlo, you have a good point. However, I am concerned about the fact that most of my variables are non-normally distributed and would fail the KS-Test (please note that the sample size is just N=220). Therefore I am looking for non-parametric test solutions.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#7

15 Oct 2015, 07:48

Andreas:
the problem with KW is that you cannot (easily) perform multiple comparisons.
However, if you're interested in the overall comparison among countries only, it looks fine.
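
A hedged sketch of the overall test and one manual pairwise follow-up, again on the nlsw88 stand-in data (wage and race are placeholders, and the group codes 1 and 2 are assumed to exist in that dataset). This is one simple way to approximate multiple comparisons, not an official post hoc procedure for -kwallis-.

```stata
* Stand-in data: wage ~ income, race ~ grouping variable
sysuse nlsw88, clear

* Overall Kruskal-Wallis test across all groups
kwallis wage, by(race)

* One pairwise rank-sum test (groups coded 1 and 2, assumed codes).
* With 3 groups there are 3 such pairs, so a Bonferroni-style
* comparison would judge each p-value against 0.05/3.
ranksum wage if inlist(race, 1, 2), by(race)
```

With 10 groups there are 45 pairs, which is why the pairwise route becomes laborious and conservative.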

Kind regards,
Carlo
(Stata 19.0)
Comment
Andreas Head

Join Date: Jun 2014

Posts: 60
#8

15 Oct 2015, 08:59

Thanks, that helped. I tried your suggestion from above. Instead of countries, I used villages.
So here I compare how much variance of the DV (environmental concern on a 5-point Likert scale) is explained by the group differences. But how do I interpret the coefficients? "Village 2" loads significantly on the DV. Would it be correct to interpret that people from village 2 have significantly higher environmental concerns than people from other villages within the sample?

Attached Files
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#9

15 Oct 2015, 09:19

Andreas:
assuming that in your research field treating a Likert scale as an interval variable is OK (as many consider acceptable), I would say that inhabitants of village_2 show statistically significantly different concern about environmental issues vs village_1. Whether their concern is higher or lower than in village_1 (the reference category, which is embedded in the constant) depends on the way the Likert scale is oriented (1 = lowest concern, or the other way round).
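
The role of the reference category can be seen in a small sketch on stand-in data (nlsw88; wage and race are placeholders for the DV and the village variable). Each coefficient is a difference from the base group, so changing the base changes what every coefficient means.

```stata
* Stand-in data: wage ~ DV, race ~ village
sysuse nlsw88, clear

* Default: the lowest code is the base; coefficients are
* differences in mean wage relative to that group.
regress wage i.race

* Same model with level 2 as the base category instead
regress wage ib2.race

* Predicted mean of the DV in each group
margins race
```

A significant coefficient therefore says that one group differs from the reference group, not from all other groups.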

Kind regards,
Carlo
(Stata 19.0)
1 like
Comment
Andreas Head

Join Date: Jun 2014

Posts: 60
#10

15 Oct 2015, 09:50

That is great, Carlo, thank you for your advice. I normally avoid treating Likert items as interval-scaled, but I wanted to understand how your suggestion works.

Am I right assuming that the same procedure can be applied within an ordered logit model?
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#11

15 Oct 2015, 11:27

Andreas:
I would take a look at -help ologit-.
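
For orientation, a minimal sketch of factor variables in an ordered logit model. The 5-level outcome here is artificial: it is built from wage quintiles in the nlsw88 stand-in data purely to have an ordinal variable to work with, and race stands in for the community variable.

```stata
* Stand-in data with an artificial 5-level ordinal outcome
sysuse nlsw88, clear
xtile wage5 = wage, nquantiles(5)

* Ordered logit with group indicators (lowest code is the base)
ologit wage5 i.race

* Joint test that the group coefficients are all zero
testparm i.race

* Same model reported as odds ratios
ologit wage5 i.race, or
```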

Kind regards,
Carlo
(Stata 19.0)
1 like
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#12

15 Oct 2015, 13:59

Nick and Carlo have already given full advice and underlined the main issues. That said, given that you have 220 observations divided among 10 different communities, perhaps neither a one-way ANOVA nor the Kruskal-Wallis test would give you what you wish, i.e., "to check whether some of the central constructs of the analysis vary between the communities". What is more, I fear that, should you get a "significant" p-value, it would only point to a difference between at least one community and the others. Not much information, though. Performing post hoc comparisons would eventually take its toll due to the familywise error rate, let alone the issue of relying on an unadjusted analysis (I mean, without the covariates) and employing a Likert scale as an interval variable. That being the case, I wonder whether structural equation modeling - sem - wouldn't meet your needs...
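
To make the familywise-error point concrete, here is a hedged sketch of covariate-adjusted pairwise comparisons with a Bonferroni correction, using the nlsw88 stand-in data. The variables wage, race, tenure and collgrad are placeholders chosen for illustration, not the poster's model.

```stata
* Stand-in data: race ~ community; tenure and collgrad ~ covariates
sysuse nlsw88, clear

* Group indicators plus covariates
regress wage i.race c.tenure i.collgrad

* All pairwise group differences, Bonferroni-adjusted
pwcompare race, mcompare(bonferroni) effects
```

With 10 communities the adjustment covers 45 comparisons, so with about 22 observations per group few differences are likely to survive it.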

Best,
Marcos

Comment
Andreas Head

Join Date: Jun 2014

Posts: 60
#13

15 Oct 2015, 20:51

Thanks for your advice, Marcos. I am not very familiar with the application of SEM, but I will look into it.
Also thank you Carlo for all the time and thoughts that you invested here! Your help is very much appreciated.

I applied ologit with factor variables and the results are almost the same as for a linear model.
Comment
