Hi all,
I am comparing two variables, ILLWK (whether or not a person was ill in a reference week) and DAYSILL (how many days they were ill, if so), across 4 different occupations. ILLWK takes the value 1 if the person was ill in the week and 0 otherwise. DAYSILL can take any integer from 0 to 7. I have over 48,000 observations, as I am using quarterly data from 2012 to 2017. The majority of responses for both variables are 0, but around 1,000 observations report being ill, with DAYSILL taking a value between 1 and 7. I was planning on running an ANOVA, but my data are not normally distributed and the variances are not equal across the groups. As I understand it, this rules out the Kruskal-Wallis test and also the Welch test, leaving the Brown-Forsythe test. However, I have read that ANOVA is quite robust when you have thousands of observations, which leaves me in doubt about which test to run.
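To make the setup concrete, here is a rough sketch of how the data are laid out and how the per-occupation summaries (share ill, mean and variance of DAYSILL) could be computed. The occupation labels "A" to "D" and all values are made up for illustration, and `summarise` is just a hypothetical helper name; this is not my real data or output.

```python
from collections import defaultdict
from statistics import mean, variance

# Hypothetical toy records: (occupation, ILLWK, DAYSILL).
# These are invented values, not the real survey data.
records = [
    ("A", 0, 0), ("A", 0, 0), ("A", 1, 3), ("A", 0, 0),
    ("B", 0, 0), ("B", 1, 7), ("B", 0, 0), ("B", 1, 1),
    ("C", 0, 0), ("C", 0, 0), ("C", 0, 0), ("C", 1, 2),
    ("D", 0, 0), ("D", 0, 0), ("D", 0, 0), ("D", 0, 0),
]

def summarise(rows):
    """Group rows by occupation and return simple descriptive summaries."""
    groups = defaultdict(list)
    for occupation, illwk, daysill in rows:
        groups[occupation].append((illwk, daysill))

    summary = {}
    for occupation, values in groups.items():
        ill_flags = [v[0] for v in values]
        days = [v[1] for v in values]
        summary[occupation] = {
            "n": len(values),
            "prop_ill": mean(ill_flags),   # share with ILLWK == 1
            "mean_days": mean(days),       # mean of DAYSILL, zeros included
            "var_days": variance(days),    # sample variance of DAYSILL
        }
    return summary

summary = summarise(records)
for occupation, stats in sorted(summary.items()):
    print(occupation, stats)
```

Even in a toy example like this, the variance of DAYSILL differs a lot between groups because most values are 0, which is the same pattern I see in the real data.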
Thanks for your help