Post-hoc testing in Kruskal-Wallis using R
Exercise 10.2.3.
Statistics for Ecologists (Edition 2) Exercise 10.2.3
These notes are about post hoc analysis when you use the non-parametric Kruskal-Wallis test. They supplement Chapter 10 in relation to using R to calculate differences between >2 non-parametric samples.
Introduction
The Kruskal-Wallis test is a non-parametric test for differences between more than two samples. It is essentially an analogue of one-way ANOVA. There is no “standard” method for carrying out post hoc analysis for Kruskal-Wallis tests. These notes show you how you can use a modified form of the U-test to carry out post hoc analysis.
When you carry out a Kruskal-Wallis test you are looking at the ranks of the data and comparing them. If the ranks are sufficiently different between samples you may be able to determine that these differences are statistically significant.
However, the main analysis only tells you about the overall situation: the result tells you that “there are differences” but not where they lie. Look at the following graph, which shows three samples.
A Kruskal-Wallis test of these data gives a significant result (H = 6.54, p < 0.05) but does not give any information about the pair-by-pair comparisons. Looking at the graph you might suppose that the Upper and Mid samples are not significantly different, as their error bars (IQR) overlap considerably. The Lower sample appears to be different from the other two.
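The overall test can be run along the following lines. This is a minimal sketch that assumes the three samples are the Upper, Mid and Lower counts listed later in these notes as hog2; the data frame is rebuilt here so the block runs on its own.

```r
# Counts in recording layout (values copied from the hog2 listing in these notes)
hog2 <- data.frame(
  count = c(3, 4, 5, 9, 8, 10, 9,    # Upper
            4, 3, 7, 9, 11,          # Mid
            11, 12, 9, 10, 11),      # Lower
  site = factor(rep(c("Upper", "Mid", "Lower"), times = c(7, 5, 5)))
)

# Overall test: is there any difference among the three sites?
kw <- kruskal.test(count ~ site, data = hog2)

kw$statistic  # H, corrected for ties
kw$parameter  # degrees of freedom (number of groups minus 1)
kw$p.value    # overall p-value
```

With these values the statistic comes out at about 6.54, which agrees with the result quoted above. Nothing in the output identifies which pairs of samples differ.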
The way to determine these pairwise differences is with a post hoc test. You cannot simply carry out regular U-tests: with three samples you would need three of them, and running several tests “boosts” the chance of getting at least one significant result by chance alone, roughly in proportion to the number of tests (so by a factor of about 3 here). You could simply adjust the p-values (e.g. using a Bonferroni correction), but this is generally too conservative.
These notes show you how to carry out a modified version of the U-test as a post hoc tool. The approach is analogous to the Tukey HSD test you’d use with parametric data.
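For comparison, the simple Bonferroni adjustment mentioned above can be sketched with pairwise.wilcox.test() from base R. The data are assumed to be the hog2 counts listed later in these notes, and exact = FALSE is passed through to wilcox.test() only to avoid warnings about tied values.

```r
# Counts in recording layout (values copied from the hog2 listing in these notes)
hog2 <- data.frame(
  count = c(3, 4, 5, 9, 8, 10, 9,    # Upper
            4, 3, 7, 9, 11,          # Mid
            11, 12, 9, 10, 11),      # Lower
  site = factor(rep(c("Upper", "Mid", "Lower"), times = c(7, 5, 5)))
)

# All pairwise U-tests, with Bonferroni-adjusted p-values
pw <- pairwise.wilcox.test(hog2$count, hog2$site,
                           p.adjust.method = "bonferroni",
                           exact = FALSE)

pw$p.value  # lower-triangular matrix of adjusted p-values
```

The Bonferroni method multiplies each unadjusted p-value by the number of comparisons (three here), capped at 1, which is why it tends to be conservative.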
A modified U-test as a post-hoc tool
With a bit of tinkering you can modify the formula for the U-test to produce the following:
$$U_{crit} = \frac{n^2}{2} + Q \, n \sqrt{\frac{2n + 1}{24}}$$
A critical value for U in a post hoc test for Kruskal-Wallis. Q is the Studentized range and n the harmonic mean of sample sizes.
In the formula, n is the harmonic mean of the two sample sizes being compared, and Q is the value of the Studentized range for df = Inf and a number of groups equal to the original number of samples.
Critical values of Q, the Studentized range, at the 5% and 1% significance levels.
Number of groups | 5% | 1% |
2 | 2.772 | 3.643 |
3 | 3.314 | 4.120 |
4 | 3.633 | 4.403 |
5 | 3.858 | 4.603 |
6 | 4.030 | 4.757 |
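The tabulated values can be reproduced with qtukey(), the quantile function for the Studentized range in base R. This is a small sketch; the object name q_table is only illustrative.

```r
groups <- 2:6

# Critical Q for each number of groups, with infinite degrees of freedom
q_table <- data.frame(
  groups = groups,
  q_05 = sapply(groups, function(k) qtukey(0.95, nmeans = k, df = Inf)),
  q_01 = sapply(groups, function(k) qtukey(0.99, nmeans = k, df = Inf))
)

round(q_table, 3)
```

Using qtukey() directly also lets you look up values for more than six groups or for other significance levels.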
The formula calculates a critical U-value for the pairwise comparison. You simply carry out a regular U-test, then use the larger of the two U-values as the test statistic. If your value is equal to or larger than the critical value from the formula, then the pairwise comparison is statistically significant.
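The calculation of the critical value can be sketched as follows. The sketch assumes the critical value takes the form U = n^2/2 + Q * n * sqrt((2n + 1)/24), which reproduces the critical values quoted elsewhere in these notes. The function name crit_u is hypothetical and is not one of the custom functions described below.

```r
# Hypothetical helper: critical U for one pairwise post hoc comparison
crit_u <- function(n1, n2, groups, conf = 0.95) {
  n <- 2 * n1 * n2 / (n1 + n2)                  # harmonic mean of the sample sizes
  q <- qtukey(conf, nmeans = groups, df = Inf)  # critical Studentized range
  n^2 / 2 + q * n * sqrt((2 * n + 1) / 24)      # assumed form of the formula
}

# Samples of size 7 and 5 taken from an analysis with 3 groups
crit_u(7, 5, groups = 3)

# A stricter 1% significance level
crit_u(7, 5, groups = 3, conf = 0.99)
```

A pairwise comparison would be judged significant when the larger U-value from the U-test equals or exceeds the value returned.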
Harmonic mean
The harmonic mean is easy to determine:
$$n = \frac{2 n_1 n_2}{n_1 + n_2}$$
Calculating the harmonic mean of two values, where n1 and n2 are the two sample sizes.
The harmonic mean is a way of overcoming differences in sample size. The more different the sample sizes the more unreliable this approach will be.
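A quick sketch of the calculation, using a hypothetical helper name (harmonic_mean2) so it is not confused with the custom function described later:

```r
# Harmonic mean of two sample sizes: 2 / (1/n1 + 1/n2)
harmonic_mean2 <- function(n1, n2) 2 / (1 / n1 + 1 / n2)

harmonic_mean2(5, 7)   # 5.833333
harmonic_mean2(6, 6)   # equal sizes give back the common size, 6
harmonic_mean2(2, 10)  # 3.333333, pulled towards the smaller sample
```

The last line illustrates the caution above: when sample sizes are very different, the harmonic mean sits much closer to the smaller one.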
Calculate Q directly
If you re-arrange the formula you can calculate a value for Q:
$$Q = \frac{U - n^2/2}{n \sqrt{(2n + 1)/24}}$$
Calculate a value for Q (Studentized range) from the result of a U-test.
You can now use the result of a U-test (use the larger of the two calculated U-values) to work out a value for Q. Now you can compare your Q-value to the critical value.
The Studentized range is a distribution built into the basic R program. This gives you a way to compute an exact p-value for the pairwise comparison.
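The p-value calculation can be sketched with ptukey(), the distribution function for the Studentized range in base R. The sketch assumes Q = (U - n^2/2) / (n * sqrt((2n + 1)/24)), with n the harmonic mean of the two sample sizes; the function name posthoc_p is hypothetical.

```r
# Hypothetical helper: post hoc p-value from the larger U-value of a pair
posthoc_p <- function(u, n1, n2, groups) {
  n <- 2 * n1 * n2 / (n1 + n2)                       # harmonic mean of sample sizes
  q <- (u - n^2 / 2) / (n * sqrt((2 * n + 1) / 24))  # assumed rearranged formula
  # Upper tail of the Studentized range distribution with infinite df
  ptukey(q, nmeans = groups, df = Inf, lower.tail = FALSE)
}

# U = 32.5 from samples of size 7 and 5, in an analysis with 3 groups
posthoc_p(32.5, n1 = 7, n2 = 5, groups = 3)
```

For these inputs the result should be close to the post hoc p-value of 0.0264 reported in the examples later in these notes.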
Custom R functions for Kruskal-Wallis post hoc
I’ve produced four custom functions for use with Kruskal-Wallis post hoc tests:
- h.mean() – Calculates the harmonic mean of two values.
- KW.post() – Calculates the post hoc results for a pairwise comparison of two samples from a larger dataset.
- KWp.post() – Calculates the exact p-value for a post hoc given a U-value, number of groups and sample sizes.
- KWu.post() – Calculates a critical U-value for a post hoc given a confidence level (default 95%), number of groups and sample sizes.
These functions are contained in a single file: KW posthoc.R. If you source() the file you will set up the functions and see a message giving some idea of what the functions do.
The h.mean() function
h.mean(n1, n2)
n1, n2 | Numerical values representing sample sizes of two samples. |
This function simply returns the harmonic mean of two numbers, i.e. the sample sizes of two samples.
h.mean(5, 7)
[1] 5.833333
The other post hoc functions carry out this calculation internally, but it can be “handy” to have it available separately.
The KW.post() function
KW.post(x, y, data)
x, y | Numeric samples to compare. |
data | The data object that contains the samples. It is assumed that the data are in sample format, with multiple columns each representing a separate sample. |
The function returns several results as a list:
- uval – The calculated U value for the pairwise comparison (the larger of the two calculated values).
- crit – The critical U value at 95%.
- value – The exact probability for the pairwise comparison.
The function also displays the results to the console, even if you assign the result to a named object.
hog3
  Upper Mid Lower
1     3   4    11
2     4   3    12
3     5   7     9
4     9   9    10
5     8  11    11
6    10  NA    NA
7     9  NA    NA
KW.post(Upper, Lower, data = hog3)
Data: hog3
Pairwize comparison of: Upper and Lower
U-value: 32.5
U-crit (95%): 31.06011
Post-hoc p-value: 0.02641968
If your data are in scientific recording layout, that is you have a response variable and a predictor variable, then you need a slightly different approach. You’ll have to work out a U-test result first, then run the KWp.post() and/or KWu.post() commands (see below).
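The two layouts hold the same information, and one can be converted to the other. As a rough sketch, stack() from base R turns the multi-column sample format into recording layout; the column names count and site are chosen here to match the hog2 example and are otherwise arbitrary.

```r
# Sample format: one column per sample, padded with NA (values from hog3)
hog3 <- data.frame(
  Upper = c(3, 4, 5, 9, 8, 10, 9),
  Mid   = c(4, 3, 7, 9, 11, NA, NA),
  Lower = c(11, 12, 9, 10, 11, NA, NA)
)

# Recording layout: one response column and one predictor column
hog_long <- na.omit(stack(hog3))        # stack, then drop the NA padding
names(hog_long) <- c("count", "site")   # stack() names them 'values' and 'ind'

table(hog_long$site)  # sample sizes per site
```

Note that stack() keeps the factor levels in column order (Upper, Mid, Lower) rather than alphabetical order, which affects which sample is treated as the first one in later tests.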
The KWp.post() function
KWp.post(Uval, grp, n1, n2)
Uval | A calculated U-value for the pairwise comparison (the larger of the two calculated values). |
grp | The number of groups (samples) in the original Kruskal-Wallis test. |
n1, n2 | The sample sizes of the two groups being analysed. |
This function returns an exact p-value for a post hoc analysis. The value is returned immediately.
KWp.post(18, grp = 3, 7, 5)
Post-hoc p-value: 0.9851855
You can carry out a wilcox.test() on two samples to obtain a U-value. You need to know the sample sizes because you need the larger of the two U-values.
The wilcox.test() only gives one value, so you must work out whether the value you got was the larger one. It so happens that:
n1 * n2 = U1 + U2
This means that you can work out the alternative U-value easily if you know sample sizes.
If your data are in recording layout, with a predictor and response, you’ll need to use the subset parameter to carry out the pairwise test:
hog2
   count  site
1      3 Upper
2      4 Upper
3      5 Upper
4      9 Upper
5      8 Upper
6     10 Upper
7      9 Upper
8      4   Mid
9      3   Mid
10     7   Mid
11     9   Mid
12    11   Mid
13    11 Lower
14    12 Lower
15     9 Lower
16    10 Lower
17    11 Lower
wilcox.test(count ~ site, data = hog2, subset = site %in% c("Upper", "Lower"))
Wilcoxon rank sum test with continuity correction
data: count by site
W = 32.5, p-value = 0.01732
alternative hypothesis: true location shift is not equal to 0
Check the sample sizes:
replications(count ~ site, data = hog2)
$site
site
Lower   Mid Upper
    5     5     7
Now you can see if the U-value you got was the larger of the two:
5*7 - 32.5
[1] 2.5
The alternative value (2.5) is smaller, so 32.5 is the larger U-value and you can use it in the KWp.post() function:
KWp.post(32.5, grp = 3, 5, 7)
Post-hoc p-value: 0.02641968
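Note that wilcox.test() reports the statistic for the first sample. With the response ~ predictor format this is the first level of the predictor (levels are in alphabetical order by default); if you run the command on separate samples it is the first sample you specify. The returned value is not necessarily the larger U-value, so always check it against n1 * n2.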
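Since the reported statistic depends on the order of the samples, one option is to compute both U-values and keep the larger. A minimal sketch, using the Upper and Lower counts from the example and passing exact = FALSE only to avoid warnings about tied values:

```r
upper <- c(3, 4, 5, 9, 8, 10, 9)
lower <- c(11, 12, 9, 10, 11)

# W as reported by wilcox.test() relates to the first sample given (upper)
w <- unname(wilcox.test(upper, lower, exact = FALSE)$statistic)

# The two U-values sum to n1 * n2, so the alternative follows directly
u_other <- length(upper) * length(lower) - w

# The larger of the two is the one to use in the post hoc calculation
u_large <- max(w, u_other)
u_large
```

Here the statistic reported for upper is 2.5, so the larger U-value is 35 - 2.5 = 32.5, the same value obtained from the formula interface above.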
The KWu.post() function
KWu.post(CI = 0.95, grp, n1, n2)
CI = 0.95 | The confidence level, which defaults to 0.95. This is equivalent to a significance level of p = 0.05. |
grp | The number of groups (samples) in the original Kruskal-Wallis test. |
n1, n2 | The sample sizes of the two groups being analysed. |
This function returns a U-value for a given confidence level. You supply the number of groups (samples) in the original Kruskal-Wallis test and the sizes of the two samples being compared. The result is a critical value, which means you can carry out the wilcox.test() and compare the resulting U-value to this critical value.
KWu.post(CI = c(0.95, 0.99), grp = 3, n1 = 5, n2 = 5)
Post-hoc critical U value: 23.71961 26.44729
In the example you can see that you can supply several confidence levels at once. Here the critical U-values for p = 0.05 and p = 0.01 are returned.
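Putting the pieces together, every pairwise comparison can be run in one pass. The sketch below does not use the custom functions from the file; it assumes the critical value has the form U = n^2/2 + Q * n * sqrt((2n + 1)/24) with n the harmonic mean of the two sample sizes, and the names samples and compare_pair are hypothetical.

```r
# The three samples from the examples, as a named list
samples <- list(
  Upper = c(3, 4, 5, 9, 8, 10, 9),
  Mid   = c(4, 3, 7, 9, 11),
  Lower = c(11, 12, 9, 10, 11)
)
k <- length(samples)  # number of groups in the original Kruskal-Wallis test

# Hypothetical helper: post hoc summary for one pair of named samples
compare_pair <- function(a, b) {
  x <- samples[[a]]
  y <- samples[[b]]
  n1 <- length(x)
  n2 <- length(y)
  w <- unname(wilcox.test(x, y, exact = FALSE)$statistic)
  u <- max(w, n1 * n2 - w)               # larger of the two U-values
  n <- 2 * n1 * n2 / (n1 + n2)           # harmonic mean of sample sizes
  root <- n * sqrt((2 * n + 1) / 24)
  q <- (u - n^2 / 2) / root              # Studentized range statistic
  data.frame(
    pair   = paste(a, b, sep = "-"),
    U      = u,
    U.crit = n^2 / 2 + qtukey(0.95, nmeans = k, df = Inf) * root,
    p      = ptukey(q, nmeans = k, df = Inf, lower.tail = FALSE)
  )
}

# Apply the helper to every pair of samples
pairs <- combn(names(samples), 2)
results <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(i) {
  compare_pair(pairs[1, i], pairs[2, i])
}))
results
```

The Upper-Lower row should agree with the KW.post() example earlier (U = 32.5, critical U of about 31.06, p of about 0.026).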