# R code for Bonferroni

Born Mnncus, Enro PERITZ AND K. R. GABRIEL ..... modiﬁed Williams's statistic R, 1976. On closed testing procedures with special reference to Ordered Analysis of Variance. - Wharton ...

- http://webdev.cas.msu.edu/cas992/weeks/week10.html Multiple Comparisons
- Multiple Comparisons
- http://stat.ethz.ch/R-manual/R-patched/library/stats/html/00Index.html
- http://stat.ethz.ch/R-manual/R-patched/library/stats/html/p.adjust.html
- http://stat.ethz.ch Seminar for statistics

## Contents

## Code sent to Tom Uram & Editors

#This is the code for the workflow from h[3] DEf2 to the calculation of the Holm-Bonferroni test #p=series of n=4 probabilities max_i is the limit selected for pvalue=.05 or .10 #For DEf01d use output hcpval or bootpval (starred) in output h[4] for significance

m=4 #example for number of pvalues from h[3] DEf2 p=c(.005,.01,.03,.04) #example for sorted pvalues from h[3] DEf2 low to high a=p max_i=.05 for (j in 1:m ) { p[j]=max_i/(m-j+1)<p[j] #.005<(.05/4) .01<(.05/3) .03<(.05/2) .04<(.05/1) if (p[j]==1) { b=a[j] } #accept null i.e., level at which observed pvalue is not significant in H-B group test #type p - the first 1 in the series signals the cutoff probability that is still significant #type b - is that cutoff probability } p #[1] 0 0 1 0 # 0 0 1 0 test says null hypothesis rejected/rejected/accepted --> accepted (i.e., accept at the level of the first null accepted, which is pvalue=.03 b #[1] 0.03

(The first workflow) Holm-Bonferroni

P.S. If the Bonferroni doesnt work for you send me an email with new values as in my example, after you have rank ordered the significance test.

- In the example, with four values, variables are significant in the group test if they are p=.03 or less.

## R code obsolete

http://stat.ethz.ch/R-manual/R-patched/library/stats/html/p.adjust.html library(stats) p<-data.frame(v=1:4,ch=c(.0001,.001,.002,.04)) #<-- drw example p v ch 1 1 1e-04 2 2 1e-03 3 3 2e-03 4 4 4e-02

p.adjust.methods # c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none")

--Examples RUN ALL AT ONCE--

require(graphics)

set.seed(123) x <- rnorm(50, mean = c(rep(0, 25), rep(3, 25))) p <- 2*pnorm(sort(-abs(x)))

round(p, 3) round(p.adjust(p), 3) #round(p.adjust(p, "BH"), 3) round(p.adjust(p, "holm"), 3)

## or all of them at once (dropping the "fdr" alias): p.adjust.M <- p.adjust.methods[p.adjust.methods != "fdr"] p.adj <- sapply(p.adjust.M, function(meth) p.adjust(p, meth)) p.adj.60 <- sapply(p.adjust.M, function(meth) p.adjust(p, meth, n = 60)) stopifnot(identical(p.adj[,"none"], p), p.adj <= p.adj.60) round(p.adj, 3) ## or a bit nicer: noquote(apply(p.adj, 2, format.pval, digits = 3))

## and a graphic: matplot(p, p.adj, ylab="p.adjust(p, meth)", type = "l", asp = 1, lty = 1:6, main = "P-value adjustments") legend(0.7, 0.6, p.adjust.M, col = 1:6, lty = 1:6) #DOES GRAPHIC

## Can work with NA's: pN <- p; iN <- c(46, 47); pN[iN] <- NA pN.a <- sapply(p.adjust.M, function(meth) p.adjust(pN, meth)) ## The smallest 20 P-values all affected by the NA's : round((pN.a / p.adj)[1:20, ] , 4)

## leftover

p.adjust.M <- p.adjust.methods[p.adjust.methods != "fdr"] p.adjust.M [1] "holm" "hochberg" "hommel" "bonferroni" "BH" "BY" "none"

p.adj <- sapply(p.adjust.M, function(meth) p.adjust(p, meth)) p.adj.60 <- sapply(p.adjust.M, function(meth) p.adjust(p, meth, n = 60)) stopifnot(identical(p.adj[,"none"], p), p.adj <= p.adj.60) round(p.adj, 3)

require(graphics)

set.seed(123) x <- rnorm(50, mean = c(rep(0, 25), rep(3, 25))) p <- 2*pnorm(sort(-abs(x)))

round(p, 3) round(p.adjust(p), 3) round(p.adjust(p, "BH"), 3)

## Tutorial

http://rtutorialseries.blogspot.com/2011/03/r-tutorial-series-anova-pairwise.html #Bonferroni adjustment pairwise.t.test(dataPairwiseComparisons$StressReduction, dataPairwiseComparisons$Treatment, p.adj = "bonferroni") #Error in factor(g) : object 'dataPairwiseComparisons' not found

## UCLA

http://www.ats.ucla.edu/stat/r/faq/posthoc.htm hsb2<-read.table("http://www.ats.ucla.edu/stat/r/notes/hsb2.csv", sep=",", header=T) attach(hsb2)

## Other

http://www.stat.wisc.edu/~yandell/st571/R/append12.pdf

## Inanswered email to Pat Gray

You're the first to see the talk on CoSSci we're giving at INSNA next month

But ... in the new setup I'm trying to take the signficance tests for the indepvar in a DEf model, sort them, and run a holm (-bonferroni) test (e.g., "The Holm adjustment sequentially compares the lowest p-value with a Type I error rate that is reduced for each consecutive test. In our case, this means that our first p-value is tested at the .05/3 level (.017), second at the .05/2 level (.025), and third at the .05/1 level (.05). This method is generally considered superior to the Bonferroni adjustment and can be employed using p.adj = "holm" in the pairwise.t.test() function.)

There is a nice R function for this: p.adjust {stats} R Documentation Adjust P-values for Multiple Comparisons

This runs for me below giving a graph. But all I want is the holm adjustment, a single level of significance, and simply one number, without the graphics. This will simple be another labeled piece of output.

I'm waiting for the new CoSSci quick-turnaround DEf modeling window to be programmed soon at a VM attached to the main Galaxy but not sending data into the queue at SDSC Trestles. The user enters variables at the window shown in the *.pptx and results come back even more quickly than at a mac at home. There are many interactions and usually 12-20 variables at the end of many 30 iteractive modeling improvements. Then final models and bigger projects will go thru the queue at the supercomputer.

When the VM is appropriately programed which should be in a few weeks or so, then I am going to do the next big round of invitations to authors (if they are doing DEf models in some form) because the new VM coupled with the CoSSci Galaxy will make their modeling so much easier and good results easier to obtain. But both the holm-Bonferroni tests for multiple variables and the MCMCregress (package MCMCpack) Bayesian goodness-of-fit tests will be needed to avoid overfitting. So this is one of them that needs to be simplified (below) and its still a bit above my pay grade. If you can help, again, very much appreciated.

BTW what do you think about inviting many of the authors who have published new codes at World Cultures to get invitations to focus on several of their variables and do a chapter which brings out new results, issues, questions?

-- best, Doug White http://intersci.ss.uci.edu/wiki/index.php/DW_home

http://stat.ethz.ch/R-manual/R-patched/library/stats/html/p.adjust.html library(stats) p<-data.frame(v=1:4,ch=c(.0001,.001,.002,.04)) #<-- drw example p v ch 1 1 1e-04 2 2 1e-03 3 3 2e-03 4 4 4e-02

p.adjust.methods # c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none")

--Examples RUN ALL AT ONCE--

require(graphics)

set.seed(123) x <- rnorm(50, mean = c(rep(0, 25), rep(3, 25))) p <- 2*pnorm(sort(-abs(x)))

round(p, 3) round(p.adjust(p), 3) #round(p.adjust(p, "BH"), 3) round(p.adjust(p, "holm"), 3)

## or all of them at once (dropping the "fdr" alias): p.adjust.M <- p.adjust.methods[p.adjust.methods != "fdr"] p.adj <- sapply(p.adjust.M, function(meth) p.adjust(p, meth)) p.adj.60 <- sapply(p.adjust.M, function(meth) p.adjust(p, meth, n = 60)) stopifnot(identical(p.adj[,"none"], p), p.adj <= p.adj.60) round(p.adj, 3) ## or a bit nicer: noquote(apply(p.adj, 2, format.pval, digits = 3))

## and a graphic: matplot(p, p.adj, ylab="p.adjust(p, meth)", type = "l", asp = 1, lty = 1:6, main = "P-value adjustments") legend(0.7, 0.6, p.adjust.M, col = 1:6, lty = 1:6) #DOES GRAPHIC

## Can work with NA's: pN <- p; iN <- c(46, 47); pN[iN] <- NA pN.a <- sapply(p.adjust.M, function(meth) p.adjust(pN, meth)) ## The smallest 20 P-values all affected by the NA's : round((pN.a / p.adj)[1:20, ] , 4)