Pooled ANOVA

TitlePooled ANOVA
Publication TypeJournal Articles
Year of Publication2008
AuthorsLast M, Luta G, Orso A, Porter A, Young S
JournalComputational Statistics & Data Analysis
Pagination5215 - 5228
Date Published2008/08/15/
ISBN Number0167-9473

We introduce Pooled ANOVA, a greedy algorithm to sequentially select the rare important factors from a large set of factors. Problems such as computer simulations and software performance tuning involve a large number of factors, few of which have an important effect on the outcome or performance measure. We pool multiple factors together, and test the pool for significance. If the pool has a significant effect we retain the factors for deconfounding. If not, we either declare that none of the factors are important, or retain them for follow-up decoding, depending on our assumptions and stage of testing. The sparser important factors are, the bigger the savings. Pooled ANOVA requires fewer assumptions than other, similar methods (e.g. sequential bifurcation), such as not requiring all important effects to have the same sign. We demonstrate savings of 25%-35% when compared to a conventional ANOVA, and also the ability to work in a setting where Sequential Bifurcation fails.