Journal of the Royal Statistical Society: Series B (Statistical Methodology)

Volume 63 Issue 2, Pages 411 - 423

Published Online: 6 Jan 2002

© 2010 The Royal Statistical Society and Blackwell Publishing Ltd



Estimating the number of clusters in a data set via the gap statistic
Robert Tibshirani, Guenther Walther & Trevor Hastie
Stanford University, USA
Correspondence to: Robert Tibshirani
Copyright 2001 Royal Statistical Society
KEYWORDS
Clustering • Groups • Hierarchy • Uniform distribution

ABSTRACT

We propose a method (the 'gap statistic') for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. K-means or hierarchical), comparing the change in within-cluster dispersion with that expected under an appropriate reference null distribution. Some theory is developed for the proposal and a simulation study shows that the gap statistic usually outperforms other methods that have been proposed in the literature.
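
The abstract gives no formulas, so the following is only a minimal sketch of the comparison it describes, under stated assumptions. It takes the within-cluster dispersion W_k to be the pooled sum of squared distances to cluster means, draws the reference null distribution uniformly over the bounding box of the data, defines Gap(k) as the mean of log W_k over the reference sets minus log W_k for the data, and picks the smallest k with Gap(k) >= Gap(k+1) - s_(k+1). This is the commonly cited formulation of the gap statistic, not text taken from this page. All function names are hypothetical, and the clustering step is a plain Lloyd-style K-means with farthest-first seeding, chosen only to keep the example free of dependencies; any clustering algorithm could be substituted.

```python
import math
import random


def kmeans_labels(points, k, rng, n_iter=30):
    """Lloyd's algorithm with farthest-first seeding; returns one label per point."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centre if a cluster empties
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def log_within_dispersion(points, labels, k):
    """log W_k, with W_k the pooled sum of squared distances to cluster means (assumed form)."""
    total = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            mean = tuple(sum(axis) / len(members) for axis in zip(*members))
            total += sum(math.dist(p, mean) ** 2 for p in members)
    return math.log(total)


def best_log_w(points, k, rng, n_starts=3):
    """Smallest log W_k found over a few K-means restarts."""
    return min(
        log_within_dispersion(points, kmeans_labels(points, k, rng), k)
        for _ in range(n_starts)
    )


def gap_statistic(points, k_max, n_ref=10, seed=0):
    """Return (gaps, errs); gaps[k - 1] is Gap(k) and errs[k - 1] is s_k."""
    rng = random.Random(seed)
    lows = [min(axis) for axis in zip(*points)]
    highs = [max(axis) for axis in zip(*points)]
    # Reference sets: uniform draws over the bounding box of the data (assumption).
    refs = [
        [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs)) for _ in points]
        for _ in range(n_ref)
    ]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        ref_logs = [best_log_w(ref, k, rng) for ref in refs]
        mean_ref = sum(ref_logs) / n_ref
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_ref)
        gaps.append(mean_ref - best_log_w(points, k, rng))
        errs.append(sd * math.sqrt(1 + 1 / n_ref))
    return gaps, errs


def choose_k(gaps, errs):
    """Smallest k with Gap(k) >= Gap(k+1) - s_(k+1); falls back to the largest k tried."""
    for k in range(1, len(gaps)):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k
    return len(gaps)
```

A call such as `choose_k(*gap_statistic(points, k_max=5))`, with `points` a list of coordinate tuples, returns the estimated number of clusters. The number of reference sets, the restart count and the seed are illustrative defaults, not values from the paper.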


DIGITAL OBJECT IDENTIFIER (DOI)
10.1111/1467-9868.00293

