|
F. Chazal,
L. J. Guibas, S. Y. Oudot, P. Skraba. Persistence-Based
Clustering in Riemannian Manifolds. Journal of the ACM, volume 60, issue 6, article 41.
Abstract:
We present a novel clustering algorithm that combines a mode-seeking
phase with a cluster merging phase. While mode detection is performed
by a standard graph-based hill-climbing scheme, the novelty of our
approach resides in its use of topological persistence theory to guide
the merges between clusters. An interesting feature of our algorithm
is to provide additional feedback in the form of a finite set of
points in the plane, called a {\em persistence diagram}, which
provably reflects the prominence of each of the modes of the
density. Such feedback is an invaluable tool in practice, as it
enables the user to determine a set of parameter values that will make
the algorithm compute a relevant clustering on the next run.
In terms of generality, our approach requires the sole knowledge of
(approximate) pairwise distances between the data points, as well as
of rough estimates of the density at these points. It is therefore
virtually applicable in any arbitrary metric space. In the meantime,
its complexity remains reasonable: although the size of the input
distance matrix may be up to quadratic in the number of data points, a
careful implementation only uses a linear amount of main memory and
barely takes more time to run than the one spent reading the input.
Taking advantage of recent advances in topological persistence theory,
we are able to give a theoretically sound notion of what the "correct"
number k of clusters is, and to prove that under mild sampling
conditions and a relevant choice of parameters (made possible in
practice by the persistence diagram) our clustering scheme computes a
set of k clusters whose spatial locations are bound to the ones of the
basins of attraction of the peaks of the density. These guarantess
hold in a large variety of contexts, including when data points are
distributed along some unknown Riemannian manifold.
Bibtex:
@article{cgos-pbc-13,
keywords = "journal",
author = {Chazal, Fr{\'e}d{\'e}ric and Guibas, Leonidas J. and Oudot, Steve and Skraba, Primoz},
title = {Persistence-Based Clustering in {R}iemannian Manifolds},
journal = {Journal of the ACM},
issue_date = {November 2013},
volume = {60},
number = {6},
year = {2013},
pages = {1--38},
articleno = {41}
}
@inproceedings{cgos-pbc-11
, author = "F. Chazal and L. J. Guibas and S. Y. Oudot and P. Skraba"
, title = "Persistence-Based Clustering in Riemannian Manifolds"
, booktitle = "Proc. 27th Annu. ACM Sympos. on Comput. Geom."
, pages = "97--106"
, month = "June"
, year = 2011
}
|
 |