Dynamics and geometry from high dimensional data

Subsampling Methods for Persistent Homology

Frederic Chazal



Computational topology has recently developed significantly in the direction of data analysis, giving rise to Topological Data Analysis. Persistent homology is a fundamental tool in this field. It is usually computed from filtrations built on top of data sets sampled from some unknown (metric) space, and it provides "topological signatures", the so-called persistence diagrams, that reveal the structure of the underlying space. When the sample is large, direct computation of persistent homology often suffers from two issues. First, it becomes prohibitively expensive because of the combinatorial size of the filtrations involved; second, it is very sensitive to noise and outliers. The goal of this talk is to show that these issues can be overcome by computing persistence diagrams, and some related quantities, from several subsamples and combining them to efficiently infer robust and relevant topological information. This is joint work with B. Fasy, F. Lecci, B. Michel, A. Rinaldo, and L. Wasserman.
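To make the subsample-and-combine idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not necessarily the construction presented in the talk: it is restricted to 0-dimensional persistence of the Vietoris-Rips filtration (with an edge entering at its length), it takes the persistence landscape as the "related quantity", and it combines subsamples by a pointwise average. The function names `h0_deaths`, `landscape`, and `average_landscape` are hypothetical.

```python
import math
import random


def h0_deaths(points):
    """Death times of 0-dimensional Rips persistence (all births are 0).

    Assumption: an edge enters the filtration at its Euclidean length, so
    the deaths are the edge lengths of a minimum spanning tree, found here
    with Kruskal's algorithm and a union-find structure.
    """
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # two components merge, so one class dies here
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 finite deaths; the essential class is omitted


def landscape(deaths, grid, k=1):
    """k-th persistence landscape of the diagram {(0, d)}, evaluated on grid."""
    values = []
    for t in grid:
        tents = sorted((max(0.0, min(t, d - t)) for d in deaths), reverse=True)
        values.append(tents[k - 1] if len(tents) >= k else 0.0)
    return values


def average_landscape(points, n_subsamples, subsample_size, grid, seed=0):
    """Pointwise average of first landscapes over random subsamples."""
    rng = random.Random(seed)
    total = [0.0] * len(grid)
    for _ in range(n_subsamples):
        sub = rng.sample(points, subsample_size)  # draw without replacement
        lam = landscape(h0_deaths(sub), grid)
        total = [a + b for a, b in zip(total, lam)]
    return [a / n_subsamples for a in total]
```

Each subsample costs on the order of `subsample_size` squared pairwise distances rather than the square of the full sample size, which is where the computational saving comes from in this sketch. Averaging over many subsamples also limits the influence of any single outlier, since an outlier appears in only a fraction of the draws.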