Project overview

In recent years, there has been some interest in applying ideas from algebraic topology to problems in data analysis. One of the most common tools in this area is persistent homology. Here, the data to be analyzed is a finite metric space $X$. Starting with $X$ and a threshold distance $d$, we can form a topological space $X_d$; if $d_1 < d_2$ then $X_{d_1} \subset X_{d_2}$. By analyzing the homology groups $H_*(X_d)$ and their relation under the natural maps $H_*(X_{d_1}) \to H_*(X_{d_2})$, one hopes to extract meaningful statistical information about the set $X$. Statisticians are just beginning to use these ideas in data analysis, in applications as diverse as neuroscience and cancer research.
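To make the construction concrete, here is a minimal sketch of the simplest invariant, the rank of $H_0(X_d)$. It assumes that $X_d$ is built by joining two points whenever their distance is at most $d$ (as in a Vietoris-Rips complex; the text above does not fix a particular construction), so that the rank of $H_0(X_d)$ is the number of connected components of the resulting graph. The function name `count_components` and the four-point example are illustrative choices only.

```python
from itertools import combinations


def count_components(dist, d):
    """Count connected components of the threshold graph at scale d.

    dist is a symmetric n x n matrix (list of lists) of pairwise distances.
    Assumption: points i and j are joined when dist[i][j] <= d.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = n
    for i, j in combinations(range(n), 2):
        if dist[i][j] <= d:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj  # merge the two components
                components -= 1
    return components


# Illustrative data: four points on a line at positions 0, 1, 3, 7.
points = [0.0, 1.0, 3.0, 7.0]
dist = [[abs(a - b) for b in points] for a in points]

# Rank of H_0(X_d) at increasing thresholds; it can only decrease as d grows.
betti0 = [count_components(dist, d) for d in (0.5, 1.0, 2.0, 4.0)]
print(betti0)  # [4, 3, 2, 1]
```

The component count can only drop as $d$ increases, which reflects the inclusions $X_{d_1} \subset X_{d_2}$: components merge but never split.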

A natural question is what sort of patterns this process turns up when there is no meaningful information to be found, i.e. when $X$ is chosen at random. In the simplest case (persistent $H_0$), this question amounts to studying the number of components of an Erdős-Rényi random graph, a problem that has been extensively studied by graph theorists. More generally, it is interesting to explore what these methods return when they are applied to random datasets. A variety of approaches to this topic are possible, both theoretical, using the theory of random graphs and random simplicial complexes, and experimental, using computer simulation.
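As an illustration of the experimental approach, the following sketch estimates the expected number of components of an Erdős-Rényi random graph by simulation. It assumes the $G(n, p)$ model, in which each of the possible edges on $n$ vertices is included independently with probability $p$. The function names, parameter values and seed are hypothetical choices made for this example, not part of the project description.

```python
import random


def er_component_count(n, p, rng):
    """Sample G(n, p) and return its number of connected components.

    Each of the n*(n-1)/2 possible edges is included independently
    with probability p; components are tracked with union-find.
    """
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    components = n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # edge {i, j} is present
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    components -= 1
    return components


def mean_components(n, p, trials, seed=0):
    """Average the component count over independent samples of G(n, p)."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    total = sum(er_component_count(n, p, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    n = 200
    # Edge probabilities expressed as multiples of 1/n (illustrative values).
    for c in (0.5, 1.0, 2.0, 5.0):
        print(c, mean_components(n, c / n, trials=20))
```

Sweeping $p$ from 0 to 1 in such an experiment plays the role of sweeping the threshold $d$: the curve of component counts is what persistent $H_0$ would record for data with no structure.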

