# TDA: TP 1

For these exercises, we will use the library [Gudhi](https://gudhi.inria.fr/). You can install it with `conda` or `pip` (on new Apple laptops, only `conda` is available). Let us check that you have a recent version (current is 3.6.0). If for some reason you end up with an old version, you can try to request the version explicitly: `gudhi=3.6.0` with `conda`, or `gudhi==3.6.0` with `pip`.

In [None]:
import gudhi as gd
print(gd.__version__)

In [None]:
# We will need a few other libraries
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.cm
# To get nice interactive plots. May require installing ipympl and restarting the kernel.
# Other possibilities than notebook include qt, osx, tk... If nothing works, just remove this line
%matplotlib notebook

The documentation for the Python interface of Gudhi is [here](https://gudhi.inria.fr/python/latest/). There are also some [tutorials](https://github.com/GUDHI/TDA-tutorial), an [issue tracker](https://github.com/GUDHI/gudhi-devel/issues), a [mailing-list](https://sympa.inria.fr/sympa/arc/gudhi-users/), etc.

## Functions, cubical complex
### Volcano
Let us first define a function from $\mathbb{R}^2$ to $\mathbb{R}$.

In [None]:
grid = np.linspace(-1,1,100)
gridx = grid[:,np.newaxis]
gridy = grid[np.newaxis,:]
sq = - gridx**2 - gridy**2
volcano = np.exp(sq) - 0.7 * np.exp(sq*4)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(gridx, gridy, volcano, cmap=matplotlib.cm.coolwarm)

Now that we have a function, we can build a filtered [cubical complex](https://gudhi.inria.fr/python/latest/cubical_complex_ref.html) from it and compute the persistence diagram of its **sub**levelsets.

In [None]:
cplx = gd.CubicalComplex(top_dimensional_cells=volcano)
diag = cplx.persistence()
gd.plot_persistence_barcode(diag, legend=True)
gd.plot_persistence_diagram(diag, legend=True)
# red is dimension 0, blue is dimension 1

(The parameter is named `top_dimensional_cells` because Gudhi does the opposite of what we saw in class: it assigns the grid values to the top-dimensional cells and deduces the values of the other cells, instead of assigning values to the vertices and deducing the values of the other cells. The difference is not important here.)
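
To make this convention concrete, here is a minimal NumPy-only sketch of how values given on the squares of a 2d grid induce values on the vertices: each vertex receives the minimum over the squares that contain it. This is only an illustration of the rule, not Gudhi's actual implementation, and the helper name `vertex_values_from_top_cells` is made up for this example.

```python
import numpy as np

def vertex_values_from_top_cells(cells):
    """Value of each grid vertex = min over the (up to 4) squares touching it."""
    cells = np.asarray(cells, dtype=float)
    # Pad with +inf so that border vertices only see the squares that exist.
    padded = np.pad(cells, 1, constant_values=np.inf)
    # Vertex (i, j) touches squares (i-1, j-1), (i-1, j), (i, j-1) and (i, j).
    return np.minimum.reduce([
        padded[:-1, :-1], padded[:-1, 1:], padded[1:, :-1], padded[1:, 1:],
    ])

# Hypothetical 2x2 grid of squares, hence 3x3 vertices.
cells = np.array([[3.0, 1.0],
                  [4.0, 2.0]])
vertices = vertex_values_from_top_cells(cells)
print(vertices)
```

Edges follow the same rule, with the minimum taken over the one or two squares adjacent to each edge.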

Compare the 2 plots. Why do we only see 3 red points but 5 red bars? Why are more than 60 of the bars invisible? You can print `diag` to help, the format is a list of `(dimension, (birth, death))`.
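
If the raw printout of `diag` is hard to read, a few lines of plain Python can summarize it. The sketch below works on a small hand-written list in the same `(dimension, (birth, death))` format. The values are hypothetical and are not the ones you will get for the volcano, and the threshold `1e-3` is an arbitrary choice.

```python
import math
from collections import Counter

# Hypothetical values, only to illustrate the (dimension, (birth, death)) format.
example_diag = [
    (0, (0.30, math.inf)),   # a class that never dies
    (0, (0.31, 0.31005)),    # a very short bar
    (0, (0.45, 0.62)),
    (1, (0.55, 0.70)),
]

# Number of bars in each dimension.
bars_per_dim = Counter(dim for dim, _ in example_diag)
# Length of each bar (infinite for a class that never dies).
lengths = [(dim, death - birth) for dim, (birth, death) in example_diag]
# Bars too short to be visible on a plot, for an arbitrary threshold.
short = [(dim, length) for dim, length in lengths if length < 1e-3]

print(dict(bars_per_dim))
print(short)
```

Replacing `example_diag` with your own `diag` should let you apply the same summary to the real diagram.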

Now compute the persistence diagram of the **super**levelsets of this function (hint: there is no direct function to do that, only sublevelsets).

What happened to the point corresponding to the crater of the volcano between the sub- and super-levelsets?

### 1d function
In class, we looked at the function $f: t \mapsto \sin(t)+\sin(2t)$.

Build a table with 200 values of f between 0 and $2\pi$. Plot the function, compute the persistence diagram of its sublevelsets, and draw its persistence diagram.
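
As a possible starting point, the table itself only needs NumPy. The sketch below builds it and, as a sanity check before computing any persistence, counts the strict local minima among the interior samples. The names `t`, `f` and `interior_minima` are choices made for this example (a later cell of this notebook does expect the table to be called `f`).

```python
import numpy as np

# 200 sample positions between 0 and 2*pi, and the values of f there.
t = np.linspace(0, 2 * np.pi, 200)
f = np.sin(t) + np.sin(2 * t)

# A sample is a strict interior local minimum if it is below both neighbours.
is_min = (f[1:-1] < f[:-2]) & (f[1:-1] < f[2:])
interior_minima = t[1:-1][is_min]
print(len(f), interior_minima)
```

The plot and the persistence computation are left to you; the one-dimensional array `f` can be passed to the same cubical complex class as the volcano.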

What happens if we consider a longer range instead of `[0, 2π]`?

We will reuse this function later.

## Point sets
### Torus
As in class, we first consider a set of points regularly spaced along a curve drawn on a torus. For simplicity, we embed this torus in $\mathbb{R}^4$.

In [None]:
a = np.linspace(0, 2*np.pi, 50, False)
b = np.stack((np.cos(a),np.sin(a),np.cos(5*a),np.sin(5*a)),axis=-1)
# Plot the points on the unwrapped torus
plt.figure()
plt.scatter(a, 5*a % (2*np.pi))
plt.show()

We now compute the persistence of the Čech filtration of these points. We actually use an α-complex for that. Notice that the data structure used to represent a simplicial complex in Gudhi is called `SimplexTree`.

In [None]:
cplx = gd.AlphaComplex(points=b).create_simplex_tree()
p = cplx.persistence()
# print only the most persistent features
print([(dim,(birth,death)) for (dim,(birth,death)) in p if death - birth > .1])
gd.plot_persistence_diagram(p, legend=True)

Can you recognize the features of a torus here? Is there anything extra?

Now try doing the same computation, but instead of using an α-complex we will approximate the Čech complex with a Rips complex ([doc](https://gudhi.inria.fr/python/latest/rips_complex_ref.html)). What happens if you only change the name of the class? While the α-complex naturally has the ambient dimension, the Rips complex may be built up to an arbitrary dimension, so you need to specify a `max_dimension`. Can you still see the 3-sphere? What happens if you specify a larger dimension, say `max_dimension=5`?

Once the diagram has been computed with a call to [`persistence()`](https://gudhi.inria.fr/python/latest/simplex_tree_ref.html#gudhi.SimplexTree.persistence) (or just [`compute_persistence()`](https://gudhi.inria.fr/python/latest/simplex_tree_ref.html#gudhi.SimplexTree.compute_persistence) if you do not need the diagram in the form `persistence()` returns), you can get the points of the persistence diagram of dimension i as a convenient (n,2) numpy array:

In [None]:
cplx.persistence_intervals_in_dimension(1)
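
The (n,2) layout makes it easy to post-process the intervals with NumPy. The sketch below uses a small hand-written array with hypothetical values in that layout (one row per `(birth, death)` pair) to compute lifetimes and to pick the most persistent finite interval.

```python
import numpy as np

# Hypothetical intervals, one row per (birth, death).
intervals = np.array([[0.10, 0.90],
                      [0.20, 0.25],
                      [0.30, np.inf]])

# Lifetime of each interval (infinite when death is infinite).
lifetimes = intervals[:, 1] - intervals[:, 0]
finite = np.isfinite(lifetimes)
# Most persistent interval among the finite ones.
most_persistent = intervals[finite][np.argmax(lifetimes[finite])]

print(lifetimes)
print(most_persistent)
```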

### Time series
Let us go back to the function $f$ defined above. We saw in class that we can turn it into a 2d point cloud with nice loops using a time-delay embedding ([doc](https://gudhi.inria.fr/python/latest/point_cloud.html#time-delay-embedding)), so we try that.

In [None]:
from gudhi.point_cloud.timedelay import TimeDelayEmbedding
f2 = TimeDelayEmbedding(dim=2)(f)
plt.figure()
plt.scatter(f2[:,0],f2[:,1])
plt.show()

Hmm, those loops are way too squished to see anything. Can you fix it?
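
To see what the embedding computes, here is a NumPy-only sketch of a time-delay embedding on a toy series. The function `delay_embedding` is a hypothetical helper written for this example, not Gudhi's class, and it ignores options such as subsampling. Row `i` of the output collects the samples `x[i], x[i + delay], ...`.

```python
import numpy as np

def delay_embedding(x, dim=2, delay=1):
    """Row i is (x[i], x[i + delay], ..., x[i + (dim - 1) * delay])."""
    x = np.asarray(x)
    # Number of rows for which all the shifted samples exist.
    n = len(x) - (dim - 1) * delay
    return np.stack([x[k * delay : k * delay + n] for k in range(dim)], axis=-1)

toy = np.arange(6)
pairs_delay_1 = delay_embedding(toy, dim=2, delay=1)
pairs_delay_2 = delay_embedding(toy, dim=2, delay=2)
print(pairs_delay_1)
print(pairs_delay_2)
```

Looking at which samples end up paired together for a finely sampled smooth function may help explain the shape of the plot above.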

Once the figure looks nice, compute or approximate the persistence diagram of the Čech filtration of this point set (here you have several choices). Dimension 1 seems the most relevant. Does the result have any connection with the diagram of the sublevelset computed at the beginning?

## Manually creating a filtration
If you are not satisfied with the existing filtrations (`AlphaComplex`, `RipsComplex`, etc.), you can also construct a simplicial complex by hand, specifying each simplex and its filtration value.

Create an empty simplicial complex ([doc](https://gudhi.inria.fr/python/latest/simplex_tree_ref.html)) and `insert` a few simplices. You can see the list of simplices in your complex using `list(cplx.get_simplices())`. Notice that when you `insert` a simplex with filtration value `f`, the library helpfully ensures that the faces of this simplex are also present with a filtration value at most `f`. The function `assign_filtration` can be useful to change the filtration value of a simplex, but it does not provide this safety net.

The following code constructs a small torus.
![TorusTriangle.png](attachment:TorusTriangle.png)
In this example, we do not care about filtration values (everything is inserted at time 0), only the topology of the full complex.

In [None]:
cplx = gd.SimplexTree()
for i in range(3):
    ii = (i + 1) % 3
    for j in range(3):
        jj = (j + 1) % 3
        cplx.insert([3*i+j, 3*ii+jj, 3*ii+j])
        cplx.insert([3*i+j, 3*ii+jj, 3*i+jj])
print(list(cplx.get_simplices()))
# Homology in dim p uses simplices of dim p and p+1, so by default persistence is only computed
# up to (dimension of the complex) - 1. persistence_dim_max=True also computes the top dimension.
cplx.compute_persistence(persistence_dim_max=True)
print("The number of holes of dimension [0, 1, 2] are", cplx.betti_numbers())

By gluing some "spheres" and "circles" (the boundary of a simplex for instance is a *topological* sphere), can you build a simplicial complex that has the same homology as the torus but looks nothing like a torus?

## Rips filtration and higher dimensions
In class, we saw that the Rips complex is not embedded, it is defined as an abstract simplicial complex. Here we will see that the higher dimensional simplices that appear can actually generate some topology.

Generate 20 points evenly spaced on a circle in ℝ². Build the Rips filtration on those points up to dimension 8, and plot its persistence diagram. What do you notice? You can try again with 21, 22, 23 or 24 points. You probably shouldn't try to increase the dimension or the number of points too much as it will quickly fill the memory on your computer.

## Distance and stability
### Point sets
Let us consider again the point set on a curve on a torus in $\mathbb{R}^4$, and compute the persistence diagram of dimension 1 of its Rips filtration. Now perturb each point randomly by a small noise, and compute the persistence diagram of dimension 1 of these new points. Compute the [bottleneck distance](https://gudhi.inria.fr/python/latest/bottleneck_distance_user.html#gudhi.bottleneck_distance) between these diagrams. Retry it a few times, maybe also with dimension 0 or 2. Can you confirm the stability result?
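
For the perturbation step, it helps to control how far each point moves, since stability results bound the distance between diagrams in terms of the displacement of the points. The sketch below is one possible way to do it with NumPy: every point is moved by at most `eps`, where the value `0.05` and the names `points`, `noisy` and `max_shift` are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same point set as in the torus section.
a = np.linspace(0, 2 * np.pi, 50, endpoint=False)
points = np.stack((np.cos(a), np.sin(a), np.cos(5 * a), np.sin(5 * a)), axis=-1)

eps = 0.05  # hypothetical noise amplitude
# Random directions, rescaled so that each point moves by at most eps.
noise = rng.uniform(-1, 1, size=points.shape)
radii = eps * rng.uniform(0, 1, size=(len(points), 1))
noise *= radii / np.linalg.norm(noise, axis=1, keepdims=True)
noisy = points + noise

# Largest displacement: an upper bound on the Hausdorff distance between the two sets.
max_shift = np.linalg.norm(noisy - points, axis=1).max()
print(max_shift)
```

You can then compare `max_shift` with the bottleneck distance between the diagrams of `points` and `noisy`.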

Now do the same experiment with the α-complex. Beware: there is an even worse trap here than with the Rips complex.

In class, we said that Rips and Čech filtrations have close persistence diagrams in log-scale. Can you illustrate that on this dataset?

### Functions
Similarly, illustrate the stability property on one of the functions seen above (either the volcano, or the curve).