Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiparameter simplextree #817

Open
wants to merge 35 commits into
base: master
Choose a base branch
from
Open

Conversation

DavidLapous
Copy link
Member

@DavidLapous DavidLapous commented Feb 15, 2023

This is a basic implementation of multiparameter simplex trees (for python), via a new (python) class : SimplexTreeMulti.
This code is a cleaned version of my multipers repository.
It is basically the same simplextree as the standard python one, but with Filtration_value being a vector instead of a real number.
Key points:

  • All (finitely)-critical multi-filtration, with any (finite) number of parameter can be encoded by this class.
  • For non-finitely critical multi-filtration (e.g. Multicover bifiltration, degree rips), we will need another structure.
  • Edge collapses are implemented for 2-parameter 1-critical filtrations, using filtration_domination.
  • Filtration values can be projected to a grid (for, e.g. more edge collases) on 1-dimensional lines (to compute the standard persistence)
  • A SimplexTreeMulti can be exported as a multi-filtration in a scc file, that can be imported afterward using, e.g., Rivet or mpfree.
  • This class alone is not really useful (yet), as there is no "persistence related function" in this PR, apart from the Euler characteristic.

The attached jupyter notebook showcases the main functions (sorry for the format, github doesn't allow .ipynb files).

EDIT: Forgot the descriptions of files.
New file description and file modifications:

  • Simplextree.h
    • Due to the non-constant size of filtration (for multicritical filtration) I need to store an integer for the number of dimension
  • Simplex_tree_node_explicit_storage.h
    • an initialization was =0 which is not possible with vector-filtrations
  • simplex_tree_multi.pyx
    • Essentially a copy of simplex_tree.pyx. Cannot be shared with this file, unless we merge the two classes (which needs work as the SimplexTree class would hold two different c++ classes).
    • The persistence functions are commented or deleted.
    • Some simplextree functions are re-coded to be compatible with this new structure.
    • Some multiparameter specific function are added (e.g., switch from simplextree to simplextree multi, io)
  • simplex_tree_multi.pyd
    • Essentially a copy of simplex_tree.pxd, with more multi-specific functions
  • Simplex_tree_multi.h
    • Extension of the SimplexTree.h
  • Simplex_tree_interface_multi.h
    • Same format as Simplex_tree_interface.h. Stores the template of simplextree for python of SimplexTreeMulti
  • utils folder
    • C++ functions to help the multi-filtration c++ functions.

Copy link
Member

@mglisse mglisse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not small. Could you give more overview, to help review this? List the files modified explaining what happened to each. If a file is mostly a copy-paste of another file, say it and explain what differs and why we cannot easily share the code.
Random minor comments while trying to find where to start reading.

* Maximum number of simplices to compute persistence is <CODE>std::numeric_limits<std::uint32_t>::max()</CODE>
* (about 4 billions of simplices). */

using value_type = double;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this could be set in the SimplexTree Options.

src/python/include/Simplex_tree_multi.h Show resolved Hide resolved

using option_multi = Simplex_tree_options_multidimensional_filtration;
using option_std = Simplex_tree_options_full_featured;
bool operator<(const std::vector<value_type>& v1, const std::vector<value_type>& v2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't legal to redefine operator< for vector<double>. Maybe we could define a wrapper to vector<double> for the multi_filtrations, which could then have whatever semantic we want?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what illegal means here; but I can do a "multi_filtration" class which would make that legal.
That will also help for non-finitely-critical filtrations (which cannot be encoded in a vector).

bool operator<(const std::vector<value_type>& v1, const std::vector<value_type>& v2)
{
bool isSame = true;
if (v1.size() != v2.size()) isSame = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to allow comparing filtrations of different size? How is that defined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be useful when comparing a projection of a filtration onto a lower-dimensional multi-filtration, with another lower-dimensional filtration, without doing the projection, but that would be very niche, and wouldn't be implemented like that. We can remove this indeed.

Simplex_tree<option_std> &st = *(Gudhi::Simplex_tree<option_std>*)(splxptr);
Simplex_tree<option_multi> &st_multi = *(Gudhi::Simplex_tree<option_multi>*)(newsplxptr);;
if (dimension <= 0)
{std::cout << "Empty filtration\n"; return ;}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably cerr/clog. Is it a valid use case, in which case the message may be annoying, or an error, in which case we could throw an exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is an error. But we don't want to crash the C++ otherwise it will crash the python kernel. I will change that to a cerr.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throwing an exception does not crash anything (but you need to add except + in the .pyx so the exception is converted properly). Or you can return something special and check that in python. Or you can check the dimension in python before calling the C++ function. Or...


euler_chars_type out(point_list.size(), 0.);

auto is_greater = [nparameters](const point_type &a, const point_type &b){ //french greater
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"French" in the sense is_greater_equal?

return simplex_list;
}

euler_chars_type euler_char(const std::vector<std::vector<double>> &point_list){ // TODO multi-critical
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC it can be computed more efficiently when the points are on a grid, which seems like a common use case, to be discussed with Vadim & Olympio.

return number_of_parameters_;
}
private:
unsigned int number_of_parameters_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One extra integer is acceptable, but I wonder if we could make it exist or not depending on the template parameter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find anywhere else to put it. Its not possible on the SimplexTree options (static). I can put it in the python class, but that would imply passing this integer to c++ each time a c++ function is called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possibility would be [[no_unique_address]] std::conditional_t<Options::something, int, SomeEmptyType> number_of_parameters_; so that depending on the option, it would not take any space. Or have Simplex_tree derive from Maybe_multi_base<SimplexTreeOptions::plouf> which depending on plouf can be empty or contain an int (and provide 2 methods).
Again, that's not fundamental, I just wanted to mention this possibility.

Simplex_tree_node_explicit_storage(Siblings * sib = nullptr,
Filtration_value filtration = 0)
Simplex_tree_node_explicit_storage(Siblings * sib,
Filtration_value filtration) //MULTIPERS : init to 0 not possible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would Filtration_value filtration = {} work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it will compile with the standard simplextree (Filtration_value is a double)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double d = {}; compiles just fine.

cimport cython
from gudhi import SimplexTree ## Small hack for typing

from filtration_domination import remove_strongly_filtration_dominated, remove_filtration_dominated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dependency needs to be documented. Does the whole file need to depend on it, or could we make it so only one function depends on it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dependence is only needed in the collapse_edge method. I can put this dependence in this method indeed

@MathieuCarriere MathieuCarriere self-assigned this Feb 15, 2023
@DavidLapous
Copy link
Member Author

I also have a question :
For multi-persistence it is often useful (e.g. to compute the eulerchar, the rank invariant) to project the filtration values on a grid. Thus I am sometime playing with multi-simplextrees with filtration values that are coordinates in this grid (that is what the grid_squeeze method does, when coordinate_values = True).

Would you think another class for "coordinate simplextrees" would be useful ? i.e. Filtration_value = vector<unsigned int>

I have specific functions for these kinds of simplextrees (with the coordinate stored in doubles)

@DavidLapous DavidLapous marked this pull request as draft February 15, 2023 17:06
@mglisse
Copy link
Member

mglisse commented Feb 15, 2023

I also have a question : For multi-persistence it is often useful (e.g. to compute the eulerchar, the rank invariant) to project the filtration values on a grid. Thus I am sometime playing with multi-simplextrees with filtration values that are coordinates in this grid (that is what the grid_squeeze method does, when coordinate_values = True).

Would you think another class for "coordinate simplextrees" would be useful ? i.e. Filtration_value = vector<unsigned int>

I have specific functions for these kinds of simplextrees (with the coordinate stored in doubles)

I assume the question is mostly for Python, since in C++ using vector<int> should already work with your branch? We need to understand what the benefits of this integer version would be, so we can balance them against the drawback of yet another type. 53 bits (in a double) is enough to store rather large integers. Is the main advantage to make things cleaner, because some operations only make sense with integers and others only with floats, and this way we could make only the meaningful ones available?

strong:bool
Whether to use strong collapses or collapses (slower, but may remove more edges)
full:bool
Collapses the maximum number of edges if true, i.e., will do at most 100 strong collapses and 100 non-strong collapses afterward.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what that means.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full flag is meant to do all possible collapses, i.e., doing all possible strong collapses, and then all "usual" collapses (definitions here, definition 2.1). As multiparameter persistence algorithms tend to be very slow, doing all possible collapses can be worth it.
Another reason is that calling this method calls the method get_edge_list, and if the user wants to do strong and weak collapses, get_edge_list will get called only once.

Comment on lines 666 to 667
print(f"The file {path} already exists. Use the `overwrite` flag if you want to overwrite.")
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raising an exception is a nice way to signal a problem in Python.



// ######################## MULTIPERS STUFF
void reset_keys(){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"reset" sounds like it is going to give all of them their default value, 0. Maybe use a name that reflects the fact that you enumerate/number the simplices?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is related to a gudhi bug (if it isn't already fixed); the keys of the simplicies of a python simplextree are all 0, so I have to re-initialize them using this function.
I didn't find a nice name, but set_keys_to_enumerate be OK ?
I will remove this functions once this "gudhi python bug" is fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not a bug. The name "key" is misleading, it is arbitrary data. The persistence algorithm does start by assigning an index as key, but then changes it during its execution.
Yes, set_keys_to_enumerate is much clearer, thanks.

simplex_list.push_back(simplex);
}
}
simplex_list.shrink_to_fit();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the point of shrinking here? This will be copied to a Python object and destroyed quickly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I supposed that the copy to python was maybe smaller this way. If it's not necessary, I can remove that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, when converting a std::vector to python, cython creates a list with PyList_New(<Py_ssize_t> v.size()), it never even looks at the capacity of the vector.

Comment on lines 312 to 314
std::vector<double> new_filtration_value = Base::filtration(SimplexHandle);
new_filtration_value.resize(num);
Base::assign_filtration(SimplexHandle, new_filtration_value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it would be useful to be able to get a reference to the filtration, so we could modify it in place instead of copying twice. The interface was designed with double in mind, not vector...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I didn't wanted to change already existing files too much, and keep it "simple", so I preferred to do this copy for now. I will have to change that at some point

void expansion(int max_dim) nogil except +
void remove_maximal_simplex(simplex_type simplex) nogil
bool prune_above_filtration(filtration_type filtration) nogil
# bool make_filtration_non_decreasing() nogil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised how few modifications there are in Simplex_tree.h. Is make_filtration_non_decreasing almost the only function that does not do what you want? Does expansion compute the max of filtration vectors properly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all function that are not related to either persistence, or relying on filtration comparison, usually work without any issue. I'm not sure the prune_above_filtration works properly. The expansion works for the simplex part, but it calls the make_filtration_non_decreasing afterwards, that is rewritten in the simplex_tree_multi.pyx.
The code for make_filtration_non_decreasing is in python for the moment, so it's super slow.

}

euler_chars_type euler_char(const std::vector<std::vector<double>> &point_list){ // TODO multi-critical
const unsigned int npts = point_list.size();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(picking one place randomly)
Using an unsigned type to encode the fact that a number cannot be negative is not super popular these days, except to match an existing interface. For npts it would be natural to use std::size_t, what size() returns. For nparameters which is known to be small, either size_t because that's what size returns, or int which is the default type for small integers. unsigned int tends to be only used when doing bit operations, both because its arithmetic is more error-prone than the corresponding signed type, and because it is often harder to optimize for compilers. Although if you like unsigned int we can live with it, we already have quite a few in Gudhi, I just wanted to make sure you were aware of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will change that to an int; with all of the other occurrences of unsigned int, as I'm not doing bit operations.

@mglisse
Copy link
Member

mglisse commented Feb 15, 2023

For multi-persistence it is often useful (e.g. to compute the eulerchar, the rank invariant) to project the filtration values on a grid. Thus I am sometime playing with multi-simplextrees with filtration values that are coordinates in this grid (that is what the grid_squeeze method does, when coordinate_values = True).

I don't know about the rank invariant, but for the Euler characteristic surface, I don't think you want to actually store the complex with rounded filtration values, you compute the rounded values on the fly and forget them. One algorithm I imagine is that you have a table of 0s matching the grid, then for each simplex you find the corresponding cell in the array (round the filtration values up) and add ±1 to this cell depending on the dimension, and finally you compute a cumulative sum in each dimension on the array.

simplex_list.push_back(simplex);
}
}
simplex_list.shrink_to_fit();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, when converting a std::vector to python, cython creates a list with PyList_New(<Py_ssize_t> v.size()), it never even looks at the capacity of the vector.

Comment on lines 694 to 695
def get_simplices_of_dimension(self, dim:int):
return self.get_ptr().get_simplices_of_dimension(dim)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to return a List[List[int]]. I think there are 2 nicer choices:

  • a generator, like get_simplices, so it does not use any memory
  • a numpy array, if we want to store it explicitly, since all the simplices have the same dimension

:param simplex: The N-simplex, represented by a list of vertex.
:type simplex: list of int
:returns: The simplicial complex filtration value.
:rtype: float
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, no, not a float, that's the whole point of this class 😜

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I have to update the doc 😉

Comment on lines 142 to 145
# filtration_vector = self.get_ptr().simplex_filtration(simplex)
# filtrations = np.array_split(filtration_vector, len(filtration_vector) // self.num_parameters)
# return filtrations.squeeze() # If 1 critical returns only 1 filtration
return self.get_ptr().simplex_filtration(simplex) # Same output type, better
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like reshape((-1, num_parameters)), it makes it clear that there are multiple critical points.
squeeze has the problem that you end up having to test the size of the output for every call. If the whole filtration is 1-critical, it may be convenient to avoid the extra dimension, but I don't think it is worth the resulting inconsistency.

src/python/gudhi/simplex_tree_multi.pyx Show resolved Hide resolved
Algorithmica, pages 1–22, 2014.

This class is a filtered, with keys, and non contiguous vertices version
of the simplex tree.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this doc gets compiled by sphinx currently, and it does not always reflect the multifiltration.

@DavidLapous
Copy link
Member Author

Thanks Marc for the review,

  • I've made a class for finitely critical filtrations, I had to do some "jardinage" because I couldn't get cython & c++ to cast a class to it's inherited class (multi_filtrations are vectors of double)
  • I removed the (mostly) unused filed in the utils folder and renamed it
  • and some cleanup to comply with your comments.

@DavidLapous
Copy link
Member Author

DavidLapous commented Sep 6, 2023

Changes since last update :

  • Merged upstream
  • changed Filtration_value io in simplextree to const Filtration_value&.
    • as far as I'm aware, const stuff& are dealt as const rvalues by the compiler so this shouldn't impact performance when Filtration_value=double (although I didn't test it).
    • This allows copying and modifying filtration values coordinates, without copy.
    • The default values is changed from 0 to a new static Filtration_value in the simplextree structure : inf_ = std::numeric_limits<Filtration_value>::infinity();. This default value can be discussed (and changed if Filtration_value doesn't have an infinity), but 0 isn't natural for me, aswell as the default behavior of insert, which changes the values of the faces of the simplex inserted.
  • edge collapse code is now in a separated python file. Performance-wise, this shouldn't matter much as the number of collapse is usually << 100.
  • The Simplex_tree_interface_multi is now a child of Simplex_tree_interface<options_multi> for easier maintentance. Now we only need to disable non-compatible functions and add the multiparameter-specific changes.
  • The multiparameter filtration values are passed as numpy views from the c++ simplextree to the python interface. This allows modifying filtration values coordinate-wise, without copies.
  • The simplextree coarsening method (simplextree.grid_squeeze) has more options. It basically calls the _reduce_grid method, which takes as an input filtration values (here the filtation values of the simplextree) and generate a filtration-grid depending on a strategy (e.g., regular, quantile, ...) given by the user, and then pushes the simplextree's filtration values to this grid.

A notebook showcasing the interface can be found here.

Also note that the gudhi 1-parameter simplextrees are retrieved from python using this kind of trick (Simplex_tree_multi.h, line 57)

template<class simplextreeinterface>
simplextreeinterface& get_simplextree_from_pointer(const uintptr_t splxptr){ //DANGER
	simplextreeinterface &st = *(simplextreeinterface*)(splxptr); 
	return st;
}

so if the python version wasn't compiled with the same version of the Simplex_tree.h file, one may encounter segfaults.

EDIT : notebook link update

@DavidLapous DavidLapous marked this pull request as ready for review September 6, 2023 09:58
@mglisse
Copy link
Member

mglisse commented Sep 12, 2023

  • The default values is changed from 0 to a new static Filtration_value in the simplextree structure : inf_ = std::numeric_limits<Filtration_value>::infinity();. This default value can be discussed (and changed if Filtration_value doesn't have an infinity), but 0 isn't natural for me, aswell as the default behavior of insert, which changes the values of the faces of the simplex inserted.

This kind of change should really be handled as a separate PR, otherwise it just makes everything harder to review.
Are those changes you need for the multi-parameter code, or just preferences? I haven't looked at the code, but we should be careful with changes that are likely to break users' code.

@DavidLapous
Copy link
Member Author

These changes should all be behind a if constexpr (Options::is_multiparameter) flag. I still need to fix this default value to use the const Filtration_value& to prevent copies; I can change this default value to {} and set the default value in python instead of c++ to have a more consistent behavior.

@DavidLapous
Copy link
Member Author

After re-reading the code in the PR, I noticed that my previous statements were false :

  • The default value is always {} and only the python interface has different values.
  • The static simplextree variable member inf_ is only meant to deal with the filtration which has to return inf on the null_simplex.

@DavidLapous
Copy link
Member Author

I had to add a static variable null_value to the dummy filtration, as filtration values are passed via reference in this pr.

I'm also not sure that the python interface will be able to run with python<3.10 and cython<3, as I'm using some python 3.10 features (e.g. function typing) and I had to rewrite some stuff for cython3. I guess that's why the python_without_cgal build fails.

@@ -31,5 +31,7 @@ struct SimplexTreeOptions {
static const bool link_nodes_by_label;
/// If true, Simplex_handle will not be invalidated after insertions or removals.
static const bool stable_simplex_handles;

static const bool is_multi_parameter;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of the attribute has to be described here like the others (the concepts are used for documentation).

Filtration_value filtration() const { return filt_; }
Filtration_simplex_base_real() : filt_{} {}
void assign_filtration(const Filtration_value& f) { filt_ = f; }
// Filtration_value filtration() const { return filt_; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be clearer if you remove all the unused code you commented (not only here, there is commented code a bit everywhere).

private:
Filtration_value filt_;
};
struct Filtration_simplex_base_dummy {
Filtration_simplex_base_dummy() {}
void assign_filtration(Filtration_value GUDHI_CHECK_code(f)) { GUDHI_CHECK(f == 0, "filtration value specified for a complex that does not store them"); }
Filtration_value filtration() const { return 0; }
const Filtration_value& filtration() const { return null_value; }
static constexpr Filtration_value null_value={};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would probably be more part of another PR, but now that we use C++ standard 17, we could stop calling filtration() with the dummy and use if constexpr instead. This makes conceptually more sense for me and it would allow to clean out the dummy completely (and therefore make it a real dummy). For this PR, it would mean that we don't need to add the null_value to get around the problem that dummy::filtration() is literally called and therefore has to return something.
@mglisse @VincentRouvreau, what do you think? Should I make an issue?

return inf_;
}
}
static Filtration_value& filtration_mutable(Simplex_handle sh){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we could not just keep the version with the non const, as the validity of the filtration in the simplex tree is already responsibility of the user. This would avoid having to find a different name when it does exactly the same thing than filtration(). Or could there be performance issues when using non const references for numerical values (i.e., when the simplex tree is not multi)?
If it is the case, an ugly way to solve this would be to return a std::conditional to switch between Filtration_value& and Filtration_value depending on the value of is_multi_parameter.

if (filtration(simplex_one) > filt) {
assign_filtration(simplex_one, filt);
if (!(filtration(simplex_one) <= filt)) { // TODO : For multipersistence, it's not clear what should be the default, especially for multicritical filtrations
if constexpr (SimplexTreeOptions::is_multi_parameter){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section seems to be work in progress?

Copy link
Member Author

@DavidLapous DavidLapous Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, as multiparameter filtrations have to be done carefully, I don't want the insertion to change the filtration values of the other simplices. We could define a behavior based on if the filtration value is comparable to the previous filtration value and create a multicritical filtration / change the values of the other simplices (that's what the if is doing), but I'm not sure that someone would want to deal with this complicated behavior; doing this by hand seems fine. I thus assume here that the user is smart. I will just remove these comments.

@@ -1603,12 +1633,22 @@ class Simplex_tree {
if (dim == 0) return;
// Find the maximum filtration value in the border
Boundary_simplex_range&& boundary = boundary_simplex_range(sh);
Boundary_simplex_iterator max_border = std::max_element(std::begin(boundary), std::end(boundary),
typename SimplexTreeOptions::Filtration_value max_filt_border_value;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typename SimplexTreeOptions::Filtration_value can simply be written as Filtration_value. Again at line 1639.

Boundary_simplex_iterator max_border = std::max_element(std::begin(boundary), std::end(boundary),
typename SimplexTreeOptions::Filtration_value max_filt_border_value;
if constexpr (SimplexTreeOptions::is_multi_parameter){
// in that case, we assume that Filtration_value has a `push_to` member to handle this.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super specific and should be mentioned in the documentation and the concepts. I would also move finitely_critical_filtrations.h to here (inside include/gudhi/Simplex_tree folder) and mention that it realizes the concept for the multi simplex tree filtration values to make it accessible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a possibility, but I'm only using python, and I will not maintain / deal with a c++ interface that I don't use.

Copy link
Collaborator

@hschreiber hschreiber Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not about making a C++ interface. Just move the file, include it in Simplex_tree.h and add something like:

struct Simplex_tree_options_multi_parameter {
  typedef linear_indexing_tag Indexing_tag;
  typedef int Vertex_handle;
  typedef Finitely_critical_multi_filtration<float> Filtration_value;
  typedef std::uint32_t Simplex_key;
  static const bool store_key = true;
  static const bool store_filtration = true;
  static const bool contiguous_vertices = false;
  static const bool link_nodes_by_label = false;
  static const bool stable_simplex_handles = false;
  static const bool is_multi_parameter = true;
};

at the end of the file with the other options instead of having it in /python/include/Simplex_tree_multi.h. That is the only change. Then for the documentation, in the concept of the options (concept/SimplexTreeOptions.h) you mention the need of push_to and that you can find an example of implementation as Finitely_critical_multi_filtration<T>.

The difference is that it will make the tree usable for C++ users without you having to change much or to maintain anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hschreiber IIUC you are suggesting defining this Simplex_tree_options_multi_parameter in (a file included by) Simplex_tree.h? We need to have a conversation about #905, but in my opinion we should define very few of those by default and Simplex_tree_options_fast_persistence may already have been a mistake. There is no particular reason why someone using multiparameters should particularly want float vs double for instance. Having it in python, as "what we use for SimplexTreeMulti", makes sense to me. On the other hand, maybe more functions could be implemented more generically, to work with any option that has is_multi_parameter==true, and those could end up in the C++ part, although that's more work.

Copy link
Collaborator

@hschreiber hschreiber Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mglisse Not really, it was just to illustrate how we would use it. If Filtration_value needs a special method like push_to, we cannot just let it hang there without an explanation for the user, in particular if code was already written. We can also put this just in the tests or in an example, but at least in the concept, the implementation of Finitely_critical_multi_filtration<T> should be mentioned and not hidden away in the python folder.

@@ -2239,6 +2279,18 @@ class Simplex_tree {
/** \brief Upper bound on the dimension of the simplicial complex.*/
int dimension_;
bool dimension_to_be_lowered_ = false;

//MULTIPERS STUFF
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be consistent with the option strategy of the simplex tree so far, this (except for the inf_) should be put in a mixin with a dummy like struct Filtration_simplex_base_real and struct Filtration_simplex_base_dummy.

@@ -2290,6 +2342,7 @@ struct Simplex_tree_options_full_featured {
static const bool contiguous_vertices = false;
static const bool link_nodes_by_label = false;
static const bool stable_simplex_handles = false;
static const bool is_multi_parameter = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it not be has_multi_parameter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either has_multiple_parameters or is_multi_parameter. I prefer the second one.

@@ -41,6 +41,7 @@ struct Simplex_tree_options_stable_simplex_handles {
static const bool contiguous_vertices = false;
static const bool link_nodes_by_label = true;
static const bool stable_simplex_handles = true;
static const bool is_multi_parameter = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to also add some tests with is_multi_parameter set to true.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants