Archipelago - dict transformer for vectorizing persistence diagrams #1017
Closed
29 commits
de408e9
add archipelago class
927c789
give default quantiser sklearn.KMeans to Atol method
3e95419
homology_dimensions and settlers for Archipelago class
3bc232a
n_init parameter for sklearn KMeans
431a692
archipelago island
aaef59f
mature version compatible with gudhi.representations.vector_methods
adc8668
Atol try/catch
4685748
fix docstrings
2cf60e4
typo
6524320
Merge branch 'GUDHI:master' into archipelago
martinroyer 5561d09
docstring correct
5367ea1
refactor removing input preprocessing, instead we take raw dgm format…
95bd156
prints
b4de687
Revert try/catch optimizer fit in Atol
martinroyer 89be488
fix set_output from sklearn so as to return pandas without importing …
f5dc92d
default KMeans parameter
6bfb164
change confusing Atol __call__ function
23a5e47
define get_feature_names_out for Atol
a59af1b
test fixes
9fadd61
hopefully fix atol test following `n_init="auto"` in KMeans
0b1c6b9
revert value changes to doc
aa0d3cb
updated docstring
4afb0ef
tentative change n_init value for test compatibility 3.7
2df14d9
remove try except
f01ac19
call vectorizer get_geature_names_out if exists
c43b190
more sklearn logic
b74dcae
atol fixes:
26eef83
add test for representations interface fit/transform/...
5d8af99
remove archipelago
@@ -118,23 +118,23 @@ def test_atol_doc():
     b = np.array([[4, 2, 0], [4, 4, 0], [4, 0, 2]])
     c = np.array([[3, 2, -1], [1, 2, -1]])

-    atol_vectoriser = Atol(quantiser=KMeans(n_clusters=2, random_state=202006))
+    atol_vectoriser = Atol(quantiser=KMeans(n_clusters=2, random_state=202006, n_init=10))
     # Atol will do
     # X = np.concatenate([a,b,c])
-    # kmeans = KMeans(n_clusters=2, random_state=202006).fit(X)
+    # kmeans = KMeans(n_clusters=2, random_state=202006, n_init=10).fit(X)
     # kmeans.labels_ will be : array([1, 0, 1, 0, 0, 1, 0, 0])
     first_cluster = np.asarray([a[0], a[2], b[2]])
-    second_cluster = np.asarray([a[1], b[0], b[2], c[0], c[1]])
+    second_cluster = np.asarray([a[1], b[0], b[1], c[0], c[1]])

     # Check the center of the first_cluster and second_cluster are in Atol centers
     centers = atol_vectoriser.fit(X=[a, b, c]).centers
     np.isclose(centers, first_cluster.mean(axis=0)).all(1).any()
     np.isclose(centers, second_cluster.mean(axis=0)).all(1).any()

     vectorization = atol_vectoriser.transform(X=[a, b, c])
-    assert np.allclose(vectorization[0], atol_vectoriser(a))
-    assert np.allclose(vectorization[1], atol_vectoriser(b))
-    assert np.allclose(vectorization[2], atol_vectoriser(c))
+    assert np.allclose(vectorization[0], atol_vectoriser._transform(a))
+    assert np.allclose(vectorization[1], atol_vectoriser._transform(b))
+    assert np.allclose(vectorization[2], atol_vectoriser._transform(c))


 def test_dummy_atol():

Review comment on lines -121 to +127: @VincentRouvreau I believe this is what you need part2
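
The `second_cluster` correction in `test_atol_doc` can be checked against the labels quoted in the test comment. The following is a minimal NumPy-only sketch, not part of the PR: the array `a` is not visible in this hunk, so its values are an assumption, and `centers` is a hypothetical stand-in for the fitted `Atol` centers. The sketch also wraps the `np.isclose(...)` membership check in `assert`, which the context lines of the hunk do not.

```python
import numpy as np

# `a` is assumed (it is defined above the visible hunk); `b` and `c` are
# copied from the diff.
a = np.array([[1, 2, 4], [1, 4, 0], [1, 0, 4]])
b = np.array([[4, 2, 0], [4, 4, 0], [4, 0, 2]])
c = np.array([[3, 2, -1], [1, 2, -1]])

X = np.concatenate([a, b, c])

# Labels as quoted in the test comment, one per row of X.
labels = np.array([1, 0, 1, 0, 0, 1, 0, 0])

# Split the points by label instead of listing them by hand.
first_cluster = X[labels == 1]
second_cluster = X[labels == 0]

# The hand-written clusters from the updated test (b[1], not b[2], in the second).
expected_first = np.asarray([a[0], a[2], b[2]])
expected_second = np.asarray([a[1], b[0], b[1], c[0], c[1]])

# Hypothetical stand-in for `atol_vectoriser.fit(...).centers`.
centers = np.stack([second_cluster.mean(axis=0), first_cluster.mean(axis=0)])

# Same membership expression as in the test, but actually asserted.
assert np.isclose(centers, first_cluster.mean(axis=0)).all(1).any()
assert np.isclose(centers, second_cluster.mean(axis=0)).all(1).any()
```

With the quoted labels, `b[2]` belongs only to the first cluster, so the old `second_cluster` listed it twice across the two clusters and left `b[1]` unassigned.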
@@ -145,12 +145,12 @@ def test_dummy_atol():
     for weighting_method in ["cloud", "iidproba"]:
         for contrast in ["gaussian", "laplacian", "indicator"]:
             atol_vectoriser = Atol(
-                quantiser=KMeans(n_clusters=1, random_state=202006),
+                quantiser=KMeans(n_clusters=1, random_state=202006, n_init=10),
                 weighting_method=weighting_method,
                 contrast=contrast,
             )
             atol_vectoriser.fit([a, b, c])
-            atol_vectoriser(a)
+            atol_vectoriser._transform(a)
             atol_vectoriser.transform(X=[a, b, c])

Review comment on the `quantiser=` line: @VincentRouvreau I believe this is what you need part3/3
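
Both hunks pass `n_init=10` to `KMeans` explicitly. As far as I know, scikit-learn added the `n_init="auto"` option in version 1.2 and made it the default in 1.4, so an integer value is the form that older and newer releases both accept. The sketch below is an illustration rather than code from the PR; the array `a` is again an assumption, since it is not shown in the diff.

```python
import numpy as np
from sklearn.cluster import KMeans

# `a` is assumed (not shown in the diff); `b` and `c` are copied from it.
a = np.array([[1, 2, 4], [1, 4, 0], [1, 0, 4]])
b = np.array([[4, 2, 0], [4, 4, 0], [4, 0, 2]])
c = np.array([[3, 2, -1], [1, 2, -1]])
X = np.concatenate([a, b, c])

# An explicit integer n_init fixes the number of random restarts, so the
# fit does not depend on which default the installed release uses.
kmeans = KMeans(n_clusters=2, random_state=202006, n_init=10).fit(X)

centers = kmeans.cluster_centers_  # one row per cluster
labels = kmeans.labels_            # one label per row of X
```

The exact label order can differ between releases, so tests are more robust when they compare cluster centers rather than raw label values.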
@@ -0,0 +1,87 @@
+from copy import deepcopy
+import numpy as np
+
+from sklearn.cluster import KMeans
+
+from gudhi.representations import (Atol, Landscape, Silhouette, BettiCurve, ComplexPolynomial, \
+                                   TopologicalVector, PersistenceImage, Entropy)
+
+vectorizers = {
+    "atol": Atol(quantiser=KMeans(n_clusters=2, random_state=202312, n_init="auto")),
+    # "betti": BettiCurve(),
+}
+
+diag1 = [np.array([[0., np.inf],
+                   [0., 8.94427191],
+                   [0., 7.28010989],
+                   [0., 6.08276253],
+                   [0., 5.83095189],
+                   [0., 5.38516481],
+                   [0., 5.]]),
+         np.array([[11., np.inf],
+                   [6.32455532, 6.70820393]]),
+         np.empty(shape=[0, 2])]
+
+diag2 = [np.array([[0., np.inf],
+                   [0., 8.94427191],
+                   [0., 7.28010989],
+                   [0., 6.08276253],
+                   [0., 5.83095189],
+                   [0., 5.38516481],
+                   [0., 5.]]),
+         np.array([[11., np.inf],
+                   [6.32455532, 6.70820393]]),
+         np.array([[0., np.inf],
+                   [0., 1]])]
+
+diag3 = [np.empty(shape=[0, 2])]
+
+
+def test_fit():
+    print(f" > Testing `fit`.")
+    for name, vectorizer in vectorizers.items():
+        print(f" >> Testing {name}")
+        deepcopy(vectorizer).fit(X=[diag1[0], diag2[0]])
+
+
+def test_fit_empty():
+    print(f" > Testing `fit_empty`.")
+    for name, vectorizer in vectorizers.items():
+        print(f" >> Testing {name}")
+        deepcopy(vectorizer).fit(X=[diag3[0], diag3[0]])
+
+
+def test_transform():
+    print(f" > Testing `transform`.")
+    for name, vectorizer in vectorizers.items():
+        print(f" >> Testing {name}")
+        deepcopy(vectorizer).fit_transform(X=[diag1[0], diag2[0], diag3[0]])
+
+
+def test_transform_empty():
+    print(f" > Testing `transform_empty`.")
+    for name, vectorizer in vectorizers.items():
+        print(f" >> Testing {name}")
+        copy_vec = deepcopy(vectorizer).fit(X=[diag1[0], diag2[0]])
+        copy_vec.transform(X=[diag3[0], diag3[0]])
+
+
+def test_set_output():
+    print(f" > Testing `set_output`.")
+    import pandas as pd
+    for name, vectorizer in vectorizers.items():
+        print(f" >> Testing {name}")
+        deepcopy(vectorizer).set_output(transform="pandas")
+
+
+def test_compose():
+    print(f" > Testing composition with `sklearn.compose.ColumnTransformer`.")
+    from sklearn.compose import ColumnTransformer
+    for name, vectorizer in vectorizers.items():
+        print(f" >> Testing {name}")
+        ct = ColumnTransformer([
+            (f"{name}-0", deepcopy(vectorizer), 0),
+            (f"{name}-1", deepcopy(vectorizer), 1),
+            (f"{name}-2", deepcopy(vectorizer), 2)]
+        )
+        ct.fit_transform(X=[diag1, diag2])
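
The interface tests above exercise `fit` and `transform` on diagrams that contain infinite death values or no points at all. The sketch below shows one way a vectorizer could tolerate both cases. `ToyCentroidVectorizer` is a hypothetical NumPy-only stand-in written for illustration; it is not gudhi's `Atol`, and its quantisation step is deliberately crude.

```python
import numpy as np


class ToyCentroidVectorizer:
    """Hypothetical stand-in for a diagram vectorizer (not gudhi's Atol)."""

    def __init__(self, n_centers=2):
        self.n_centers = n_centers

    @staticmethod
    def _finite(diagram):
        # Drop points with an infinite coordinate; (0, 2) arrays pass through.
        diagram = np.asarray(diagram, dtype=float)
        return diagram[np.isfinite(diagram).all(axis=1)]

    def fit(self, X, y=None):
        points = np.concatenate([self._finite(d) for d in X])
        if len(points) == 0:
            # Nothing to learn from: fall back to centers at the origin.
            self.centers_ = np.zeros((self.n_centers, 2))
        else:
            # Crude quantisation: evenly spaced picks along the sorted deaths.
            order = np.argsort(points[:, 1])
            idx = np.linspace(0, len(points) - 1, self.n_centers).astype(int)
            self.centers_ = points[order[idx]]
        return self

    def _transform(self, diagram):
        # Vectorize a single diagram.
        points = self._finite(diagram)
        diffs = points[:, None, :] - self.centers_[None, :, :]
        dists = np.linalg.norm(diffs, axis=2)
        # Gaussian contrast summed over points; an empty diagram sums to zeros.
        return np.exp(-dists ** 2).sum(axis=0)

    def transform(self, X):
        # Vectorize a list of diagrams, one output row per diagram.
        return np.stack([self._transform(d) for d in X])


diag = np.array([[0., np.inf], [0., 8.94427191], [0., 5.]])
empty = np.empty(shape=[0, 2])

vec = ToyCentroidVectorizer(n_centers=2).fit([diag, empty])
out = vec.transform([diag, empty])

# Fitting on empty diagrams only should not raise either.
only_empty = ToyCentroidVectorizer(n_centers=2).fit([empty, empty])
```

Keeping a per-diagram `_transform` next to the list-based `transform` mirrors the split the updated tests rely on when they compare `vectorization[i]` with `atol_vectoriser._transform(...)`.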
Review comment: @VincentRouvreau I believe this is what you need part1