-
Here's the code for the above plots:

from collections import Counter
from itertools import cycle

import matplotlib.pyplot as plt
import numpy as np  # needed for np.round and np.log below
from sklearn.metrics import auc

plt.style.use('seaborn-v0_8-bright')

def perfprof_plot(df, perf_measure):
    """Plot a performance profile: P[alg_perf >= x] per algorithm for the given metric."""
    lines = ["-", "--", "-.", ":"]
    linecycler = cycle(lines)
    plt.figure(figsize=(20, 10))
    # One row per algorithm, one column per dataset.
    tab = df.pivot(index="algorithm", columns="dataset", values=perf_measure)
    n_problems = len(tab.columns)
    for name, v in tab.iterrows():
        # Keep only non-negative scores and sort them to build the profile.
        v = v[v >= 0].sort_values()
        n_gt0 = v.shape[0]
        perf_x = [0]
        perf_y = [n_gt0 / n_problems]
        # Walk through the distinct values in increasing order; at each value k,
        # record the fraction of problems on which this algorithm scores at least k.
        for k, v1 in Counter(v).items():
            if k == 0:
                n_gt0 = n_gt0 - v1
                continue
            perf_x.append(k)
            perf_y.append(n_gt0 / n_problems)
            n_gt0 = n_gt0 - v1
        plt.plot(perf_x, perf_y, next(linecycler),
                 label=f'{name}\n(AUC = {np.round(auc(perf_x, perf_y), 2)})')
    plt.xlabel(perf_measure, fontsize=18)
    plt.ylabel("P[alg_perf >= x]", fontsize=18)
    plt.legend(loc='upper center', bbox_to_anchor=(0.5, 1.1),
               ncol=10, fancybox=True, shadow=True, prop={'size': 16})

df_plot["log_model_size"] = np.log(df_plot.model_size)
algs = ["AFP", "uDSR", "FFX", "GP-GOMEA", "Operon", "PS-Tree", "ITEA"]

perfprof_plot(df_plot, "r2_test")
plt.savefig("r2_perf_full.png")
perfprof_plot(df_plot[df_plot.algorithm.isin(algs)], "r2_test")
plt.savefig("r2_perf.png")
perfprof_plot(df_plot[df_plot.algorithm.isin(algs)], "log_model_size")
plt.savefig("size_perf.png")
-
Since we will likely end the discussion about datasets tomorrow, it is time to start discussing how to compare algorithms. One idea we can borrow from optimization competitions is to use a performance profile (or a CDF of the performance) to compare algorithms on a single metric:
We can read this plot as: "what is the fraction of problems in which algorithm A achieves an R^2 greater than x?" This plot also has the convenience of handling cases with R^2 < 0 better than the error bars.
The AUC can give us an aggregated value to rank algorithms.
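As a rough sketch, the aggregated ranking could reuse the same profile construction as in the code above and return one AUC per algorithm (the function name perfprof_auc and the df_plot layout are assumptions for illustration):

from collections import Counter

import pandas as pd
from sklearn.metrics import auc

def perfprof_auc(df, perf_measure):
    """Return algorithms ranked by the area under their performance profile."""
    tab = df.pivot(index="algorithm", columns="dataset", values=perf_measure)
    n_problems = len(tab.columns)
    scores = {}
    for name, v in tab.iterrows():
        v = v[v >= 0].sort_values()
        n_gt0 = v.shape[0]
        perf_x, perf_y = [0], [n_gt0 / n_problems]
        for k, v1 in Counter(v).items():
            if k == 0:
                n_gt0 -= v1
                continue
            perf_x.append(k)
            perf_y.append(n_gt0 / n_problems)
            n_gt0 -= v1
        scores[name] = auc(perf_x, perf_y)
    return pd.Series(scores).sort_values(ascending=False)

# Example: perfprof_auc(df_plot, "r2_test") gives the algorithms ranked by AUC.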
Besides these plots, we should still keep the Pareto front of a pair of metrics and the histogram of how many times an algorithm obtained a rank k or higher; a sketch of both follows below.
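A minimal sketch of those two complementary views, again assuming the df_plot layout above (the helper names rank_counts and pareto_front, and the use of per-algorithm medians, are illustrative choices, not part of the original analysis):

import pandas as pd

def rank_counts(df, perf_measure, k):
    """Count, per algorithm, on how many datasets it reaches rank k or better.

    Intended for a performance metric such as r2_test, where higher is better.
    """
    tab = df.pivot(index="dataset", columns="algorithm", values=perf_measure)
    ranks = tab.rank(axis=1, ascending=False)  # rank 1 = best score on that dataset
    return (ranks <= k).sum().sort_values(ascending=False)

def pareto_front(df, metrics=("r2_test", "model_size")):
    """Return the algorithms not dominated on (higher r2_test, lower model_size),
    using the median of each metric per algorithm."""
    agg = df.groupby("algorithm").agg({metrics[0]: "median", metrics[1]: "median"})
    keep = []
    for a, row in agg.iterrows():
        r2, size = row[metrics[0]], row[metrics[1]]
        # a is dominated if some algorithm is at least as good on both metrics
        # and strictly better on at least one.
        dominated = ((agg[metrics[0]] >= r2) & (agg[metrics[1]] <= size)
                     & ((agg[metrics[0]] > r2) | (agg[metrics[1]] < size))).any()
        if not dominated:
            keep.append(a)
    return agg.loc[keep]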