📝 🎨 documentation update
GiulioRossetti committed May 14, 2024
1 parent 9923f79 commit b8e2b0b
Showing 20 changed files with 217 additions and 128 deletions.
109 changes: 98 additions & 11 deletions docs/bibliography.rst

Large diffs are not rendered by default.

14 changes: 8 additions & 6 deletions docs/index.rst
Expand Up @@ -7,12 +7,12 @@
CDlib - Community Detection Library
===================================

``CDlib`` is a Python software package that allows to extract, compare and evaluate communities from complex networks.
``CDlib`` is a Python software package that allows extracting, comparing, and evaluating communities from complex networks.

The library provides a standardized input/output for several existing Community Detection algorithms.
The implementations of all CD algorithms are inherited from existing projects, each one of them acknowledged in the dedicated method reference page.
The library provides a standardized input/output for several Community Detection algorithms.
The implementations of all CD algorithms are inherited from existing projects, each acknowledged in the dedicated method reference page.

If you would like to test ``CDlib`` functionalities without installing it on your machine consider using the preconfigured Jupyter Hub instances offered by the H2020 `SoBigData++`_ research project.
If you want to test ``CDlib`` functionalities without installing it on your machine, consider using the preconfigured Jupyter Hub instances offered by the EU-funded `SoBigData`_ research infrastructure.

If you use ``CDlib`` in your research please cite the following paper:

Expand All @@ -36,6 +36,7 @@ CDlib Dev Team
`Letizia Milli`_ Community Models Integration
`Rémy Cazabet`_ Visualization
`Salvatore Citraro`_ Community Models Integration
`Andrea Failla`_ Community Models Integration
======================= ============================


Expand All @@ -50,10 +51,11 @@ CDlib Dev Team
bibliography.rst


.. _`Giulio Rossetti`: http://www.about.giuliorossetti.net
.. _`Giulio Rossetti`: http://giuliorossetti.github.io
.. _`Letizia Milli`: https://github.com/letiziam
.. _`Salvatore Citraro`: https://github.com/dsalvaz
.. _`Rémy Cazabet`: http://cazabetremy.fr
.. _`Andrea Failla`: http://andreafailla.github.io
.. _`Source`: https://github.com/GiulioRossetti/CDlib
.. _`Distribution`: https://pypi.python.org/pypi/CDlib
.. _`SoBigData++`: https://sobigdata.d4science.org/group/sobigdata-gateway/explore?siteId=20371853
.. _`SoBigData`: https://sobigdata.d4science.org/group/sobigdata-gateway/explore?siteId=20371853
21 changes: 10 additions & 11 deletions docs/installing.rst
Expand Up @@ -4,16 +4,16 @@ Installing CDlib

``CDlib`` *requires* python>=3.8.

To install the latest version of our library just download (or clone) the current project, open a terminal and run the following commands:
To install the latest version of our library, download (or clone) the current project, open a terminal, and run the following commands:

.. code-block:: python
pip install -r requirements.txt
pip install -r requirements_optional.txt # (Optional) this might not work in Windows systems due to C-based dependencies.
pip install -r requirements_optional.txt # (Optional) This might not work in Windows systems due to C-based dependencies.
pip install .
Alternatively use pip
Alternatively, use pip

.. code-block:: python
Expand Down Expand Up @@ -46,7 +46,7 @@ Optional Dependencies
PyPi package
^^^^^^^^^^^^

To simplify the installation process, the default installation does not include optional dependencies (e.g., ``graph-tool``). If you need them, you can install them manually or run the following command:
The default installation does not include optional dependencies (e.g., ``graph-tool``) to simplify the installation process. If you need them, you can install them manually or run the following command:

.. code-block:: python
Expand All @@ -70,34 +70,34 @@ This option will install all optional dependencies accessible with the flag C an
Advanced
^^^^^^^^

Due to some strict requirements, the installation of a subset of optional dependencies is left outside the previous procedures.
Due to strict requirements, installing a subset of optional dependencies is left outside the previous procedures.

----------
graph-tool
----------

``CDlib`` integrates the support for SBM models offered by ``graph-tool``.
To install it refer to the official `documentation <https://git.skewed.de/count0/graph-tool/wikis/installation-instructions>`_ and install the conda-forge version of the package (or the deb version if in a *nix system).
To install it, refer to the official `documentation <https://git.skewed.de/count0/graph-tool/wikis/installation-instructions>`_ and install the conda-forge version of the package (or the deb version if in a *nix system).
------
ASLPAw
------

Since its 2.1.0 release ``ASLPAw`` relies on ``gmpy2`` whose installation through pip is not easy to automatize due to some C dependencies.
To address such issue test the following recipe:
Since its 2.1.0 release, ``ASLPAw`` relies on ``gmpy2``, whose installation through pip is difficult to automate due to some C dependencies.
To address such an issue, test the following recipe:

.. code-block:: python
conda install gmpy2
pip install shuffle_graph>=2.1.0 similarity-index-of-label-graph>=2.0.1 ASLPAw>=2.1.0
In case ASLPAw installation fails, please refer to the official ``gmpy2`` `repository <https://gmpy2.readthedocs.io/en/latest/intro.html#installation>`_.
If ASLPAw installation fails, please refer to the official ``gmpy2`` `repository <https://gmpy2.readthedocs.io/en/latest/intro.html#installation>`_.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Optional Dependencies (Conda package)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``CDlib`` relies on a few packages not available through conda: to install them please use pip.
``CDlib`` relies on a few packages unavailable through conda: to install them, please use pip.

.. code-block:: python
Expand All @@ -109,4 +109,3 @@ Optional Dependencies (Conda package)
In case ASLPAw installation fails, please refer to the official ``gmpy2`` repository `repository <https://gmpy2.readthedocs.io/en/latest/intro.html#installation>`_.
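Whichever installation route is used, it can help to confirm which optional packages actually ended up in the environment. The following is a minimal sketch that relies only on the standard library (``importlib.metadata``, available from Python 3.8); the helper name ``installed_version`` and the list of package names checked are illustrative assumptions, not part of ``CDlib``.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names below are examples; adjust them to the dependencies you need.
for name in ("cdlib", "gmpy2", "ASLPAw"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```

A ``None`` result for an optional dependency simply means the algorithms that rely on it will not be usable until it is installed.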



10 changes: 7 additions & 3 deletions docs/overview.rst
Expand Up @@ -2,7 +2,7 @@
Overview
********

``cdlib`` is a powerful Python package that allows for the extraction, comparison, and evaluation of communities from complex networks.
``cdlib`` is a powerful Python package that allows for extracting, comparing, and evaluating communities from complex networks.

The potential audience for ``cdlib`` includes mathematicians, physicists, biologists, computer scientists, and social scientists.

Expand All @@ -24,8 +24,12 @@ We welcome contributions from the community.
EU H2020
--------

``CDlib`` is a result of an European H2020 project:
``CDlib`` is a result of a stream of European H2020 projects:

- SoBigData_ “Social Mining & Big Data Ecosystem”: under the scheme “INFRAIA-1-2014-2015: Research Infrastructures”, grant agreement #654024.
- "SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics" (http://www.sobigdata.eu);
- "SoBigData.it – Strengthening the Italian RI for Social Mining and Big Data Analytics" – Prot. IR0000013 – Avviso n. 3264 del 28/12/2021;
- FAIR: Future Artificial Intelligence Research. EU NextGenerationEU programme under the funding schemes PNRR-PE-AI.

.. _SoBigData: http://www.sobigdata.eu

.. _SoBigData: http://www.sobigdata.eu
6 changes: 3 additions & 3 deletions docs/reference/algorithms.rst
Expand Up @@ -9,7 +9,7 @@ To maintain the library organization as clean and resilient to changes as possib
1. Algorithms designed for static networks, and
2. Algorithms designed for dynamic networks.

Moreover, within each category, ``CDlib`` groups together approaches sharing the same set of high-level characteristics.
Moreover, within each category, ``CDlib`` groups together approaches sharing the same high-level characteristics.

In particular, static algorithms are organized into:

Expand Down Expand Up @@ -42,7 +42,7 @@ Ensemble Methods

``CDlib`` implements basilar ensemble facilities to simplify the design of complex analytical pipelines requiring the instantiation of several community discovery algorithms.

Learn how to (i) pool multiple algorithms on the same network, (ii) perform fitness-driven methods' parameter grid search, and (iii) combine the two in few lines of code.
Learn how to (i) pool multiple algorithms on the same network, (ii) perform fitness-driven methods' parameter grid search, and (iii) combine the two in a few lines of code.


.. toctree::
Expand All @@ -54,7 +54,7 @@ Learn how to (i) pool multiple algorithms on the same network, (ii) perform fitn
Summary
-------

If you need a summary on the available algorithms and their properties (accepted graph types, community characteristics, computational complexity) refer to:
If you need a summary of the available algorithms and their properties (accepted graph types, community characteristics, computational complexity), refer to:

.. toctree::
:maxdepth: 1
4 changes: 2 additions & 2 deletions docs/reference/benchmark.rst
Expand Up @@ -2,7 +2,7 @@
Synthetic Benchmarks
********************

Evaluating Community Detection algorithms on ground truth communities can be tricky when the annotation is based on external semantic information, not on topological ones.
Evaluating Community Detection algorithms on ground truth communities can be tricky when the annotation is based on external semantic information, not topological ones.

For this reason, ``cdlib`` integrates synthetic network generators with planted community structures.

Expand Down Expand Up @@ -42,7 +42,7 @@ Benchmarks for node-attributed static networks.
Dynamic Networks with Community Ground Truth
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Time evolving network topologies with planted community life-cycles.
Time-evolving network topologies with planted community life cycles.
All generators return a tuple: (``dynetx.DynGraph``, ``cdlib.TemporalClustering``)

.. autosummary::
8 changes: 4 additions & 4 deletions docs/reference/cd_algorithms/algorithms.rst
Expand Up @@ -2,17 +2,17 @@
Algorithms' Table
=================

In the following table you can find an up-to-date list of the Community Detection algorithms made available within ``cdlib``.
The following table shows an up-to-date list of the Community Detection algorithms made available within ``cdlib``.

Algorithms are listed in alphabetical order along with:

- a few additional information on the graph typologies they handle, and
- the main expected characteristics of the clustering they produce,
- (when available) the theoretical computational complexity as estimated by their authors.
- (when available) the theoretical computational complexity estimated by their authors.

All algorithms are assumed - apart few, reported, exceptions - to work on undirected and unweighted graphs.
Apart from a few reported exceptions, all algorithms are assumed to work on undirected and unweighted graphs.

**Complexity notation.** When discussing the time complexity the following notation is assumed:
**Complexity notation.** When discussing the time complexity, the following notation is assumed:

- *n*: number of nodes
- *m*: number of edges
2 changes: 1 addition & 1 deletion docs/reference/cd_algorithms/edge_clustering.rst
Expand Up @@ -2,7 +2,7 @@
Edge Clustering
===============

Algorithms falling in this category generates communities composed by edges.
Algorithms falling in this category generate communities composed of edges.
They return as result a ``EdgeClustering`` object instance.

.. note::
24 changes: 12 additions & 12 deletions docs/reference/cd_algorithms/node_clustering.rst
Expand Up @@ -6,8 +6,8 @@ Static Community Discovery
Node Clustering
---------------

Algorithms falling in this category generate communities composed by nodes.
The communities can represent neat, *crisp*, partition as well as *overlapping* or even *fuzzy* ones.
Algorithms falling in this category generate communities composed of nodes.
The communities can represent neat, *crisp*, partitions and *overlapping* or even *fuzzy* ones.

.. note::
The following lists are aligned to CD methods available in the *GitHub main branch* of `CDlib`_.
Expand All @@ -21,8 +21,8 @@ The communities can represent neat, *crisp*, partition as well as *overlapping*
Crisp Communities
^^^^^^^^^^^^^^^^^

A clustering is said to be a *partition* if each node belongs to one and only one community.
Methods in this subclass return as result a ``NodeClustering`` object instance.
A clustering is considered a *partition* if each node belongs to one and only one community.
Methods in this subclass return a ``NodeClustering`` object instance.


.. autosummary::
Expand Down Expand Up @@ -74,7 +74,7 @@ Overlapping Communities
^^^^^^^^^^^^^^^^^^^^^^^

A clustering is said to be *overlapping* if any generic node can be assigned to more than one community.
Methods in this subclass return as result a ``NodeClustering`` object instance.
Methods in this subclass return a ``NodeClustering`` object instance.
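The distinction between crisp and overlapping clusterings can be checked directly on a list of communities. The sketch below is plain Python and does not use the ``CDlib`` API; the function names ``is_crisp`` and ``is_overlapping`` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter


def membership_counts(communities):
    """Count, for every node, how many communities contain it."""
    return Counter(node for com in communities for node in set(com))


def is_crisp(communities, nodes):
    """True if every node in `nodes` belongs to exactly one community."""
    counts = membership_counts(communities)
    return all(counts[node] == 1 for node in nodes)


def is_overlapping(communities):
    """True if at least one node belongs to more than one community."""
    return any(count > 1 for count in membership_counts(communities).values())


nodes = [1, 2, 3]
partition = [[1, 2], [3]]   # each node appears exactly once
cover = [[1, 2], [2, 3]]    # node 2 is shared by two communities

print(is_crisp(partition, nodes), is_overlapping(partition))
print(is_crisp(cover, nodes), is_overlapping(cover))
```

Note that a clustering leaving some node unassigned is neither crisp (in the partition sense used here) nor necessarily overlapping, which is why the two checks are kept separate.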

.. autosummary::
:toctree: algs/
Expand Down Expand Up @@ -113,8 +113,8 @@ Methods in this subclass return as result a ``NodeClustering`` object instance.
Fuzzy Communities
^^^^^^^^^^^^^^^^^

A clustering is said to be a *fuzzy* if each node can belongs (with a different degree of likelihood) to more than one community.
Methods in this subclass return as result a ``FuzzyNodeClustering`` object instance.
A clustering is *fuzzy* if each node can belong (with a different degree of likelihood) to more than one community.
Methods in this subclass return a ``FuzzyNodeClustering`` object instance.

.. autosummary::
:toctree: algs/
Expand All @@ -127,7 +127,7 @@ Methods in this subclass return as result a ``FuzzyNodeClustering`` object insta
Node Attribute
^^^^^^^^^^^^^^

Methods in this subclass return as result a ``AttrNodeClustering`` object instance.
Methods in this subclass return an ``AttrNodeClustering`` object instance.

.. autosummary::
:toctree: algs/
Expand All @@ -140,7 +140,7 @@ Methods in this subclass return as result a ``AttrNodeClustering`` object instan
Bipartite Graph Communities
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Methods in this subclass return as result a ``BiNodeClustering`` object instance.
Methods in this subclass return a ``BiNodeClustering`` object instance.

.. autosummary::
:toctree: algs/
Expand All @@ -156,7 +156,7 @@ Methods in this subclass return as result a ``BiNodeClustering`` object instance
Antichain Communities
^^^^^^^^^^^^^^^^^^^^^

Methods in this subclass are designed to extract communities from Directed Acyclic Graphs (DAG) and return as result a ``NodeClustering`` object instance.
Methods in this subclass are designed to extract communities from Directed Acyclic Graphs (DAG) and return a ``NodeClustering`` object instance.

.. autosummary::
:toctree: algs/
Expand All @@ -168,8 +168,8 @@ Methods in this subclass are designed to extract communities from Directed Acycl
Edge Clustering
---------------

Algorithms falling in this category generates communities composed by edges.
They return as result a ``EdgeClustering`` object instance.
Algorithms falling in this category generate communities composed of edges.
They return an ``EdgeClustering`` object instance.

.. autosummary::
:toctree: algs/
21 changes: 10 additions & 11 deletions docs/reference/cd_algorithms/temporal_clustering.rst
Expand Up @@ -2,7 +2,7 @@
Dynamic Community Discovery
===========================

Algorithms falling in this category generates communities that evolve as time goes by.
Algorithms falling in this category generate communities that evolve as time goes by.


.. automodule:: cdlib.algorithms
Expand All @@ -11,16 +11,16 @@ Algorithms falling in this category generates communities that evolve as time go
Instant Optimal
^^^^^^^^^^^^^^^

This first class of approaches is derived directly from the application of static community discovery methods to the dynamic case.
A succession of steps is used to model network evolution, and for each of them is identified an optimal partition.
This first class of approaches is derived directly from applying static community discovery methods to the dynamic case.
A succession of steps is used to model network evolution, and an optimal partition is identified for each.
Dynamic communities are defined from these optimal partitions by specifying relations that connect topologies found in different, possibly consecutive, instants.

``cdlib`` implements a templating approach to transform every static community discovery algorithm in a dynamic one following a standard *Two-Stage* approach:
``cdlib`` implements a templating approach to transform every static community discovery algorithm into a dynamic one following a standard *Two-Stage* approach:

- Identify: detect static communities on each step of evolution;
- Match: align the communities found at step t with the ones found at step t − 1, for each step.
- Match: align the communities found at step t with those found at step t − 1, for each step.

Here's an example of a two-step built on top of Louvain partitions of a dynamic snapshot-sequence graph (where each snapshot is an LFR synthetic graph).
Here is an example of a two-step built on top of Louvain partitions of a dynamic snapshot-sequence graph (where each snapshot is an LFR synthetic graph).

.. code-block:: python
Expand All @@ -34,28 +34,27 @@ Here's an example of a two-step built on top of Louvain partitions of a dynamic
coms = algorithms.louvain(g) # here any CDlib algorithm can be applied
tc.add_clustering(coms, t)
For what concerns the second stage (snapshots' node clustering matching) it is possible to parametrize the set similarity function as follows (example made with a standard Jaccard similarity):
For what concerns the second stage (snapshots' node clustering matching), it is possible to parametrize the set similarity function as follows (example made with a standard Jaccard similarity):

.. code-block:: python
jaccard = lambda x, y: len(set(x) & set(y)) / len(set(x) | set(y))
matches = tc.community_matching(jaccard, two_sided=True)
For all details on the available methods to extract and manipulate dynamic communities please refer to the ``TemporalClustering`` documentation.
For all details on the available methods to extract and manipulate dynamic communities, please refer to the ``TemporalClustering`` documentation.
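To make the Match stage more concrete, here is a minimal, library-free sketch of how communities from two consecutive snapshots could be aligned with a Jaccard similarity. It is an illustration under simplifying assumptions (one best match per community, ties broken by the lowest index) and is not the implementation behind ``community_matching``; the helper name ``best_matches`` is hypothetical.

```python
def jaccard(x, y):
    """Jaccard similarity between two node collections (0.0 if both are empty)."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def best_matches(previous, current, similarity=jaccard):
    """For each community at step t-1, find its most similar community at step t.

    Returns a list of (index_at_t_minus_1, index_at_t, score) tuples;
    communities with no overlap at all are left unmatched.
    """
    matches = []
    for i, prev_com in enumerate(previous):
        scored = [(similarity(prev_com, cur_com), j) for j, cur_com in enumerate(current)]
        if not scored:
            continue
        # Highest score wins; on ties prefer the lowest community index.
        score, j = max(scored, key=lambda s: (s[0], -s[1]))
        if score > 0:
            matches.append((i, j, score))
    return matches


previous = [[1, 2, 3, 4], [5, 6, 7]]        # communities at step t-1
current = [[5, 6, 7, 8], [1, 2, 3], [9, 10]]  # communities at step t
matches = best_matches(previous, current)
print(matches)
```

Any other set similarity with the same two-argument signature can be passed through the ``similarity`` parameter, mirroring how the matching function is parametrized in the snippet above.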

^^^^^^^^^^^^^^^^^^
Temporal Trade-Off
^^^^^^^^^^^^^^^^^^

Algorithms belonging to the Temporal Trade-off class process iteratively the evolution of the network.
Moreover, unlike Instant optimal approaches, they take into account the network and the communities found in the previous step – or n-previous steps – to identify communities in the current one.
Moreover, unlike Instant optimal approaches, they consider the network and the communities found in the previous step – or n-previous steps – to identify communities in the current one.
Dynamic Community Discovery algorithms falling into this category can be described by an iterative process:

- Initialization: find communities for the initial state of the network;
- Update: for each incoming step, find communities at step t using graph at t and past information.
- Update: find communities at step t using graph at t and past information for each incoming step.
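The Initialization/Update loop can be illustrated with a deliberately simple toy, written in plain Python. It is not TILES nor any algorithm shipped with ``cdlib``: the initial communities are just connected components, and the update step keeps previous labels and assigns each new node the most common label among its already-labelled neighbours. All function names are hypothetical, and nodes are assumed to be sortable (e.g., integers).

```python
from collections import Counter


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def initialize(edges):
    """Initialization: label each connected component of the first snapshot."""
    adj = build_adjacency(edges)
    labels, next_label = {}, 0
    for start in sorted(adj):
        if start in labels:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in adj[node]:
                if nb not in labels:
                    labels[nb] = next_label
                    stack.append(nb)
        next_label += 1
    return labels


def update(prev_labels, edges):
    """Update: label the snapshot at step t reusing the labels found at step t-1."""
    adj = build_adjacency(edges)
    # Past information: surviving nodes keep their previous community.
    labels = {n: prev_labels[n] for n in adj if n in prev_labels}
    next_label = max(prev_labels.values(), default=-1) + 1
    for node in sorted(adj):
        if node in labels:
            continue
        votes = Counter(labels[nb] for nb in adj[node] if nb in labels)
        if votes:
            # Most frequent neighbour label; ties go to the smallest label.
            labels[node] = min(votes, key=lambda lab: (-votes[lab], lab))
        else:
            labels[node] = next_label
            next_label += 1
    return labels


snapshots = [
    [(1, 2), (2, 3), (4, 5)],
    [(1, 2), (2, 3), (3, 6), (4, 5), (7, 8)],
]
labels = initialize(snapshots[0])
history = [labels]
for edges in snapshots[1:]:
    labels = update(labels, edges)
    history.append(labels)
print(history[-1])
```

The point of the sketch is the control flow: unlike the Instant Optimal scheme, the clustering at step t is never computed from scratch but always derived from the one at step t − 1.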

.. autosummary::
:toctree: algs/

tiles
