nx-parallel provides flexible parallel computing capabilities, allowing you to control settings like backend, n_jobs, verbose, and more. This can be done through two configuration systems: joblib and NetworkX. This guide explains how to configure nx-parallel using both systems.
nx-parallel relies on joblib.Parallel for parallel computing. You can adjust its settings through the joblib.parallel_config class provided by joblib. For more details, check out the official joblib documentation.
import networkx as nx
from joblib import parallel_config

# Setting global configs
parallel_config(n_jobs=3, verbose=50)
nx.square_clustering(H)

# Setting configs in a context
with parallel_config(n_jobs=7, verbose=0):
    nx.square_clustering(H)
Please refer to the official joblib documentation to better understand the config parameters.
Note: Ensure that nx.config.backends.parallel.active = False when using joblib for configuration, as NetworkX configurations will override joblib.parallel_config settings if active is True.
To use NetworkX’s configuration system in nx-parallel, you must set the active flag (in nx.config.backends.parallel) to True.
When you import NetworkX, it automatically sets default configurations for all installed backends, including nx-parallel.
import networkx as nx
print(nx.config)
Output:
NetworkXConfig(
    backend_priority=[],
    backends=Config(
        parallel=ParallelConfig(
            active=False,
            backend="loky",
            n_jobs=None,
            verbose=0,
            temp_folder=None,
            max_nbytes="1M",
            mmap_mode="r",
            prefer=None,
            require=None,
            inner_max_num_threads=None,
            backend_params={},
        )
    ),
    cache_converted_graphs=True,
)
As the output above shows, active is set to False by default. To enable NetworkX configurations for nx-parallel, set active to True. Please refer to NetworkX's official backend and config docs for more on the NetworkX configuration system.
# Enabling NetworkX's config for nx-parallel
nx.config.backends.parallel.active = True

# Setting global configs
nxp_config = nx.config.backends.parallel
nxp_config.n_jobs = 3
nxp_config.verbose = 50
nx.square_clustering(H)

# Setting config in a context
with nxp_config(n_jobs=7, verbose=0):
    nx.square_clustering(H)
The configuration parameters are the same as those of joblib.parallel_config, so you can refer to the official joblib documentation to better understand them.
In nx-parallel, there's a _configure_if_nx_active decorator applied to all algorithms. This decorator checks the value of active (in nx.config.backends.parallel) and then uses the appropriate configuration system (joblib or NetworkX) accordingly. If active=True, it extracts the configs from nx.config.backends.parallel, passes them to a joblib.parallel_config context manager, and calls the function within this context. Otherwise, it simply calls the function.
You can use both NetworkX’s configuration system and joblib.parallel_config together in nx-parallel. However, it’s important to understand their interaction.
Example:
from math import sqrt

import joblib
import networkx as nx

# Enable NetworkX configuration
nx.config.backends.parallel.active = True
nx.config.backends.parallel.n_jobs = 6

# Global Joblib configuration
joblib.parallel_config(backend="threading")

with joblib.parallel_config(n_jobs=4, verbose=55):
    # NetworkX config for nx-parallel
    # backend="loky", n_jobs=6, verbose=0
    nx.square_clustering(G, backend="parallel")

    # Joblib config for other parallel tasks
    # backend="threading", n_jobs=4, verbose=55
    joblib.Parallel()(joblib.delayed(sqrt)(i**2) for i in range(10))
- NetworkX Configurations for nx-parallel: When calling functions within nx-parallel, NetworkX’s configurations will override those specified by Joblib. For example, the nx.square_clustering function will use the n_jobs=6 setting from nx.config.backends.parallel, regardless of any Joblib settings within the same context.
- Joblib Configurations for Other Code: For any other parallel code outside of nx-parallel, such as a direct call to joblib.Parallel, the configurations specified within the Joblib context will be applied.
This behavior ensures that nx-parallel functions consistently use NetworkX’s settings when enabled, while still allowing Joblib configurations to apply to non-NetworkX parallel tasks.
Key Takeaway: When both systems are used together, NetworkX's configuration (nx.config.backends.parallel) takes precedence for nx-parallel functions. To avoid unexpected behavior, ensure that the active setting aligns with your intended configuration system.
- Parameter Handling: The main difference is how backend_params are passed. Since NetworkX configurations are stored as a @dataclass, backend_params must be passed as a dictionary, whereas in joblib.parallel_config you can pass them along with the other configurations, as shown below:

      nx.config.backends.parallel.backend_params = {"max_nbytes": None}
      joblib.parallel_config(backend="loky", max_nbytes=None)

- Default Behavior: By default, nx-parallel looks for configs in joblib.parallel_config unless nx.config.backends.parallel.active is set to True.
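To make the parameter-handling difference concrete, here is a small sketch of how a dataclass-style config with a nested backend_params dictionary could be flattened into the keyword arguments that joblib.parallel_config accepts. ParallelConfigSketch and to_joblib_kwargs are hypothetical names used only for illustration; the real nx-parallel code may do this differently.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ParallelConfigSketch:
    """Hypothetical stand-in for nx.config.backends.parallel."""
    active: bool = True
    backend: str = "loky"
    n_jobs: Optional[int] = None
    verbose: int = 0
    backend_params: Dict[str, Any] = field(default_factory=dict)


def to_joblib_kwargs(config):
    """Flatten the config into keyword arguments for a joblib-style call."""
    settings = asdict(config)
    settings.pop("active")  # NetworkX-side switch, not a joblib setting
    # Merge the nested dictionary into the top level.
    settings.update(settings.pop("backend_params"))
    return settings


config = ParallelConfigSketch(n_jobs=3, backend_params={"max_nbytes": None})
kwargs = to_joblib_kwargs(config)
print(kwargs)
# {'backend': 'loky', 'n_jobs': 3, 'verbose': 0, 'max_nbytes': None}
```

The resulting dictionary has the shape you would otherwise write by hand as joblib.parallel_config(backend="loky", n_jobs=3, verbose=0, max_nbytes=None).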
When the only NetworkX backend you're using is nx-parallel, either the NetworkX or the joblib configuration system can be used, depending on your preference.
However, when working with multiple NetworkX backends, it's crucial to ensure compatibility among the backends to avoid conflicts between different configurations. In such cases, using NetworkX's configuration system to configure nx-parallel is recommended. This approach helps maintain consistency across backends. For example:
nx.config.backend_priority = ["another_nx_backend", "parallel"]
nx.config.backends.another_nx_backend.config_1 = "xyz"
joblib.parallel_config(n_jobs=7, verbose=50)
nx.square_clustering(G)
In this example, if another_nx_backend also internally utilizes joblib.Parallel (without exposing it to the user) within its implementation of the square_clustering algorithm, then the nx-parallel configurations set by joblib.parallel_config will influence the internal joblib.Parallel used by another_nx_backend. To prevent unexpected behavior, it is advisable to configure these settings through the NetworkX configuration system.
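As a rough sketch of that recommendation, the same scenario could be configured through NetworkX so that the settings are scoped to nx-parallel only. It assumes nx-parallel is installed, reuses the hypothetical another_nx_backend and its config_1 option from the example above, and uses an arbitrary small graph G chosen purely for illustration.

```python
import networkx as nx

# Arbitrary small graph, used only for illustration.
G = nx.path_graph(10)

# "another_nx_backend" and "config_1" are the hypothetical names from the example above.
nx.config.backend_priority = ["another_nx_backend", "parallel"]
nx.config.backends.another_nx_backend.config_1 = "xyz"

# Configure nx-parallel through NetworkX instead of joblib.parallel_config.
nx.config.backends.parallel.active = True
with nx.config.backends.parallel(n_jobs=7, verbose=50):
    nx.square_clustering(G)
```

Because the settings live under nx.config.backends.parallel, they are applied only around nx-parallel's own functions and are not picked up by a joblib.Parallel call inside another backend.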
Future Synchronization: We are working on synchronizing both configuration systems so that changes in one system automatically reflect in the other. This started with PR #68, which introduced a unified context manager for nx-parallel. For more details on the challenges of creating a compatibility layer to keep both systems in sync, refer to Issue #76.
If you have feedback or suggestions, feel free to open an issue or submit a pull request.
Thank you :)