Shipping C++ code/libraries with Python package #1063
burgholzer asked this question in Q&A
Hey 👋🏼
I am seeking some advice on how to best package/distribute a collection of tools we develop as part of our research (specifically, the Munich Quantum Toolkit; https://mqt.readthedocs.io/en/latest/).
Some background and the main question
All of these tools are developed in C++17, exposed to Python using pybind11, and built into a Python package using scikit-build-core.
The tools are developed to work on all major operating systems and Python versions.
Structurally, there is a core library (called `mqt-core`), which many of the other libraries build on (e.g., `mqt-qcec` or `mqt-qmap`). Historically, only the top-level tools were made available as Python packages on PyPI, and `mqt-core` was consumed as a submodule (using `FetchContent`). Recently, however, we also created Python bindings for the core library and started publishing them as a Python package.
This creates a situation where the C++ code in the top-level project depends on the submodule code of `mqt-core`, while the Python part of the top-level project depends on the `mqt.core` Python package. The general question for this discussion is: how to best handle such a situation?
The desired solution and what we tried so far
Ideally, we would want to ship the core library's C++ code (more specifically, probably, the compiled libraries) with the Python package itself, so that the respective targets can be found in the top-level CMake project, given that the `cmake` directory is appropriately added to the search locations and the package itself provides a `<...>-config.cmake` (it does). I have seen header-only projects (such as pybind11) do that, which is simple because all that needs to be shipped is the header files.
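As a concrete illustration, a top-level CMake project could locate the CMake package files shipped inside the installed wheel roughly like this (a sketch under assumptions: the exact install layout of `mqt.core` is hypothetical here):

```cmake
# Sketch (paths/names are assumptions): ask the Python interpreter where the
# installed mqt.core package lives, then let find_package() discover the
# <...>-config.cmake that the wheel ships.
find_package(Python REQUIRED COMPONENTS Interpreter)
execute_process(
  COMMAND "${Python_EXECUTABLE}" -c "import mqt.core; print(mqt.core.__path__[0])"
  OUTPUT_VARIABLE MQT_CORE_PKG_DIR
  OUTPUT_STRIP_TRAILING_WHITESPACE
)
# Add the package directory to the search locations...
list(APPEND CMAKE_PREFIX_PATH "${MQT_CORE_PKG_DIR}")
# ...so that the shipped config file can be found.
find_package(mqt-core CONFIG REQUIRED)
```

pybind11 follows the same pattern by exposing `pybind11.get_cmake_dir()` (also available via the `pybind11 --cmakedir` console script), so the consuming build only ever needs to query the installed Python package for the search path.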
Nanobind (which is not header-only) simply ships the complete C++ sources as part of the package install and lets the consuming project build `nanobind` as part of its build. However, it is important that the code used to compile the core library Python package (`mqt.core`) and the top-level compiled extensions (e.g., `mqt.qcec`) is the same; otherwise, `pybind11` won't be able to recognize that the `mqt.core` package provides bindings for a C++ type that is being used in the interface of a method within the `mqt.qcec` package.
By now, the `mqt-core` project provides installation instructions so that it can install all the necessary libraries as part of the Python package build. These also get properly added to `site-packages`. By default, the project builds static libraries (the CMake default). While this setup seemed to work at first (building the `mqt.core` package locally and letting it be discovered by `FetchContent` in the top-level project), once we actually released a new version of `mqt.core` with wheels built by cibuildwheel, we noticed all kinds of problems when trying to use the resulting package in the top-level project: `pybind11` was not able to identify a type that has bindings in `mqt.core` and is being used in `mqt.qcec`.
The next logical step would be to try to build shared libraries instead of static ones and distribute those with the project, e.g., by setting
option(BUILD_SHARED_LIBS "Build using shared libraries" ON)
in the main CMake file. However, this creates another issue: when compiling a binary extension for a pybind module, the default symbol visibility is set to `hidden`. Since none of the functions currently export their symbols explicitly, the correspondingly built shared libraries cannot be linked when built alongside the pybind extension.
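For reference, the explicit-export route with `GenerateExportHeader` would look roughly like the following sketch (target and file names are made up for illustration, not the actual MQT setup):

```cmake
include(GenerateExportHeader)

add_library(mqt-core SHARED circuit.cpp)
# Use hidden visibility by default, matching the pybind11 extension...
set_target_properties(mqt-core PROPERTIES
  CXX_VISIBILITY_PRESET hidden
  VISIBILITY_INLINES_HIDDEN ON
)
# ...and generate mqt_core_export.h, which defines the MQT_CORE_EXPORT macro.
generate_export_header(mqt-core BASE_NAME mqt_core)
target_include_directories(mqt-core PUBLIC "${CMAKE_CURRENT_BINARY_DIR}")

# Every public symbol then needs an explicit annotation in the headers, e.g.:
#   #include "mqt_core_export.h"
#   class MQT_CORE_EXPORT Circuit { /* ... */ };
```

On Windows the macro expands to `__declspec(dllexport)` while building the library and `__declspec(dllimport)` while consuming it; on GCC/Clang it expands to `__attribute__((visibility("default")))`. That asymmetry is precisely why an annotation that satisfies one toolchain can trip up another.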
So I went on and read up on symbol visibility and how to set it explicitly (with the help of the `GenerateExportHeader` CMake module). However, this seemed really hard, if not impossible, to get working cross-platform: as soon as I fixed a visibility-related issue for one compiler or one OS, one of the other OSes or compilers would complain that this is not allowed, and I had to revert.
If necessary, more details are available in the following draft PRs:
Now, with a little more context, back to the general question: what is the best/recommended way to resolve this? Am I on the right track with building shared libraries and explicitly exporting symbols, and do I just need to find whatever combination of visibility macros works; or is there a better way to accomplish the overall goal?
A sketch of a minimal example
The following is a huge over-simplification, but it aims to provide at least a simpler basis to reason about.
Core Project
Has a C++ class (say, `Circuit`) that is compiled as part of a CMake library target.
Additionally, the class is bound in a pybind11 module that links against the above library and exposes the class to Python.
Top-level project
Depends on the CMake library target of the core project and provides a method that uses the `Circuit` class as part of its own CMake library target.
Additionally, the top-level project also exposes its methods to Python via a pybind11 extension that links against the top-level project library.
(Assume that multiple top-level projects might co-exist, all of which depend on the core library on the C++ as well as the Python side.)
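In CMake terms, the sketch above could be boiled down to something like this (all target and file names are invented for illustration):

```cmake
# --- Core project (sketch) ---
add_library(core-lib circuit.cpp)            # library target containing Circuit
target_include_directories(core-lib PUBLIC include)

find_package(pybind11 CONFIG REQUIRED)
pybind11_add_module(_core core_bindings.cpp) # exposes Circuit to Python
target_link_libraries(_core PRIVATE core-lib)

# --- Top-level project (sketch) ---
# Obtains core-lib somehow (FetchContent today; ideally find_package on the
# installed mqt.core package) and uses Circuit in its own interface:
add_library(toplevel-lib verify.cpp)
target_link_libraries(toplevel-lib PUBLIC core-lib)

pybind11_add_module(_qcec toplevel_bindings.cpp) # method signatures use Circuit
target_link_libraries(_qcec PRIVATE toplevel-lib)
```

The crux is that `core-lib` appears in both builds: the extension `_core` links it to bind `Circuit`, and `toplevel-lib` links it to use `Circuit` in its public interface, so both sides must agree on exactly one definition of the type.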
How would this best be packaged and distributed?
Please let me know if you need any further information!
Many thanks 🍀