Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor/exported symbols #185

Closed
wants to merge 19 commits into from
Closed

Refactor/exported symbols #185

wants to merge 19 commits into from

Conversation

mvukov
Copy link
Owner

@mvukov mvukov commented Oct 1, 2023

I started this effort inspired by #180 (where I learned about dynamic symbol exports) and #148.

Here are some thoughts:

  • symbols rosidl_typesupport_introspection_c__identifier (C) and rosidl_typesupport_introspection_cpp::typesupport_identifier (C++) must be kept/exposed globally such that a rmw serdes works as expected. Both for shared/static libraries. Note: in ros 2 setup w/ C++ nodes w/o plugins this is sufficient. The current solution is a rmw_cyclonedds patch.
  • similarly, a subset of symbols from class_loader::impl namespace must be kept/exposed globally such that plugin registration works as expected (Plugin Registration when using LLVM's libc++ in lieu of stdc++ #180). This is in particular interesting when dynamic_cast is implemented by comparing pointers, e.g. clang+libc++ AFAIK. The current solution for this is to compiler class_loader lib explicitly as a shared lib, cf. https://github.com/mvukov/rules_ros2/blob/main/repositories/class_loader.BUILD.bazel#L7.
  • For Python, it looks like more symbols needs to exported, haven't figure out which ones yet.

In Bazel context and different compiler/linker combos we need to handle properly gcc/clang linking, ros 2 rmw serdes intricacies.

This PR aims to nail all those problems. We basically have two options:

  • Use "-Wl,--export-dynamic" as linker flags for all C++ stuff. Searching the web, I couldn't find any downsides of this TBH.
  • We can use finer granularity alternative, "-Wl,--dynamic-list=$(location @com_github_mvukov_rules_ros2//ros2:exported_symbols.lds)", see the PR where we give a list of symbols for dynamic export. For C++ code this should be sufficient, but for Python stuff to work some symbols are still missing (anything that calls ROS 2 rmw from python segfaults (IIRC) at the moment with this approach).

The part that I don't like much is that in any case we need to every "main" Python script the os.RTLD_GLOBAL dlopen flag for this to work; got inspiration from rosbag2 tests. This may not be the nicest solution but it works. If we decide to go this route this will be properly documented.

@ahans @mikebauer @kgreenek @lalten I'd like to hear your opinions.

@mvukov mvukov marked this pull request as ready for review November 29, 2023 16:20
@brians-neptune
Copy link

Seems to me like you have to patch every Python binary to load plugins like this from Python with libc++, independent of rules_ros2, correct? If I'm correct that this is a fundamental restriction of that combination of things outside of this project, then I think adding that so CI can test libc++ and telling users that they will have to do the same is fine.

From past experience with libc++ incompatibilities, I think there just aren't very many people using libc++ in complex open source projects. I think there's much more usage in proprietary codebases (which have no problem patching Python to add the flag everywhere), and leaf dependencies that just want run msan/etc (which don't run into problems like this).

Also, for reference https://python-dev.python.narkive.com/P6UYSCLF/problems-with-python-s-default-dlopen-flags is some (only vaguely civil) discussion around the problems RTLD_GLOBAL will cause with some other libraries. I think the answer if anybody wants to use ROS2, libc++, and one of those annoying libraries is "don't". C++ is much less prone to that class of problems due to name mangling too.

@mikebauer
Copy link

As my interest is mainly on the C++ side, I don't have much commentary on the python patching stuff, but the change of adding "-Wl,--export-dynamic" seems reasonable to me as discussed in #180. Thanks for your efforts in doing further research on potential downsides. No objections from me here!

@mvukov
Copy link
Owner Author

mvukov commented Mar 10, 2024

clang+libc++ issue(s) just served me to investigate the mentioned problems above. Issues with handling global symbol are very relevant for other compiler and libc combinations. This PR basically aims to nail all those issues.

On the C++ side, users won't have to do a thing: I already added linker flag at relevant places.

But, yes, all Python binaries will need be patched to have something like sys.setdlopenflags(os.RTLD_GLOBAL | os.RTLD_NOW) before any ros related code is imported. Personally, I don't consider this to be a big issue, given the issues this PR solves on the C++ side. Having to patch third-party ROS 2 Python binaries might be an annoyance, that's a fair point.

@brians-neptune
Copy link

I've been diving deep into something that I only just realized is related to this. I've got another solution, which I think is better: make it so rosidl_typesupport_cpp::typesupport_identifier etc are only linked once. I'm close to having that implemented (and having permission from my employer to share it). All the existing tests pass with the strcmp patches removed with GCC.

Linking any symbol into multiple ELF libraries (or multiple times into a single ELF library) is kinda-sorta a violation of the One Definition Rule. For things like string constants it mostly works (except for having multiple pointer values), but I've had more complex global variables give weird segfaults from doing that before. Also linking things multiple times bloats the output binaries/shared libraries significantly.

Using cc_shared_library and rearranging the cc_common rules was harder than it looked, but I'm pretty sure I've got it working. I wrote a little test utility to look through all the binaries and shared objects in a runfiles tree and check for duplicate symbols, which passes with my changes.

What should I run to check if my changes fix everything you're aiming to fix here? I'm not sure if bazel test --config=clang //... is enough or if there are more changes to enable libc++.

@mvukov
Copy link
Owner Author

mvukov commented Mar 27, 2024

Hi, very nice! Thanks for the efforts. I think we covered regular and edge cases well enough with tests. So, if tests compiled with gcc and clang pass both in the root of the repo and for examples, we're good.

@mvukov
Copy link
Owner Author

mvukov commented Apr 5, 2024

@brians-neptune Following your comments and suggestions I dig deeper into this matter and took another shot at this: please take a look at #299. That one provides a working solution that is minimal, transparent for users and looks like it works both for gcc and clang.

@mvukov
Copy link
Owner Author

mvukov commented Apr 5, 2024

If #299 works as expected, I'll just close this PR (for good :)).

@mvukov
Copy link
Owner Author

mvukov commented Apr 9, 2024

Closing this one in favor of #299 .

@mvukov mvukov closed this Apr 9, 2024
@mvukov mvukov deleted the refactor/exported_symbols branch April 9, 2024 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants