-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Question: Why was LTO disabled in version 1.22.0? #124
Comments
What pushed me back in January (103b45d) was ld erroneously introducing
_pei386_runtime_relocator references in certain (admittedly unusual)
circumstances, and I had to use "-fno-lto" to make it behave itself. I
don't personally feel LTO is all that valuable, and now it was actively
getting in my way, so I finally disabled it. I was pleasantly surprised to
see this also knocked ~9MiB off the distribution, which created even more
incentive to keep the change.
I wanted to provide an example, but I couldn't reproduce my issue from
January in w64devkit 1.21.0. I think now that I was experiencing a bug
specifically in Binutils 2.41 (9926896). If you re-enable LTO:
$ sed -i /--disable-lto/d Dockerfile
Build a toolchain, then try to extract, say, ___chkstk_ms from libgcc.a
using partial linking:
$ cc -r -u ___chkstk_ms -o chkstk.o -lgcc
$ nm chkstk.o
You'll see "U _pei386_runtime_relocator" in the nm listing. That shouldn't
be there. Add "-fno-lto" and it goes away, or disable LTO in the toolchain
like I did.
The Windows ports of the GNU ecosystem tend to be second class, and the
less-trodden features, such as LTO, receive less testing. It's certainly
had its sharp edges in the Mingw-w64 ecosystem:
https://sourceforge.net/p/mingw-w64/mailman/mingw-w64-public/thread/20230401111018.qn3vrsbgn7mfv3ge@pali/
(EDIT: Had to fix this link. GitHub's email support is so aggressively mediocre.)
Along these lines, if you build the latest u-config (including the one
included in w64devkit 1.22) using the w64devkit 1.21 toolchain with LTO
enabled, you'll get a broken binary without warning:
$ cc -nostartfiles -flto win32_main.c
$ ./a
SEGV
You'd need to also add a not-obvious "-u mainCRTStartup" to the command.
This was fixed in Binutils 2.41, so the above command will work correctly
in an LTO-restored build of w64devkit, but my overall confidence hasn't
raised much.
Perhaps LTO is valuable in huge C++ projects to deduplicate piles of
template instantiations, or projects that break hot paths across multiple
translation units (preventing inlining). The typical case appears to be
zero or little (a few percentage points) performance improvement from LTO.
|
@skeeto LTO's biggest benefit seems to be not so much speed, but code size. It makes a bigger difference than you'd expect in that regard, there are some benchmarks here: Exposing you to more compiler bugs is unfortunately true, and even more so for *-w64-mingw32, though. I wanted to build cppcheck with LTO as an example and encountered this: It's supposedly fixed, but I'd still like to revisit when gcc 14.1 is released. I still think enabling it in the toolchain is overall a good idea. And potentially it's a good avenue to reduce some of w64devkit's size. |
Thanks, @Peter0x44, that was an interesting talk. My main takeaways from the talk:
(Unfortunately, as I'm sure you already realize, PGO isn't going to be practical when building w64devkit itself due to cross-compilation.) If I re-enable LTO per the If you'd like to reproduce this yourself, I used w64devkit's --- a/cppcheck.mak
+++ b/cppcheck.mak
@@ -3,5 +3,5 @@
obj := $(src:.cpp=.o)
-CXXFLAGS := -w -Os -Ilib $(addprefix -I,$(ext))
+CXXFLAGS := -w -Os -Ilib -flto -fno-declone-ctor-dtor $(addprefix -I,$(ext))
cppcheck.exe: $(obj)
- $(CXX) -s -o $@ $(obj) -lshlwapi
+ $(CXX) -s -o $@ -Os -flto=auto $(obj) -lshlwapi
cppcheck: $(obj) Disabling LTO is a bit of experiment. It sat on the master branch for over two months without objections, so I felt comfortable trying it in a release. I could be persuaded it's worth reverting back to the default, especially as LTO-related bugs are fixed. but I'm not there yet. |
Perhaps bigger gains can be achieved when building gcc or potentially busybox-w32 with it also. It's on my to-do list to investigate the potential benefits there. |
For some time now (six years or so?) my release builds of busybox-w32 have used LTO. It makes the binaries smaller and doesn't seem to have resulted in any issues. The more recent clang/aarch64 build doesn't use LTO as it made the binary slightly larger. By 0.2%. Oh no! |
I've decided for now to continue with LTO still disabled in the 1.23.0
release today.
|
@rmyorston aarch64-w64-mingw32 support was recently merged for gcc 15, perhaps it's worth investigating if it would reduce the executable size. Maybe it's worth considering for w64devkit also, but there are few WoA devices that can be purchased, and gcc won't support it officially until next year, I wouldn't put it on a high priority. Just something to be aware of. |
Please return LTO as soon, as possible. I develop project for DOS and code size is very important for me. I've googled this problem and it seems like there is some work in progress to fix it. For now it's recommended to mark _pei386_runtime_relocator as used. |
Another thought for LTO, but I don't know the practical implications of it yet. The system compiler of arch has: |
That's just for the intermediate object files isn't it? I don't think it should have any effect on the final binary (where lto information gets stripped). |
Could this be reenabled? This breaks existing build configurations from performance optimized libraries. |
This broke my build of MiniVM. It was a simple enough fix, but slowed down the VM's GC significantly. |
I wanted to experiment with this again, so I went to build an lto-enabled w64devkit for my own usage, but it had some minor problems.
Resulted in
libsquare.a was still created, but linking it did not work
What fixed it was copying I suspect this has to do with the strange way gcc is configured by w64devkit Lines 256 to 257 in 8ea0fee
with the sysroot having $ARCH at the end of it I did not have time to test any theories relating to this, but perhaps @skeeto might have some ideas. I haven't found this to be necessary in my own experiments building mingw-w64 toolchains outside of w64devkit. |
Hi @skeeto , thanks for making this. Can you explain (or point me to resources) why LTO got disabled in version 1.22.0?
Thank you.
The text was updated successfully, but these errors were encountered: