-
-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite IPAddress without union #225
base: master
Are you sure you want to change the base?
Conversation
Using a union allowed undefined behavior for 8-bit integer INPUT and OUTPUT arguments through a 32-bit integer (and vice versa). Write throught union::uint8_t[16] and read thoutght union::uint32_t[4] is a undefined behavior. C++ standard says (https://timsong-cpp.github.io/cppwp/n4659/class.union#1): "At most one of the non-static data members of an object of union type can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [ Note: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence, and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see [class.mem]. — end note ]" and (https://timsong-cpp.github.io/cppwp/n4659/class.mem#22): "Two standard-layout unions are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in any order) have layout-compatible types." you can't hope that compilers don't seem to do undefined behavior at the moment Also created espressif/arduino-esp32#8829 PR dependent on changing the current API.
Begin index didn't be selected.
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #225 +/- ##
==========================================
- Coverage 95.53% 95.42% -0.11%
==========================================
Files 16 16
Lines 1075 1050 -25
==========================================
- Hits 1027 1002 -25
Misses 48 48 ☔ View full report in Codecov by Sentry. |
I have problems with all those heavy changes in this PR, most of which seem to be motivated by using C++ standard library containers and patterns. Don't you think the problem can be addressed by enforcing a packed union, i.e. something like this union {
uint8_t bytes[16];
uint32_t dword[4];
} _address __attribute__((packed, aligned(4))); Although I have not invented this particular instance I may add that I have used (and use) this pattern repeatedly over the years, most recently here and here. I have yet to experience undefined behaviour. TL;DR: No to heavy changes including introduction of C++ STL types, lighter tough solutions are preferred. |
This trouble with using of union is a undefined behavior when value was written as
|
this functionality can be implemented through a C-style array |
Thanks for changing back to C-style array -- that is a separate change (from just removing the union). I think removing the union makes a lot of sense, as it added very little benefit. In practice it probably has the same result (even if accessing through alternate union members has unspecified behaviour, I think in most cases they would end up in the overlapping memory and just work). But then in practice, I don't know how often the 32-bit word variations are actually used. An IPv6 would rarely be handled as 32-bit words... either bytes or 16-bit chunks (how it is displayed) makes sense. For an IPv4 address there actually is a defined representation as a 32-bit integer, e.g. http://2398766798 is the same as http://142.250.70.206, and that may have been what the original intention was. However on little endian systems the octets 0x83, 0xFA, 0x46, 0xCE, in that order, will form the integer (uint32_t) 3460758158, so while each 4-byte part of the array can be accessed as a 32-bit value, the result is platform byte-order dependent, and may not necessarily match with the usage of integers representations of IPv4 address. It would be interesting to know what the usage scenario is for accessing as 32-bit words, and whether that whole section could be deprecated and eventually removed. In the meanwhile, removing the union seems good. |
It is undefined behavior, not unspecified. Any compiler can at any time assume that there is no undefined behavior and compile the program in such a way that a certain number of instructions (code) have been optimized to eliminate undefined behavior. That is, literally anything can happen - including all logic being cut out, since only in this case will it be possible to achieve the absence of undefined behavior - which the compiler will happily do. |
Rewrite IPAddress without union
Using a union allowed undefined behavior for 8-bit integer INPUT and OUTPUT arguments through a 32-bit integer (and vice versa). Write through
union::uint8_t[16]
and read throughunion::uint32_t[4]
is an undefined behavior.C++ standard says (https://timsong-cpp.github.io/cppwp/n4659/class.union#1):
and (https://timsong-cpp.github.io/cppwp/n4659/class.mem#22):
you can't hope that compilers don't seem to do undefined behavior at the moment
Also created espressif/arduino-esp32#8829 PR dependent on changing the current API.