Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversion to unsigned integers #49

Open
serhiy-storchaka opened this issue Jun 28, 2024 · 3 comments
Open

Conversion to unsigned integers #49

serhiy-storchaka opened this issue Jun 28, 2024 · 3 comments

Comments

@serhiy-storchaka
Copy link

There are three main groups of conversion to C integer types:

  1. Conversion to a signed integer type with overflow checking
  2. Conversion to an unsigned integer type with overflow checking
  3. Conversion to an unsigned integer type without overflow checking

In conversion to a signed integer type, an OverflowError is raised if the value outside of the range of the signed integer type. It is OK, as the limits are usually an implementation detail.

In conversion to an unsigned integer type with overflow checking, an OverflowError is usually raised if the value outside of the range of the signed integer type. But in some cases a ValueError is raised for negative value, and OverflowError is raised for large value exceeding the implementation defined limit. I propose to always raise ValueError when negative value is converted to an unsigned integer type. This condition is usually not an implementation detail: a size cannot be negative, independently of a platform. I opened an issue about this in
2017, but @rhettinger was against this, so this froze. But a new PyLong_AsNativeBytes functions raises ValueError for negative values, as well as many private converters used in Argument Clinic, so I decided to revive this issue. Today arguments in favor of it are stronger.

The last group is the most dangerous. Silent ignoring an overflow can leads to bugs. Unfortunately this is the default behavior of PyArg_Parse format units for unsigned integers, and some of PyMemberDef types. There are use cases for this:

  • when we do not know whether the destination integer type is signed or unsigned
  • when the destination integer type is in general unsigned, but -1 or some other small negative values are special values. For example, uid_t or dev_t -- on some platforms they can have large positive values, but -1 is a special value.

To mitigate possible harm of such API, I propose to limit it too. For example, PyLong_AsUnsignedMask() should raise OverflowError if the value is larger than ULONG_MAX or less than some negative value (LONG_MIN or even smaller, like INT_MIN, or -256, or -1).

Here are my two ideas.

@serhiy-storchaka
Copy link
Author

See also https://discuss.python.org/t/conversion-between-python-and-c-integers/44117/2, I may repeat myself here.

@encukou
Copy link
Contributor

encukou commented Jul 1, 2024

Which functions are you proposing to change?

I think it's too late to change the types of exceptions raised by existing functions. It's not worth breaking existing code.

But let's make new API is better. And perhaps also add documentation warnings to API that would work differently if added now.

@serhiy-storchaka
Copy link
Author

  1. Raise ValueError for negative values: PyLong_AsUnsignedLong(), PyLong_AsUnsignedLongLong(), PyLong_AsSize_t() and the "b" format unit in PyArg_Parse(). See gh-74020: Raise ValueError for negative values converted to unsigned python/cpython#121114
  2. Raise a warning for truncated values: PyLong_AsUnsignedLongMask(), PyLong_AsUnsignedLongLongMask(), and format units "B", "H", "I", "k", "K" in PyArg_Parse().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants