Conversion to unsigned integers #49

serhiy-storchaka · 2024-06-28T11:11:34Z

There are three main groups of conversion to C integer types:

Conversion to a signed integer type with overflow checking
Conversion to an unsigned integer type with overflow checking
Conversion to an unsigned integer type without overflow checking

In conversion to a signed integer type, an OverflowError is raised if the value outside of the range of the signed integer type. It is OK, as the limits are usually an implementation detail.

In conversion to an unsigned integer type with overflow checking, an OverflowError is usually raised if the value outside of the range of the signed integer type. But in some cases a ValueError is raised for negative value, and OverflowError is raised for large value exceeding the implementation defined limit. I propose to always raise ValueError when negative value is converted to an unsigned integer type. This condition is usually not an implementation detail: a size cannot be negative, independently of a platform. I opened an issue about this in
2017, but @rhettinger was against this, so this froze. But a new PyLong_AsNativeBytes functions raises ValueError for negative values, as well as many private converters used in Argument Clinic, so I decided to revive this issue. Today arguments in favor of it are stronger.

The last group is the most dangerous. Silent ignoring an overflow can leads to bugs. Unfortunately this is the default behavior of PyArg_Parse format units for unsigned integers, and some of PyMemberDef types. There are use cases for this:

when we do not know whether the destination integer type is signed or unsigned
when the destination integer type is in general unsigned, but -1 or some other small negative values are special values. For example, uid_t or dev_t -- on some platforms they can have large positive values, but -1 is a special value.

To mitigate possible harm of such API, I propose to limit it too. For example, PyLong_AsUnsignedMask() should raise OverflowError if the value is larger than ULONG_MAX or less than some negative value (LONG_MIN or even smaller, like INT_MIN, or -256, or -1).

Here are my two ideas.

The text was updated successfully, but these errors were encountered:

serhiy-storchaka · 2024-06-28T11:12:04Z

See also https://discuss.python.org/t/conversion-between-python-and-c-integers/44117/2, I may repeat myself here.

encukou · 2024-07-01T13:37:00Z

Which functions are you proposing to change?

I think it's too late to change the types of exceptions raised by existing functions. It's not worth breaking existing code.

But let's make new API is better. And perhaps also add documentation warnings to API that would work differently if added now.

serhiy-storchaka · 2024-07-01T13:54:46Z

Raise ValueError for negative values: PyLong_AsUnsignedLong(), PyLong_AsUnsignedLongLong(), PyLong_AsSize_t() and the "b" format unit in PyArg_Parse(). See gh-74020: Raise ValueError for negative values converted to unsigned python/cpython#121114
Raise a warning for truncated values: PyLong_AsUnsignedLongMask(), PyLong_AsUnsignedLongLongMask(), and format units "B", "H", "I", "k", "K" in PyArg_Parse().

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Conversion to unsigned integers #49

Conversion to unsigned integers #49

serhiy-storchaka commented Jun 28, 2024

serhiy-storchaka commented Jun 28, 2024

encukou commented Jul 1, 2024

serhiy-storchaka commented Jul 1, 2024

Conversion to unsigned integers #49

Conversion to unsigned integers #49

Comments

serhiy-storchaka commented Jun 28, 2024

serhiy-storchaka commented Jun 28, 2024

encukou commented Jul 1, 2024

serhiy-storchaka commented Jul 1, 2024