Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mandatory error handling #5

Closed
encukou opened this issue Oct 9, 2023 · 19 comments
Closed

Mandatory error handling #5

encukou opened this issue Oct 9, 2023 · 19 comments
Labels
guideline To be included in guidelines PEP

Comments

@encukou
Copy link
Contributor

encukou commented Oct 9, 2023

All new functions must be able to signal failure.
Users should write error handles after every API call.

This is needed for run-time deprecation warnings: DeprecationWarning with -Wall raises. So, any function might raise in the future. (Deprecating infallible function, via compiler warning, requires users to recompile.)

For functions whose current implementation never raises, we should find ways to instruct the compiler to skip users' error handling code.

@zooba
Copy link
Contributor

zooba commented Oct 9, 2023

For functions whose current implementation never raises, we should find ways to instruct the compiler to skip users' error handling code.

This is not forward compatible. The best we could do is mark that a failure result is unlikely, which might influence branch prediction or code layout, but we can't have the compiler omit error handling completely. Otherwise when we add an error condition in the future, existing compiled modules will have no choice but to crash.

@encukou
Copy link
Contributor Author

encukou commented Oct 9, 2023

Yup, that would only be for extensions that need to be recompiled for every Python version. I expect there will still be a fair number of those, as you can get more performance that way.

@zooba
Copy link
Contributor

zooba commented Oct 9, 2023

In that case, #7 😄

@encukou encukou changed the title Error handling Mandatory error handling Oct 9, 2023
@encukou encukou added the guideline To be included in guidelines PEP label Oct 11, 2023
@vstinner
Copy link
Contributor

This is needed for run-time deprecation warnings: DeprecationWarning with -Wall raises. So, any function might raise in the future. (Deprecating infallible function, via compiler warning, requires users to recompile.)

So far, many functions were deprecated with Py_DEPRECATED() in the C API and documentation, and then just removed. It's not always possible to emit warnings at runtime.

Also, only a minority of functions will be deprecated one day, the majority will stay. I would prefer to not have to pay a tax "just in case" tomorrow we might consider to deprecate a function.

I'm fine with treating deprecating as build errors when the function is removed. It may be even better since runtime errors may not be catched by tests, but may only happen on production. The Macro to hide deprecated functions proposes a solution for that which doesn't require to check every single call for any C API function when writing a C extension.

Well, in general, functions can already fail for multiple others reasons, so there is no need to "enforce" callers to check for errors, it's already part of the obvious API.

I created the opposite of this issue: Functions which cannot fail. IMO we need exceptions for some specific functions which cannot fail. Otherwise, the API will be way less pleasant to use.

@encukou
Copy link
Contributor Author

encukou commented Nov 20, 2023

So far, many functions were deprecated with Py_DEPRECATED() in the C API and documentation, and then just removed. It's not always possible to emit warnings at runtime.

Sure, but we haven't removed anything from the limited API yet.
As limited API is used more, we'll need a way to deprecate things from it -- and IMO, runtime warnings and runtime failures are that way.

Also, only a minority of functions will be deprecated one day, the majority will stay. I would prefer to not have to pay a tax "just in case" tomorrow we might consider to deprecate a function.

A minority of functions can not fail, just like in a minority of functions it's safe to return a borrowed reference.
This proposal, like “no borrowed references” or “-1/NULL always mean an exception was raised”, is meant to keep the API consistent. Error checking should be predictable to write and easy to review: you call a Python API, you check the result for -1 or NULL.
Like with the other guidelines, some API could be faster with error checking omitted, and the speedup could be significant for some use cases. In those cases, I propose to handle it like other such exceptions to the guidelines -- keep the consistent API, and add a faster variant (#4).

@pitrou
Copy link

pitrou commented Nov 22, 2023

Before mandating that all C API functions should be able to return an error, you should think about developer ergonomics.

What does it mean to have to check errors for every possible CPython C API function call? Code using the C API will be even more verbose than it is today. While it's true that more and more users will rely on higher-level wrappers such as Cython or nanobind, this shouldn't be a reason to make the C API harder to use.

As for the idea that deprecated C functions should emit a runtime warning, that's not how deprecation usually works in C/C++. It will also be funny to get a deprecation warning without the corresponding source code displayed in the traceback (because tracebacks don't include C functions).

Overall, API design should remain pragmatic, not dogmatic. There is no reason to check for errors after PyTuple_Check, for example. It just makes code more annoying to write for exactly zero benefit.

If you're not convinced, look at modern APIs in other languages with explicit error handling (the Rust stdlib, for example). You'll see that errors are returned only where semantically required, not as a provision for "errors that we might want to return in the future".

@encukou
Copy link
Contributor Author

encukou commented Dec 4, 2023

That makes sense, but it's hard to draw a consistent line.

Or rather, it's quite easy now! All exceptions should be approved by the C API WG. We can collect the exceptions to the rule, and reorganize when keeping the list of exceptions becomes unwieldy.
Initial blanket exceptions could be:

  • Type-checking functions, like PyTuple_Check.
  • Basic refcounting operations, like Py_DECREF and Py_INCREF.
  • Operations on native types that cannot have exceptional cases (e.g. overflow), like Py_HashDouble.

While it's true that more and more users will rely on higher-level wrappers such as Cython or nanobind, this shouldn't be a reason to make the C API harder to use.

“Harder to use” is relative: is it more typing, or more details to remember?

So here's another reason for this guideline: making the API more regular, so that a reviewer doesn't need to open the docs for every function a piece of code calls. Ideally there'd be a single answer for questions like “Should there be error handling?” and “What value signals an error?”.

So, IMO, -1 and NULL should still be reserved; the above exceptional API will just never return them. Users -- and especially automated API wrappers -- may choose to include error-checking for all modern API calls.

If you're not convinced, look at modern APIs in other languages with explicit error handling (the Rust stdlib, for example).

They make it very hard to omit error checking when it's needed. That makes for a very different story.

@pitrou
Copy link

pitrou commented Dec 4, 2023

Or rather, it's quite easy now! All exceptions should be approved by the C API WG. We can collect the exceptions to the rule, and reorganize when keeping the list of exceptions becomes unwieldy. Initial blanket exceptions could be: [...]

That sounds reasonable to me.

“Harder to use” is relative: is it more typing, or more details to remember? [...]

Good point :-)

So, IMO, -1 and NULL should still be reserved; the above exceptional API will just never return them. Users -- and especially automated API wrappers -- may choose to include error-checking for all modern API calls.

I agree with reserving -1 and NULL. There is one exception that has to be made: for functions like PyLong_AsLong where -1 is, after all, a legitimate error value :-)

If you're not convinced, look at modern APIs in other languages with explicit error handling (the Rust stdlib, for example).

They make it very hard to omit error checking when it's needed. That makes for a very different story.

Admittedly... I was going to say that you can use warn_unused_result on select compilers, but that doesn't ensure that the caller checks for a possible error if we're talking about e.g. a tri-state error return.

@encukou
Copy link
Contributor Author

encukou commented Dec 4, 2023

I agree with reserving -1 and NULL. There is one exception that has to be made: for functions like PyLong_AsLong where -1 is, after all, a legitimate error value :-)

We don't like that API, see problem number one (and especially Mark's comment).

If they were added today, they should take output arguments (long *result pointers to memory they fill up), and leave the return value to signal errors.

@vstinner
Copy link
Contributor

vstinner commented Dec 5, 2023

@encukou:

We don't like that API, see capi-workgroup/problems#1 (and especially capi-workgroup/problems#1 (comment)).

I suppose that Antoine means here that some existing API can return -1 which is a valid value, and doesn't report an error.

@encukou:

“Harder to use” is relative: is it more typing, or more details to remember?

For me, it means "having to write more lines of code". Example with a function which requires to check for errors (worst case, PyLong API):

-        i = PyLong_AS_LONG(v);
+        Py_ssize_t i = PyLong_AsSsize_t(v);
+        if (i == -1 && PyErr_Occurred()) {
+            Py_DECREF(tuple);
+            return NULL;
+        }

1 line vs 5 lines. Here, the code is correct, error must be checked. But it's just to show how an API which requires to check for error requires 5x more lines of code for a single function call.

@encukou
Copy link
Contributor Author

encukou commented Dec 5, 2023

Time for a controversial opinion!
When I'm free to ignore PEP 7, I'd write something along the lines of:

-        i = PyLong_AS_LONG(v);
+        Py_ssize_t i = PyLong_AsSsize_t(v);
+        if (i == -1 && PyErr_Occurred()) goto error;

I find the one-line if perfectly readable for error handling with a single goto or return.
If there are any other error handling in the function, and thus error: already exists, error handling is one line of code. (And often I'd write finally: rather than error:, to XDECREF locals even in the success case).

@vstinner
Copy link
Contributor

vstinner commented Dec 5, 2023

you can use warn_unused_result on select compilers

That sounds like a good step forward. Maybe we can start by adding the attribute to functions added to Python 3.13 which can fail.

@pitrou
Copy link

pitrou commented Dec 5, 2023

That sounds like a good step forward. Maybe we can start by adding the attribute to functions added to Python 3.13 which can fail.

Or perhaps add it to all functions returning an error and see how many projects break :-)
After all, this issue is titled "mandatory error handling"...

@pitrou
Copy link

pitrou commented Dec 5, 2023

By the way it seems like MSVC has sophisticated annotations to let the compiler understand a function call's semantics:
https://learn.microsoft.com/fr-fr/previous-versions/visualstudio/visual-studio-2015/code-quality/understanding-sal?view=vs-2015

@zooba

@encukou
Copy link
Contributor Author

encukou commented Dec 5, 2023

After all, this issue is titled "mandatory error handling"...

The guideline issues in this repo are draft guidleines for new API.

By the way it seems like MSVC has sophisticated annotations

Oh, that's nice!
For NI, I was thinking about something along those lines, along with e.g. distinguishing a zero-terminated UTF-8 string vs. a raw blobs of memory. No well-baked ideas yet, though.

@zooba
Copy link
Contributor

zooba commented Dec 5, 2023

it seems like MSVC has sophisticated annotations to let the compiler understand a function call's semantics

Yeah, and they're fantastic. But not portable and very invasive in definitions, so I've always just assumed that until gcc ended up with something similar (i.e. we can use the same macros with different definitions) it'd never go anywhere.

@pitrou
Copy link

pitrou commented Dec 5, 2023

Well, you'd certainly end up writing abstraction macros. gcc would never use the same syntax as MSVC, would it? :-)

@zooba
Copy link
Contributor

zooba commented Dec 5, 2023

Of course there'd be macros, but there'd also be significant pushback about macros on literally every parameter that don't do anything on gcc. I'm not going to fight that battle.

@encukou
Copy link
Contributor Author

encukou commented Nov 29, 2024

Added to the draft in #53, including the blanket exceptions proposed in #5 (comment), and reserving NULL and -1 to signal exceptions (i.e. infallible functions should not return these values).

@encukou encukou closed this as completed Nov 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
guideline To be included in guidelines PEP
Projects
None yet
Development

No branches or pull requests

4 participants