-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C API: Add internal C API to build a tuple: _PyTupleBuilder #107137
Comments
Add _PyTupleBuilder structure and functions: * _PyTupleBuilder_Init() * _PyTupleBuilder_Alloc() * _PyTupleBuilder_Append() * _PyTupleBuilder_AppendUnsafe() * _PyTupleBuilder_Finish() * _PyTupleBuilder_Dealloc() The builder tracks the size of the tuple and resize it in _PyTupleBuilder_Finish() if needed. Don't allocate empty tuple. _PyTupleBuilder_Append() overallocates the tuple by 25% to reduce the number of _PyTuple_Resize() calls.
Add _PyTupleBuilder structure and functions: * _PyTupleBuilder_Init() * _PyTupleBuilder_Alloc() * _PyTupleBuilder_Append() * _PyTupleBuilder_AppendUnsafe() * _PyTupleBuilder_Finish() * _PyTupleBuilder_Dealloc() The builder tracks the size of the tuple and resize it in _PyTupleBuilder_Finish() if needed. Don't allocate empty tuple. _PyTupleBuilder_Append() overallocates the tuple by 25% to reduce the number of _PyTuple_Resize() calls. Use _PyTupleBuilder API in itertools batched_traverse(), PySequence_Tuple() and initialize_structseq_dict(). Add also helper functions: * _PyTuple_ResizeNoTrack() * _PyTuple_NewNoTrack()
Add _PyTupleBuilder structure and functions: * _PyTupleBuilder_Init() * _PyTupleBuilder_Alloc() * _PyTupleBuilder_Append() * _PyTupleBuilder_AppendUnsafe() * _PyTupleBuilder_Finish() * _PyTupleBuilder_Dealloc() The builder tracks the size of the tuple and resize it in _PyTupleBuilder_Finish() if needed. Don't allocate empty tuple. Allocate an array of 16 objects on the stack to avoid allocating small tuple. _PyTupleBuilder_Append() overallocates the tuple by 25% to reduce the number of _PyTuple_Resize() calls. Do no track the temporary internal tuple by the GC before _PyTupleBuilder_Finish() creates the final complete and consistent tuple object. Use _PyTupleBuilder API in itertools batched_traverse(), PySequence_Tuple() and initialize_structseq_dict(). Add also helper functions: * _PyTuple_ResizeNoTrack() * _PyTuple_NewNoTrack()
This API is for creating tuples when the final size is not known in advance. For known size, use existing APIs like PyTuple_Pack() or Py_BuildValue(). |
This API is a fix for this old issue: #59313 "Incomplete tuple created by PyTuple_New() and accessed via the GC can trigged a crash" which was closed as "not a bug" in 2021 by @pablogsal. |
Add _PyTupleBuilder structure and functions: * _PyTupleBuilder_Init() * _PyTupleBuilder_Alloc() * _PyTupleBuilder_Append() * _PyTupleBuilder_AppendUnsafe() * _PyTupleBuilder_Finish() * _PyTupleBuilder_Dealloc() The builder tracks the size of the tuple and resize it in _PyTupleBuilder_Finish() if needed. Don't allocate empty tuple. Allocate an array of 16 objects on the stack to avoid allocating small tuple. _PyTupleBuilder_Append() overallocates the tuple by 25% to reduce the number of _PyTuple_Resize() calls. Do no track the temporary internal tuple by the GC before _PyTupleBuilder_Finish() creates the final complete and consistent tuple object. Use _PyTupleBuilder API in itertools batched_traverse(), PySequence_Tuple() and initialize_structseq_dict(). Add also helper functions: * _PyTuple_ResizeNoTrack() * _PyTuple_NewNoTrack()
Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to the internal C API: similar to PyTuple_New() and _PyTuple_Resize(), but don't track the tuple by the GC. Modify PySequence_Tuple() to use these functions. It prevents leaking the tuple which is being initialized in the GC, via functions like gc.get_referrers(). Previously, accessing to such partially initialized tuple could crash Python. Now PySequence_Tuple() only tracks the tuple once it's fully initialized.
Fix a crash in PySequence_Tuple() when the tuple which is being initialized was accessed via GC introspection, like the gc.get_referrers() function. Now the function only tracks the tuple by the GC once it's fully initialized. Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to the internal C API: similar to PyTuple_New() and _PyTuple_Resize(), but don't track the tuple by the GC. Modify PySequence_Tuple() to use these functions.
Fix a crash in PySequence_Tuple() when the tuple which is being initialized was accessed via GC introspection, like the gc.get_referrers() function. Now the function only tracks the tuple by the GC once it's fully initialized. Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to the internal C API: similar to PyTuple_New() and _PyTuple_Resize(), but don't track the tuple by the GC. Modify PySequence_Tuple() to use these functions.
Fix a crash in PySequence_Tuple() when the tuple which is being initialized was accessed via GC introspection, like the gc.get_referrers() function. Now the function only tracks the tuple by the GC once it's fully initialized. Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to the internal C API: similar to PyTuple_New() and _PyTuple_Resize(), but don't track the tuple by the GC. Modify PySequence_Tuple() to use these functions.
What's the benefit of this over building a list then calling |
You avoid creating a temporary list and then destroy it. It may be faster. |
How is creating a temporary tuple builder better than creating a temporary list? If building a list is slow, we should fix that. It would be far more beneficial than adding a new API. |
Code using PyList_AsTuple() requires to creates a temporary list object and then create the final tuple object: two objects. My proposed API only creates a single Python object: less memory allocations, no conversion of list items to tuple items (copy + INCREF).
How do you plan to optimize list creation? |
FTR, we can fix the sqlite3 issue by using the internal |
Hm, |
Here's a quick and non-scientific benchmark: $ ./python.exe -m timeit "buildtuple(10)"
2000000 loops, best of 5: 140 nsec per loop
$ ./python.exe -m timeit "buildtuple(100)"
500000 loops, best of 5: 864 nsec per loop
$ ./python.exe -m timeit "buildtuple(1000)"
5000 loops, best of 5: 57.8 usec per loop
$ ./python.exe -m timeit "buildtuple(10000)"
500 loops, best of 5: 715 usec per loop
$ ./python.exe -m timeit "buildlist(10)"
1000000 loops, best of 5: 246 nsec per loop
$ ./python.exe -m timeit "buildlist(100)"
200000 loops, best of 5: 1.16 usec per loop
$ ./python.exe -m timeit "buildlist(1000)"
5000 loops, best of 5: 59.6 usec per loop
$ ./python.exe -m timeit "buildlist(10000)"
500 loops, best of 5: 764 usec per loop Codediff --git a/Python/bltinmodule.c b/Python/bltinmodule.c
index 787f53fae3..638556c8e5 100644
--- a/Python/bltinmodule.c
+++ b/Python/bltinmodule.c
@@ -2745,6 +2745,66 @@ builtin_issubclass_impl(PyObject *module, PyObject *cls,
return PyBool_FromLong(retval);
}
+
+/*[clinic input]
+buildtuple as builtin_buildtuple
+ n: int
+ /
+[clinic start generated code]*/
+
+static PyObject *
+builtin_buildtuple_impl(PyObject *module, int n)
+/*[clinic end generated code: output=eaed9c56019027a6 input=1c9950e2c501d16a]*/
+{
+ PyObject *tuple = PyTuple_New(n);
+ if (tuple == NULL) {
+ goto error;
+ }
+ for (int i = 0; i < n; i++) {
+ PyObject *item = PyLong_FromLong(i);
+ if (item == NULL) {
+ goto error;
+ }
+ PyTuple_SET_ITEM(tuple, i, item);
+ }
+ return tuple;
+
+error:
+ Py_XDECREF(tuple);
+ return NULL;
+}
+
+/*[clinic input]
+buildlist as builtin_buildlist = buildtuple
+[clinic start generated code]*/
+
+static PyObject *
+builtin_buildlist_impl(PyObject *module, int n)
+/*[clinic end generated code: output=227ae994f7278b23 input=f8333bb7e2c7f2ce]*/
+{
+ PyObject *list = PyList_New(n);
+ if (list == NULL) {
+ goto error;
+ }
+ for (int i = 0; i < n; i++) {
+ PyObject *item = PyLong_FromLong(i);
+ if (item == NULL) {
+ goto error;
+ }
+ PyList_SET_ITEM(list, i, item);
+ }
+ PyObject *tuple = PyList_AsTuple(list);
+ if (tuple == NULL) {
+ goto error;
+ }
+ Py_DECREF(list);
+ return tuple;
+
+error:
+ Py_XDECREF(list);
+ return NULL;
+}
+
typedef struct {
PyObject_HEAD
Py_ssize_t tuplesize;
@@ -3060,6 +3120,8 @@ static PyMethodDef builtin_methods[] = {
BUILTIN_SETATTR_METHODDEF
BUILTIN_SORTED_METHODDEF
BUILTIN_SUM_METHODDEF
+ BUILTIN_BUILDTUPLE_METHODDEF
+ BUILTIN_BUILDLIST_METHODDEF
{"vars", builtin_vars, METH_VARARGS, vars_doc},
{NULL, NULL},
};
diff --git a/Python/clinic/bltinmodule.c.h b/Python/clinic/bltinmodule.c.h
index 7540de6828..9da527ed3a 100644
--- a/Python/clinic/bltinmodule.c.h
+++ b/Python/clinic/bltinmodule.c.h
@@ -1212,4 +1212,58 @@ builtin_issubclass(PyObject *module, PyObject *const *args, Py_ssize_t nargs)
exit:
return return_value;
}
-/*[clinic end generated code: output=daeee81b018824f4 input=a9049054013a1b77]*/
+
+PyDoc_STRVAR(builtin_buildtuple__doc__,
+"buildtuple($module, n, /)\n"
+"--\n"
+"\n");
+
+#define BUILTIN_BUILDTUPLE_METHODDEF \
+ {"buildtuple", (PyCFunction)builtin_buildtuple, METH_O, builtin_buildtuple__doc__},
+
+static PyObject *
+builtin_buildtuple_impl(PyObject *module, int n);
+
+static PyObject *
+builtin_buildtuple(PyObject *module, PyObject *arg)
+{
+ PyObject *return_value = NULL;
+ int n;
+
+ n = _PyLong_AsInt(arg);
+ if (n == -1 && PyErr_Occurred()) {
+ goto exit;
+ }
+ return_value = builtin_buildtuple_impl(module, n);
+
+exit:
+ return return_value;
+}
+
+PyDoc_STRVAR(builtin_buildlist__doc__,
+"buildlist($module, n, /)\n"
+"--\n"
+"\n");
+
+#define BUILTIN_BUILDLIST_METHODDEF \
+ {"buildlist", (PyCFunction)builtin_buildlist, METH_O, builtin_buildlist__doc__},
+
+static PyObject *
+builtin_buildlist_impl(PyObject *module, int n);
+
+static PyObject *
+builtin_buildlist(PyObject *module, PyObject *arg)
+{
+ PyObject *return_value = NULL;
+ int n;
+
+ n = _PyLong_AsInt(arg);
+ if (n == -1 && PyErr_Occurred()) {
+ goto exit;
+ }
+ return_value = builtin_buildlist_impl(module, n);
+
+exit:
+ return return_value;
+}
+/*[clinic end generated code: output=a3a6068cbca21e95 input=a9049054013a1b77]*/ |
I close the issue, see: #107139 (comment) |
The current Python C API PyTuple_New() to build a tuple has multiple flaws:
I propose adding a new internal C API to build a tuple: _PyTupleBuilder.
Later, once we will consider this API stable, we can add it to the public C API and deprecate the old API to create an incomplete tuple:
See capi-workgroup/problems#56 for details on problems caused by API which let creating incomplete/inconsistent objects.
Linked PRs
The text was updated successfully, but these errors were encountered: