-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-125997: ensure that time.sleep(0)
is not delayed on non-Windows platforms
#128274
Conversation
ff959e1
to
80de853
Compare
@picnixz Unfortunately, we should seriously review :(. It seems that an assertion failed somewhere.
|
Ah I think I know what happens. When we raise an OSError from errno we don't clear the errno. So if you chain calls, you need to just ignore them. I'm removing the assertion cc @ZeroIntensity |
Due to how `OSError` are raised from `errno`, we do not clear `errno` afterwards. If we catch `OSError`, then we still have an errno set, and if we call `time.sleep()` just after, we may have `errno != 0` (but we know we handled it so it's fine).
If pytime is running on multiple cores, I recommend using raw defined
|
This wouldn't solve the issue (also because |
TIL! Should we assert |
Not sure. It's not really an issue and the function is internal and only for |
I think we should. There's no real cost to adding it, and it would be helpful for debugging in the rare case that something goes wrong. |
i have not looked into this too deeply but my understanding is the poll() will also relinquish the CPU since it is a syscall but not necessarily give up the time slice (e.g. if there are no threads with higher priority)
this looks like a good idea, might be the cleanest - just choose TIMER_ABSTIME in the case that the argument to time.sleep() is <=0. as a user, i would just generally expect that sleep() issues a syscall (i.e. relinquish enough control to the kernel and cpython runtime that they can unblock any I/O or other operations that need to be scheduled) but don't really care exactly which one. |
I'll commit something tomorrow or the day after using that approach. It would also be nice to avoid the (small) overhead of using |
While we're still above the For comparison, the current implementation performances 25x worse: I'll update the comments, docs and tests and push those changes. FTR, the numbers above are in a DEBUG build, but this doesn't change much as on an optimized build, current implementation takes roughly the same amount of time. |
Doc/whatsnew/3.14.rst
Outdated
time | ||
---- | ||
|
||
* Ensure that :func:`time.sleep(0) <time.sleep>` does not degrade over time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose that you're referring to performance:
* Ensure that :func:`time.sleep(0) <time.sleep>` does not degrade over time | |
* Ensure that :func:`time.sleep(0) <time.sleep>` performance does not degrade over time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on the resolution of this PR (namely, whether we eventually use select()
), I'll amend the NEWS so I won't commit that suggestion for now.
Modules/timemodule.c
Outdated
err = ret; | ||
#elif defined(HAVE_NANOSLEEP) | ||
struct timespec zero = {0, 0}; | ||
ret = nanosleep(&zero, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth it to have 3 code paths for sleep(0)? Can't we always use select()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a long discussion on this matter but long story short:
select()
is implementation sensitive but there were some semantical and implementation concerns (see gh-125997: ensure thattime.sleep(0)
is not delayed on non-Windows platforms #128274 (comment)).- using
select()
would be much faster and if you still want to go that road (which I did originally), then I personally don't mind.
The implementations of `time.sleep(0)` on Windows and POSIX are now handled by the same function. Previously, `time.sleep(0)` on Windows was handled by `pysleep()` while on POSIX platforms, it was handled by `pysleep_zero_posix()`.
Modules/timemodule.c
Outdated
Py_BEGIN_ALLOW_THREADS | ||
#ifdef HAVE_CLOCK_NANOSLEEP | ||
struct timespec zero = {0, 0}; | ||
ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &zero, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that TIMER_ABSTIME is appropriate here:
ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &zero, NULL); | |
ret = clock_nanosleep(CLOCK_MONOTONIC, 0, &zero, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I first thought about:
If flags is TIMER_ABSTIME, then t is interpreted as an absolute
time as measured by the clock, clockid. If t is less than or
equal to the current value of the clock, then clock_nanosleep()
returns immediately without suspending the calling thread.
Without this flag, time.sleep(0)
takes 50us. Otherwise it takes 2us. So, now I'm more and more inclined to actually revert it back to a select
. Because otherwise, it's as if I'm not doing anything (and just skip the call to clock_nanosleep
; I mean I already know that the condition on t
is verified so it's as if I have no call)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaving for a few days but I'm still struggling to convince myself between the use of select
or not. The reason why clock_nanosleep()
is slowed down is because we don't pass a zero time struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I was wrong. See #128274 (comment).
Ok, so I'm actually back at the starting point. First of all, let's take a step back and understand the issue at hand: Second, I was wrong with Now, let's move back to what template<typename _Rep, typename _Period>
inline void
sleep_for(const chrono::duration<_Rep, _Period>& __rtime)
{
if (__rtime <= __rtime.zero())
return;
...
} So, as we can see, If yes, what about Windows' dedicate Or, if this is really too much of a hassle, we can just... drop that PR (I won't mind) and simply improve the docs saying that |
Using `clock_nanosleep()` would always take more than 50 us.
For now, I've decided to revert to the old 3.11 behaviour. If you prefer me to change the docs, I can also do it. I'm leaving for a few days so I won't be able to see the comments. |
If I understood correctly the issue, the purpose of this change is to make The problem is that always using |
Considering that we would be penalizing FreeBSD (and that using I learned a lot during this but I'm still not comfortable with reverting it back to the previous behaviour. ISTM that the previous behaviour is probably "wrong" (namely, I'm not really sure we should even use |
This is a suggestion for fixing the issue when the timeout is 0. For other timeouts, this does not change the current behaviour (I'm tempted removing the
do-while
loop as we should never have an issue, but I'm not familar with the monotonic C implementation so I left it as is).time.sleep(0)
is slower on Python 3.11 than on Python 3.10 #125997