-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rule Suggestion min
/max
instead of sorted(l)[0]
#10463
Comments
This seems nice for my first rule contribution. It occurs way more than I was expecting from this GitHub search, though some of these are not actual cases of this violation.
Not from a quick search in their docs. @charliermarsh do you think this is something I could tackle? Any guidance would be nice but not necessary until I hit roadblocks. Performance impact of fixing this violationTest script import timeit
import random
def method_sorted(a):
return sorted(a)[0]
def method_min(a):
return min(a)
sizes = [10**i for i in range(1, 5)] # Sizes: 10, 100, 1000, ...
number_of_executions = 1000
for size in sizes:
a = [random.randint(1, 1000) for _ in range(size)]
time_sorted = timeit.timeit(lambda: method_sorted(a), number=number_of_executions)
time_min = timeit.timeit(lambda: method_min(a), number=number_of_executions)
speedup = time_sorted / time_min
print(f"Size: {size}, sorted(a)[0] time: {time_sorted:.5f}s, min(a) time: {time_min:.5f}s, Speedup: {speedup:.2f}x") On my M1 pro: Size: 10, sorted(a)[0] time: 0.00024s, min(a) time: 0.00021s, Speedup: 1.17x
Size: 100, sorted(a)[0] time: 0.00279s, min(a) time: 0.00118s, Speedup: 2.36x
Size: 1000, sorted(a)[0] time: 0.05002s, min(a) time: 0.01093s, Speedup: 4.58x
Size: 10000, sorted(a)[0] time: 0.88557s, min(a) time: 0.10932s, Speedup: 8.10x Results are less impressive, but still good, with Size: 10, sorted(a)[0] time: 0.00067s, min(a) time: 0.00065s, Speedup: 1.03x
Size: 100, sorted(a)[0] time: 0.00629s, min(a) time: 0.00433s, Speedup: 1.45x
Size: 1000, sorted(a)[0] time: 0.08372s, min(a) time: 0.04047s, Speedup: 2.07x
Size: 10000, sorted(a)[0] time: 1.23920s, min(a) time: 0.40743s, Speedup: 3.04x |
sorted(l)[0]
min
/max
instead of sorted(l)[0]
Just implemented this as FURB192 in Refurb: dosisod/refurb#333 |
Python from operator import itemgetter
data = [('red', 1), ('blue', 1), ('red', 2), ('blue', 2)]
min(data, key=itemgetter(0)) # ('blue', 1)
sorted(data, key=itemgetter(0))[0] # ('blue', 1)
sorted(data, key=itemgetter(0), reverse=True)[-1] # ('blue, 2') In this case, the order of |
I would be okay documenting that as a caveat of the rule, seems like a weird edge-case. |
We should probably flag those unstable sorts as unsafe fixes if we go through with them. |
How can we identify them as unstable? |
We should test it out using the example provided above but I think only the '[-1]' index pattern is unstable. |
Actually to clarify, all these sorts are stable, but reverse=True reverse the order unstable elements apparently. (And reverse=True, '[-1]'). So only the first proposed fix using min (and it's reversed version) will always be guaranteed to be the same. I suppose. Maybe that warrants its own rule in that case or a rule toggle not flag it when it could change the returned value on ties. |
I'd like to start working on this if nobody has started yet :) |
) ## Summary Fixes #10463 Add `FURB192` which detects violations like this: ```python # Bad a = sorted(l)[0] # Good a = min(l) ``` There is a caveat that @Skylion007 has pointed out, which is that violations with `reverse=True` technically aren't compatible with this change, in the edge case where the unstable behavior is intended. For example: ```python from operator import itemgetter data = [('red', 1), ('blue', 1), ('red', 2), ('blue', 2)] min(data, key=itemgetter(0)) # ('blue', 1) sorted(data, key=itemgetter(0))[0] # ('blue', 1) sorted(data, key=itemgetter(0), reverse=True)[-1] # ('blue, 2') ``` This seems like a rare edge case, but I can make the `reverse=True` fixes unsafe if that's best. ## Test Plan This is unit tested. ## References https://github.com/dosisod/refurb/pull/333/files --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
Saw this in code:
should be
Additionally, the following code pattern can also be simplified
should be
Is there already a pylint rule for this? I wouldn't be surprised. If not, let's add it as a RUFF rule.
I am not sure about the
[-1]
variants, as there may be some subtlety there with sort stability?The text was updated successfully, but these errors were encountered: